Google's Slow Fade with Librarians
February 2, 2015 1:14 PM

 
Beautiful.

Sad, but beautiful.
posted by meinvt at 1:24 PM on February 2, 2015 [2 favorites]


I don't know all that much about how public domain works, but I know enough to be troubled by their terms here.

Is it weird that they tell you not to remove the Google watermark from public domain books? There's a watermark on every single page of this one.
posted by teponaztli at 1:35 PM on February 2, 2015 [2 favorites]


Not really? Google owns the photos even if the item itself is in public domain.
posted by tavella at 1:39 PM on February 2, 2015 [3 favorites]


The piece it was written in response to, Never trust a corporation to do a library's job, is by MeFi's own waxpancake.
posted by metaquarry at 1:40 PM on February 2, 2015 [16 favorites]


There is an interesting project called Public Library at memoryoftheworld.org.
posted by jeffburdges at 1:42 PM on February 2, 2015


So the item itself is public domain, but Google can ask that you not do certain things with it because they own these images of it?

My experience with scans like this has basically been limited to sheet music, and imslp.org doesn't have the same terms, nor do they claim ownership over the images - I guess because the scans are created by users, not the site itself?

At this point it's not even about hating Google, I'm just trying to understand how it all works here.
posted by teponaztli at 1:46 PM on February 2, 2015


DTMFA
posted by 724A at 1:52 PM on February 2, 2015 [7 favorites]


Google's romance with _________ is over.
posted by freebird at 1:54 PM on February 2, 2015 [13 favorites]


Google can ask that you not do certain things

Google can ask for whatever they want. Doesn't necessarily mean anything legally.
posted by ryanrs at 1:56 PM on February 2, 2015 [4 favorites]


Google's romance with _________ is over.

I wonder how their romance with humanity is doing these days? If the answer is not too well, their purchase of military robotics firms takes on a sinister tone. Let's hope they still have a soft spot for us erratic meatsacks.
posted by acb at 2:09 PM on February 2, 2015 [5 favorites]


Metafilter: a sinister tone. Let's hope they still have a soft spot for us erratic meatsacks.
posted by lalochezia at 2:11 PM on February 2, 2015 [2 favorites]


I read Jessamyn's response before I read the other piece. By MeFi's own waxpancake, eh? Wish I'd known. Good piece, waxpancake! Thanks for posting that, metaquarry!
posted by Bella Donna at 2:11 PM on February 2, 2015 [1 favorite]


At least there were no kids involved in the divorce.
posted by dances_with_sneetches at 2:11 PM on February 2, 2015 [1 favorite]


Somewhat related (internet archiving), Chilling Effects got tired of coping with the onslaught of DMCA requests and chose to de-index notifications. Yay, teh Google
posted by bolix at 2:13 PM on February 2, 2015 [1 favorite]


I work in library software, primarily on so-called ILSs, or integrated library systems.

The ILS is an odd combination of an inventory management system (keeping track of the library's stuff), a customer relationship management system (keeping track of the library's patrons and what they request and check out), and an imperfect repository of some of the library's unique intellectual contributions to humanity: the records that not only describe the books, videos, maps, and everything else that the library collects, but how they are related to each other via topic, authorship, and dozens of other attributes. Modern ILSs do quite a few other things, accreting functionality over time.

As you might expect from such a laundry list of purposes, every ILS I know of is a more-or-less patchy thing, using spit and baling wire to join functions that perhaps should not be joined. That probably describes most software running complicated institutions, of course.

For a long time, there's been a thread of uncertainty? self-doubt? among many participating in library software development about how well what we write actually serves libraries. After all, if you want to find something nowadays, do you think to go to your library first? Or do you just type into the simple search box kindly provided by Google and get whatever Google provides?

And if most people do that, are libraries left dancing to the tune that Google (or Yahoo, or Bing, or in the longer term, whatever succeeds Google) plays? If so, how does that square with library goals of being a long memory and serving all who stop by, equally, for free?

And there's no question that libraries have a lot of technological tricks to learn from the likes of Google. Library software is saddled with dealing with an antiquated metadata format, MARC, and the library profession can be both extremely forward-looking (MARC was created long before many other industries were even thinking of structured metadata) and conservative (the profession's discussions on what will succeed MARC have taken years, with no end really in sight).

The idea of ditching that all, and running with how the big, successful commercial players do their thing, can be seductive. Why write a library catalog search engine if you can Google things (and in fact, since 2003, selected library records have been sent directly to Google)? Why not crib ideas from Amazon's bookstore?

And I think some of that is fine: library software is a niche, and the entire set of developers employed by libraries and nonprofit and for-profit organizations serving libraries is dwarfed by the armies that Google can muster. If one stays in the niche, it's easy to miss things.

On the other hand... Google is not forever. Not that I expect the company to disappear any time soon — but it clearly has and will continue to change its focus. And when a focus shift occurs: who will remember the past?
posted by metaquarry at 2:14 PM on February 2, 2015 [35 favorites]


Google is not forever.

In fact, they're quite infamous for abandoning projects and services, even popular and well-used ones, for opaque corporate reasons.
posted by ryanrs at 2:20 PM on February 2, 2015 [18 favorites]


Google Books used to be a tremendous resource. I remember researching the etymology of phrases and finding early (19th, 18th century) publication records in GB. When I've tried that recently, I get almost nothing in terms of historical results (and of course the problem isn't just Google but publishers and the whole twisted state of intellectual property in the US).

I took a graduate-level bibliography and research methods course just before Google Books started its nose-dive. We met in the Harry Ransom Center at the University of Texas. Our professor gave us a series of exercises asking how we would go about locating certain information. One question dealt with identifying booksellers and their stocks in 19th century Austin as well as other material related to the regional book trade during the period.

Some catalogs & ephemera dealing with this information were held (and cataloged) at various libraries in Austin, and many of the students located these catalog listings. Most everyone also noted that Google Books contained full scans of city directories and bookseller advertisements. Now some of that material is available in hard copy, but the potential of the Google Books project was apparent in terms of full-text searching & full document image access. These benefits were especially helpful for place-bound, unfunded researchers who can't travel to and spend days sifting through material at an archive.

Other databases exist with online access to similar historical material (and your local public library may or may not have access to them), but the breadth of Google Books coverage and facility of search were both quite useful. My go-to resources now are archive.org's texts collection and the HathiTrust, though I still find some items in Google Book searches too (e.g., I was recently interested in the history of the office supplies trade in the US and found The American Stationer where GB and HathiTrust seem to hold the most volumes vs. archive.org due, I think, to different sources, in this case NYPL vs. Smithsonian).
posted by audi alteram partem at 2:36 PM on February 2, 2015 [13 favorites]


People just need to accept that Google is going to focus on its core businesses like self driving cars and airship-based internet, or at least that's what I read on Wave.

Relying on Google for anything long term is a very dangerous thing to do, but that's true of pretty much any business. This is why we used to establish public institutions.
posted by LastOfHisKind at 2:41 PM on February 2, 2015 [32 favorites]


Never trust a corporation.

(fixed that for you)
posted by sammyo at 2:43 PM on February 2, 2015 [4 favorites]


By MeFi's own waxpancake, eh? Wish I'd known. Good piece, waxpancake! Thanks for posting that, metaquarry!

Yeah Andy's article was great and I found myself sending him links "And over HERE is where Google did THIS and said THAT and then LATER did THIS OTHER THING" and I realized I should just write it up in a short piece of my own.

And at some level it's not that surprising that Google is like this. They are a company. They shift in their priorities and that's to be expected somewhat. But what is always annoying is how they try to partner with all sorts of different people and be all "Hey we can help you with this certain aspect of what you're doing. We are big and rich and smart. Let's talk" and then they just ...slow fade like I said.

Like I don't think Google actively was like "Fuck this, libraries suck" but that they saw an opportunity, worked on it, it got complicated (years and years of lawsuits and FOIAble contracts thanks to some of the libraries being public entities) and they were no longer the only people doing this and the complicated parts were a lot less fun than just going off and doing something else like (on preview) fun self-driving cars.

Because doing things for the public, with the rules that actually govern The Public is a little challenging and difficult. And breezy tech disrupters don't like to actually figure out how to serve the "last mile" people, whoever they are. The public library serves them all (nearly all) and it's less shiny but one could argue has more utility in the long haul.

Thanks metaquarry, that was really interesting.
posted by jessamyn at 2:44 PM on February 2, 2015 [59 favorites]


In fact, they're quite infamous for abandoning projects and services, even popular and well-used ones, for opaque corporate reasons.

I'd love to know where you get your data on the usage of Google's services.
posted by GuyZero at 2:52 PM on February 2, 2015 [1 favorite]


GoogleBooks has been essential for my research--I even thank the GoogleBooks scanners in my second book's acknowledgments!--but its treatment of out-of-copyright books has never made any sense. Like, ever. Books available to view via archive.org or HathiTrust in full text cannot be seen at all in GoogleBooks, even though GoogleBooks did the original scans. Quality control is non-existent--many books have illegible pages, pages with the notorious fingers, and so on. Moreover, books once available suddenly vanish without a trace, in ways that conform to no immediately visible pattern (e.g., there doesn't seem to be any connection between a book disappearing and one of those predatory publishers scraping the content and putting a copyright on it). And the ability to search your own library vanished some years back. Just incredibly frustrating all the way around.
posted by thomas j wise at 2:54 PM on February 2, 2015 [5 favorites]


The "digitize and index all the (old) things" was the one thing Google was doing that I was really enthusiastic about. Thank God for the CDNC.
posted by entropicamericana at 3:10 PM on February 2, 2015 [1 favorite]


Google's mission to "organize the world's information and make it universally accessible and useful" (which Larry seems to be backing away from) seems like a natural fit for partnering with libraries.

The problem is that its corporate goals and short-term agendas are incompatible with a mission that involves the long-term archiving and accessibility of information. Google's interests are always served by the collection of more information, but its interests are not always served by making that information available to users, certainly not always in the formats that users want.

Google needs to decide what it wants to be for the next decade and needs to clearly communicate that to its users, partners, and investors. We all rely on it too much to not have some idea where it's going and what it is leaving behind.
posted by zachlipton at 3:11 PM on February 2, 2015


I'd love to know where you get your data on the usage of Google's services.

Were you around here when THEY KILLED GOOGLE READER (shaking with rage my god I'm still not over it)
posted by JHarris at 3:12 PM on February 2, 2015 [38 favorites]


Hey, remember when Google was all about not being evil?

Of course you don't. And neither does Google.
posted by tommasz at 3:22 PM on February 2, 2015


Were you around here when THEY KILLED GOOGLE READER (shaking with rage my god I'm still not over it)

I think you and Google have different definitions for popular and well used. Especially on the scale of Google. That's at least what I got from the thread on Metafilter about reader.
posted by zabuni at 3:23 PM on February 2, 2015 [6 favorites]


Google's most popular function is probably technically illiterate people using it to tell their computers to please go to Yahoo.com Facebook.com.
posted by entropicamericana at 3:28 PM on February 2, 2015 [5 favorites]


I'd love to know where you get your data on the usage of Google's services.

All the people that complained when stuff disappeared? Also I worked there.
posted by ryanrs at 3:43 PM on February 2, 2015 [12 favorites]


Well done.

I guess I'd been burnt by my previous bad relationships. Sony embedding DRM on their CDs. Apple making products that were designed to fail after a certain number of years so you would be forced to go and buy a new one. Windows anything.

So when Google started courting me, I was skeptical. I didn't trust all those romantic gestures and sweet nothings whispered seductively in my ear. While other librarians around me were enthusiastic about Google snapping up digitised newspapers and hoping eagerly for more, I was glumly predicting an eventual pay-per-view scenario. (Okay I was wrong; they've just buried them in obscurity.) Others raved about all the books you could read online and I cynically predicted that students would just use the search function to find a particular quote and take it completely out of context because they couldn't be bothered reading the whole book.

I admit, sometimes I succumbed. A candlelit lobster dinner is a candlelit lobster dinner, after all, and if I could exploit Google's starry-eyed efforts for my own or my users' benefit, of course I would! But I knew there would be a price to pay. I just didn't realise it would be indifference rather than exploitation.
posted by Athanassiel at 3:43 PM on February 2, 2015 [4 favorites]


It's not that I think Google has an obligation to maintain Books or Reader or anything else. Clearly when something no longer fits their vision, they don't have to keep doing it, and that's fine.

But if they're going to build stuff that doesn't last, then I wish they would build it in such a way that they could sell off that segment of the business, or open up the code to the public, or otherwise let someone else continue where they left off.

It's that pattern of building cool things, and then just letting them fizzle in such a way that no one can salvage the pieces that is really disappointing.

When I have a choice, I don't use their products (besides the search engine) for this reason. I always try to support whatever the alternative to Google is for a given purpose, because odds are the alternative will be around longer.

If Google wants to be a short-term serial monogamist, fine, no judgment. But I'm looking for a life partner, and we are just not a good match.
posted by Bentobox Humperdinck at 3:48 PM on February 2, 2015 [19 favorites]


Google also killed their old newspaper archive. It still comes up in search results (sort of) but you can't just go to one page and search the whole archive.
posted by interplanetjanet at 3:55 PM on February 2, 2015 [3 favorites]


I enjoyed the line [w]e ... brushed up on the five love languages, because I remember the early phase when they'd decided to only scan books in EFIGS -- English, French, Italian, German, and Spanish -- or as they put it, "the Romance languages."
posted by uosuaq at 3:59 PM on February 2, 2015 [6 favorites]


zabuni: I think you and Google have different definitions for popular and well used. Especially on the scale of Google. That's at least what I got from the thread on Metafilter about reader.

We might. My definition is based on what I see with my eyes, and use every day, and what a ton of other people said in that thread. If someone takes something I rely upon away from me I'm going to get pissed, regardless of how much crap they, and their defenders, shovel onto my plate about who else doesn't use it, because I'm not them, and I have a responsibility to my own perspective -- if I don't stick up for me, who will? I've done a ton of self-negation in my life and there's a time to put it away.

What was that, some of you might be thinking? If I didn't pay for it I should expect this? Doesn't fly. Because of course I paid for it. I paid for it in good will. While Google Reader was flying, I couldn't really hate them all that much, because hey, every day I was using a prime example of them at their best. Well, it's not here anymore, and now I can't even point to it as an example to give people, something I can say "yes, that evil stuff might be true, but there's still this." There's not a lot of "still this"es left.
posted by JHarris at 4:20 PM on February 2, 2015 [2 favorites]


To see how it should be done, check out Trove from the National Library of Australia! I've found it to be invaluable.
posted by orrnyereg at 4:22 PM on February 2, 2015 [24 favorites]


Trove is amazing. Every time I look at it I weep for what digital content archives COULD be.
posted by jessamyn at 4:28 PM on February 2, 2015 [5 favorites]


In fact, they're quite infamous for abandoning projects and services, even popular and well-used ones, for opaque corporate reasons.

I don't think the reasons are really that opaque at all, though. I think they boil down to one reason: can they effectively make money off of something? The answer to this question may change over time, because of changing overhead, new managers coming to different conclusions than their predecessors, etc. But I don't think this is opaque. It's what we should expect from any for-profit business.

I like Google, and I use a lot of Google products. But I don't want to have any illusions about my relationship with a giant moneymaking concern. As long as our interests coincide, things are good for me. The minute they're not, they're going to kick me to the curb. I think, based on the nature of Google's overall business, their interests are more likely to coincide with mine than, say, Apple or Microsoft. But that doesn't make them better or worse, just ... different.
posted by me & my monkey at 4:33 PM on February 2, 2015


interplanetjanet, though the search doesn't work (at least I can't get it to) you can get the complete list of archived digitised newspapers and browse through them by date. You can also link to individual newspaper titles that way, which is what we've done for the ones most relevant for our patrons. /ex-newspaper librarian
posted by Athanassiel at 5:08 PM on February 2, 2015 [2 favorites]


Is there any way to search a range of dates, though? That's what I haven't been able to discover a substitute for—if I put in a custom range of pre-internet dates, I get no results, and so for example if I'm specifically looking for something from the 19th century most of my search results are frequently just chaff I have to painstakingly pick through.

I sometimes throw in additional search terms I would only expect to find in a 19th-century newspaper, but even when that works who knows how many hits I'm missing out on...
posted by XMLicious at 5:24 PM on February 2, 2015 [1 favorite]


Sssh, orrnyereg, ixnay on the Ovetray. if we don't talk about it then no-one will try to privatise or off-shore it. (One of my favourite notes in the Wikipedia page on Trove is "The reach of the newspaper archives makes the service attractive to genealogists and knitters.")

Back to Google, they love new, they love innovative and, as noted, this can mean that projects that become more difficult suddenly disappear. The only problem is that Google-scale is so large that a number of people who are small to Google forms a group that is large by usual standards. I've had a lot of good stuff happen while working with Google on projects but you can really see how much of what they're doing is "let's do this now and make it work", leaving some of the harder work for later. And sometimes later never comes and the service goes away.

If you know this and go into it, eyes open, then you won't get hurt but it's a shame when something as important as the data storage and curation of the informational output of an entire species, and associated civilisations, isn't quite cool enough to keep things going.
posted by nfalkner at 5:36 PM on February 2, 2015 [1 favorite]


Bah, google would never discontinue anything of real world importance.
Google Health has been discontinued

Google Health has been permanently discontinued. All data remaining in Google Health user accounts as of January 2, 2013 has been systematically destroyed, and Google is no longer able to recover any Google Health data for any user. To learn more about this announcement, see our blog post, or answers to frequently-asked questions below.
posted by benzenedream at 6:16 PM on February 2, 2015 [6 favorites]


Jill Lepore, The Cobweb: Can the Internet be archived? - "The Web wasn’t built to preserve its past; the Wayback Machine aims to remedy that."
The average life of a Web page is about a hundred days. Strelkov’s “We just downed a plane” post lasted barely two hours. It might seem, and it often feels, as though stuff on the Web lasts forever, for better and frequently for worse: the embarrassing photograph, the regretted blog (more usually regrettable not in the way the slaughter of civilians is regrettable but in the way that bad hair is regrettable). No one believes any longer, if anyone ever did, that “if it’s on the Web it must be true,” but a lot of people do believe that if it’s on the Web it will stay on the Web. Chances are, though, that it actually won’t. In 2006, David Cameron gave a speech in which he said that Google was democratizing the world, because “making more information available to more people” was providing “the power for anyone to hold to account those who in the past might have had a monopoly of power.” Seven years later, Britain’s Conservative Party scrubbed from its Web site ten years’ worth of Tory speeches, including that one. Last year, BuzzFeed deleted more than four thousand of its staff writers’ early posts, apparently because, as time passed, they looked stupider and stupider. Social media, public records, junk: in the end, everything goes.
posted by the man of twists and turns at 6:23 PM on February 2, 2015 [1 favorite]


They also abandoned Google Answers for no apparent reason.
posted by mmiddle at 6:44 PM on February 2, 2015 [1 favorite]


Is there any way to search a range of dates, though?
Not on Google News Archive, no. There used to be a work-around in the advanced news search which would let you do that, but it was always a very dodgy search. Looking for Melbourne in the Age, for example, would deliver no results (unlikely given the Age is published in Melbourne). Now that doesn't seem to work at all and browsing by date seems to be the only way to access that historical content. Either that or their searching has gotten a lot worse.

There are a lot of other sites that do offer more in the way of searching, but might not have the content or coverage you're after. There's also a lot of subscription databases that you might be able to access through a library that subscribes to them. It's probably getting a little too specific for a comment, but if you memail me I'm happy to see if I can come up with any suggestions for you.
posted by Athanassiel at 6:52 PM on February 2, 2015 [1 favorite]


Extra funny for me because I just watched the movie Beginners yesterday.

These days I filter all outgoing Google links into /dev/null and am a fan of DuckDuck. If I could afford to live nearby, I'd be delighted to work at Internet Archive.

But I do owe Google for one thing: the exposure to what research is like when horizons are almost unlimited. (I was able to use Google Books to research a question about Audel in much more depth than even the NYPL's famed staff had previously been able to help me with.)

Let's make that possible for everyone everywhere forever. (See, I can still dream too. I can thank libraries for helping me hold onto that for so long.)

Maybe it could be named for Paul Otlet so it wouldn't take me a half-hour to find his name again - like it just did.
posted by Twang at 7:08 PM on February 2, 2015 [2 favorites]


Something weird I find with google books is that I can sometimes get a book in the US but not in the UK, even when the text should be public domain in both places.
posted by Peregrine Pickle at 7:14 PM on February 2, 2015


This article is so true. I have been in the room on a number of occasions with parties involved in similar sorts of relationships - that is, between libraries/librarians, and what might be called the computer science/engineering partners/suitors (none of them google btw) - and almost without exception, the latter have tended to view librarians as people who arrange things on shelves, and libraries as poorly run databases. 'Just wait until we improve your current inefficient practices with our shiny new computational tools and services,' is the line, and all goes well until it is finally realized that it's not as easy as all that, and that libraries and librarians actually do very complex things indeed, things that are hard to model successfully in build-it-from-the-ground-up code. This has been demonstrated in projects with no money, and projects with Very Large Sums Of Money Indeed. FWIW, judging by the discourse, I think a lot of people who used to be involved in this are now involved in pushing 'Big Data.'

Anyway, here's an oldie but goodie from 2009, and Geoff Nunberg: Google Books: A Metadata Train Wreck.
posted by carter at 7:21 PM on February 2, 2015 [2 favorites]


This story felt very topical after reading earlier about the Governor of Indiana proposing massive cuts to the Indiana State Library including cutting a specific service with the argument that it could be replaced by using Google Scholar, Microsoft's Academic Search and ancestry.com. Even currently, that ignores the question of restricted access to search results but it also seems particularly dangerous to rely on a commercial offering without any sort of agreement.

I am probably biased since I work for — but do not speak for in any capacity — a library on a project which also makes scanned items freely available online, albeit on a much smaller scale than Google books. I've found it disturbing in the past to hear comments at meetups, conferences, etc. where senior-level people from many organizations mused about outsourcing everything to Google Books to avoid the need to have in-house expertise or, often, even simple IT resources for things like bulk OCR and full-text search. That talk has mostly dried up in the last couple years as mixed experiences have emerged to temper the initial enthusiasm and, perhaps unsurprisingly, it turns out that a LOT of librarians used Google Reader and they have not forgotten what happened to it.

On a similar note, I love what the Internet Archive does – and donate yearly – but there's a similar concern about relying too much on any single organization. After some recent consolidation in the web archiving field, a colleague wrote a great piece musing about the Library of Alexandria v2.0 and how critically important it is for not just the data but also the domain knowledge and experience to be spread widely.

Applying similar logic, the optimistic take is that some warning signs with Google might lead to a healthy discussion about the importance of not letting current enthusiasm lead to single points of failure before we have any particular disasters to prove the point.
posted by adamsc at 7:43 PM on February 2, 2015 [7 favorites]


tavella: Not really? Google owns the photos even if the item itself is in public domain.

I’m not a lawyer, but, for the US at least, this is probably untrue: a faithful 2D photograph of a 2D original is not itself copyrightable (even if it requires work to create), so if the original is public domain the reproduction is also. This was covered by Bridgeman Art Library v. Corel Corp. (1999) which found that “a photograph which is no more than a copy of a work of another as exact as science and technology permits lacks originality.” This is the rationale Wikimedia use to justify their use of photographs of public domain images.

Copyright is hard.
posted by Quinbus Flestrin at 10:23 PM on February 2, 2015 [6 favorites]


I'm sad about Google Books. I know some of the folks who started it; it was an idealistic project of bringing the power of digitization and archiving to old print books. But as Jessamyn's piece notes, the details of copyright and contracts and pushback from traditional publishing make it complicated. And it wasn't going to make any money. You can't trust a for-profit company to do a major project like this out of the goodness of their hearts.

You can trust a mission-driven non-profit though. Like the Internet Archive's book scanning efforts. That organization is amazing, and it's not all just fun browser-hosted DOS games. Brewster is quite sincere about creating a modern Library of Alexandria (with replication, to avoid the obvious failure mode.) The organization is worth your support.

I worry about the archives that Google uniquely owns. The web crawl, for instance. Back in the monthly crawl days they were carefully archiving copies of everything every month, they had the data to set up something like Archive's Wayback Machine. That got more complicated with realtime crawls, but IIRC they still have a snapshotting capability. I hope they are keeping them safe somewhere for use some day. No one else really can; even Archive.org's copy is pretty limited.

Another archive Google has near-uniquely is Usenet. Eugene Spafford's earliest Usenet archives are available on archive.org, but there's a middle-ground in the mid-to-late 90s that is unique. Google Groups barely hangs on as a product, and as Andy's piece notes they fucked up the search pretty bad on it. But it survived the Dark Google Plus years, so maybe Groups will have a future.
posted by Nelson at 10:25 PM on February 2, 2015 [4 favorites]


Jessamyn: this was beautiful.

Metaquarry: I used to build ils software, and went through exactly the same doubts as you outlined in your post (dealing with unimarc and z3950 this side of the Atlantic). Thanks for putting it in words so precisely.

Google's embrace, extend, extinguish looked different than its predecessor, but the end result seems eerie. Let's hope the internet archive is the phoenix to their internet explorer
posted by motdiem2 at 11:14 PM on February 2, 2015 [2 favorites]


I'm not a lawyer, but, for the US at least, this is probably untrue: a faithful 2D photograph of a 2D original is not itself copyrightable (even if it requires work to create), so if the original is public domain the reproduction is also. This was covered by Bridgeman Art Library v. Corel Corp. (1999) which found that "a photograph which is no more than a copy of a work of another as exact as science and technology permits lacks originality."

Yeah, and if you think about it, the other way lies madness. If you could assert copyright in a photograph of a public domain image because you'd taken the photograph, you could have a hypothetical scenario where somebody destroys a public domain original in order that their copyrighted photograph of it is the only extant copy. (Step 4: Profit!)

Note the phrasing on Google's page that Jessamyn linked in her article. They're Guidelines, not terms and conditions. "We do ask that you follow some basic guidelines regarding their use." Ask, not require, because they can't require it. Please follow these guidelines, so that we benefit from the public domain work of others - pretty please?

No. Strip the watermark, rehost them, redistribute them, and stick two fingers up to the Google Terms of Service. The public domain means something.
posted by rory at 2:15 AM on February 3, 2015 [3 favorites]


Lovely piece of writing by Jessamyn.

"Google owns the photos even if the item itself is in public domain."

No! The short version is that scanning a public domain document produces a public domain scan. No copyright is created.
posted by LarryC at 3:14 AM on February 3, 2015


Strip the watermark, rehost them, redistribute them, and stick two fingers up to the Google Terms of Service. The public domain means something.

The Internet Archive does host a fair number of the PD Google books (with the "guidelines" intact), so I often tell people that if you find something on Google, you should look for it on IA. And as I mention in the article, and as people here mostly know, IA pays me (a little) to help keep Open Library barely afloat. It's the front end to some of the book scanning stuff (which, along with Archive-It, is a good revenue stream for IA), but IA is starting their own parallel lending thing, which is making things a little confusing. And I trust IA. At the same time, they don't have the resources, the human resources, to really beef up their offerings. So we have a "Search Inside" feature that doesn't work if you type more than one word in the search box and a Subject search that fails in a similar fashion. And we've put trouble tickets in, but in a project of their scope and magnitude, these issues don't float to the top very often. They have a new head of engineering who I really like and am hopeful about.

So it's challenging. Because even though I can't think of a better organization than the Internet Archive to do this (and other) important work, they're still more of an archive than what I think of as something that is truly a library: a place where there are people who can help you find things, a place where not just having the stuff but making the stuff available, to everyone, is their reason for being. IA made great strides with their new design and I'm hoping they'll take big steps with the functionality of their Search features so that people who want things they have can find, manipulate and interact with that content. The one thing Google got right is realizing the primacy of the search experience to most people's user experience of trying to get at information.
posted by jessamyn at 7:41 AM on February 3, 2015 [6 favorites]


who will remember the past?

or remember that it was even forgotten?
posted by kliuless at 9:14 AM on February 3, 2015 [3 favorites]


From way upthread:
>>Were you around here when THEY KILLED GOOGLE READER (shaking with rage my god I'm still not over it)

> I think you and Google have different definitions for popular and well used. Especially on the scale of Google.


The thing is, killing Google Reader saved them no money - "on the scale of Google", as you say, it was beneath notice, but the segment of people who loved Google Reader (a) were readers and traffic drivers and (b) will never forgive Google for killing it.

I hear that it was an effort to drive traffic through Google+ instead (how's that working out?) but I can't believe the increase in traffic was more valuable than harvesting the information from people's usage of Reader. Which articles get starred? Which get saved to re-read or read later? Which show up later on in the Twitter firehose? Google couldn't monetize that treasure trove of information to pay the pennies to keep a Google Reader server ticking in a closet somewhere?

No, this is another case of a corporation with a short attention sp

I love that username, forgive me.
posted by RedOrGreen at 1:18 PM on February 3, 2015 [3 favorites]




It's not that I think Google has an obligation to maintain Books or Reader or anything else. Clearly when something no longer fits their vision, they don't have to keep doing it, and that's fine.

The city has just decided that it's not that much fun shoveling shit and filtering the water supply anymore in smelly old cesspools, so their employees are going to work on drawing new road plans for driverless cars instead.

Public goods and services != free goods and services.
posted by infini at 2:42 AM on February 5, 2015 [1 favorite]


Were you around here when THEY KILLED GOOGLE READER (shaking with rage my god I'm still not over it)

Killing the useful parts of Google Code was the last straw, for me. Moved my projects off and onto GitHub and Bitbucket, and never looked back.
posted by a lungful of dragon at 11:22 AM on February 5, 2015 [1 favorite]


Come to think of it, the reliance of open-source science software projects on Google Groups for discussion threads bothers me, too. A treasure trove of useful discourse and code could seemingly go up in a puff of entropy, should Groups suddenly be the latest project put on the company's chopping block. I guess a mirror could be reconstructed after the fact from people's email archives, like a forensic crime scene...
posted by a lungful of dragon at 11:29 AM on February 5, 2015


People interested in web archiving should check out Perma.cc:
When a user creates a Perma.cc link, Perma.cc archives a copy of the referenced content, and generates a link to an unalterable hosted instance of the site. Regardless of what may happen to the original source, if the link is later published by a journal using the Perma.cc service, the archived version will always be available through the Perma.cc link.
Memento is a Chrome and Firefox extension that lets users browse web pages in time.

I learned about both from Jill Lepore's article in The New Yorker: Cobweb
posted by the man of twists and turns at 1:11 PM on February 5, 2015 [2 favorites]




Speaking of Google killing services, they just axed Google Talk. I guess they're trying to force people to use Hangouts (good luck with that).
posted by ryanrs at 12:33 PM on February 10, 2015 [2 favorites]



