Surveillance Capitalism in the Library and Lab
May 23, 2021 8:17 AM

"For some time now, the major academic publishers have been fundamentally changing their business model with significant implications for research: aggregation and the reuse or resale of user traces have become relevant aspects of their business. Some publishers now explicitly regard themselves as information analysis specialists. Their business model is shifting from content provision to data analytics. This involves the tracking – i.e. recording and storage – of the usage data generated by researchers (i.e. personalised profiles, access and usage data, time spent using information sources, etc.) when they utilise information services such as when carrying out literature searches."

In the meantime, academic publishers get bigger through mergers and acquisitions. Clarivate is acquiring ProQuest, which potentially makes library systems party to yet another research-support-and-data-gobbling "integrated enterprise research platform" to rival other data-devouring behemoths. It is also likely to create more silly brand names haunting academia and the sciences, like Esploro, Symplectic, and Converis.
posted by zenzenobia (15 comments total) 30 users marked this as a favorite
 
BRB buying bitcoin to donate to sci-hub.
posted by tigrrrlily at 9:26 AM on May 23, 2021 [11 favorites]


If it works on kids why not level up?

[I mean, aside from all the reasons]
posted by chavenet at 9:32 AM on May 23, 2021 [2 favorites]


Rescue Mission for Sci-Hub and Open Science: We are the library. - r/DataHoarder
- This effort is completely unaffiliated from Sci-Hub, no one is in touch with Sci-Hub, and I don't speak for Sci-Hub in any form. Refer to Sci-Hub.do for the latest info.

- This is a data preservation effort for just the articles, and does not help Sci-Hub directly. Sci-Hub is not in any further imminent danger than it always has been, and is not at greater risk of being shut-down than before.
posted by Bangaioh at 9:42 AM on May 23, 2021 [11 favorites]


You Won't Believe This One Weird Tip About Magicicada cassinii
posted by dannyboybell at 11:58 AM on May 23, 2021 [3 favorites]


As a researcher myself, I'm not really sure what the hand-wringing is over. Unless it's a lemonade stand, companies collect information on their customers. The authors of this piece claim that these practices could "entail a violation of academic freedom and the freedom of research and teaching", but as far as I can tell they don't describe an existing or hypothesized causal mechanism. Plus, unless I missed something, doesn't this sort of data collection already fall under GDPR?

I suppose my main concern is that resulting recommendation algorithms could contribute to the already-pervasive "rich get richer" phenomenon in academia, wherein high-profile labs, authors, and papers collect more citations by dint of having more citations. But I could just as easily see such data being used to mitigate this kind of problem, and indeed my preferred literature review tool (run by a non-profit) has some of these sorts of features.

Of course, it's possible that I'm insufficiently dystopian-minded. I hate the big academic publishers as much as anyone and I'm not here to defend them, but in this particular case I'm not too disturbed.
posted by hoyle at 12:58 PM on May 23, 2021 [1 favorite]


"Rescue Mission for Sci-Hub and Open Science: We are the library. - r/DataHoarder"

This is reporting that the entire sci-hub database fits in 77 TB of data. Text compresses well, it seems.

That's an achievable amount of storage for an individual, and the data's currently available for download.

I have to say that having a backup copy of the past century's research in all disciplines sitting in my office sounds pretty tempting.
posted by justsomebodythatyouusedtoknow at 2:55 PM on May 23, 2021 [6 favorites]


Not that I'm especially keen to pay $35 for a PDF download, but this was the inevitable result. If you're not the customer, you're the product.
posted by ocschwar at 4:29 PM on May 23, 2021 [1 favorite]


Gabe Newell was right, piracy is a service problem. These companies heard him loud and clear. They took the wrong message from it, however. Their decision was to use the data they were creating to "create new services" but they never bothered to ask if it was anything anyone actually wanted. Of course, they're not worried about whether we, the end users, actually want anything at all, because we're not who they're selling the data to. They didn't ask us; they asked the data brokers, who screamed in unison "give all the data you got!"

To them, it's just another revenue stream, and we're just the angsty hand-wringing populace who doesn't understand how important profit is.

Anyway, I think the main takeaway is that these companies are more aware of their waning public importance than we think, and in a panic they're leveraging every last iota of possible revenue they can to stay relevant and in business. We should listen to them when they show us who they are like this: they are craven capitalists who don't give a damn about science or the betterment of humanity and only give a damn about personal/shareholder profit.
posted by deadaluspark at 5:04 PM on May 23, 2021 [7 favorites]


I despise the for-profit science journals more than almost anything in this world. They are some of the most blatant of corporate parasites, whose profit is based almost entirely on other people's unpaid work and denying the public access to knowledge largely paid for by the public (via taxpayer funded research).

Looking straight at you, Elsevier, the main plaintiff on every major legal case against Sci-Hub (and LibGen).

For-profit science journals are easily in the top five dumbest mistakes the human race has made in the last few decades.

Sci-Hub is one of the greatest things to come out of the whole internet, and Alexandra Elbakyan one of the internet era's greatest heroes.

IMHO
posted by Pouteria at 9:48 PM on May 23, 2021 [16 favorites]


There's a current of data-exploitation FOMO hitting the library sector too, to our shame (I'm a librarian) be it spoken. Too many of us have forgotten young Henry Melnek, or even the Connecticut Four; too many of us don't realize that any data for-profit entities get their grubby paws on, law enforcement can simply buy from data brokers.

("Learning analytics" and researcher "CRIS"es are where you look for this stuff on the academic-library side; on the public-library side, look at library-sector CRMs such as Gale Analytics on Demand, OCLC WISE, and OrangeBoy.)

I'm fighting this with all I have, and I'm grateful not to be alone in fighting it, but wow it's a slog.
posted by humbug at 4:47 AM on May 24, 2021 [7 favorites]


I suppose my main concern is that resulting recommendation algorithms could contribute to the already-pervasive "rich get richer" phenomenon in academia, wherein high-profile labs, authors, and papers collect more citations by dint of having more citations

I'm also concerned about a "rich get richer" phenomenon where people pay to influence these algorithms in order to point scientists towards their preferred research.
posted by little onion at 6:59 AM on May 24, 2021 [2 favorites]


I'm also concerned about a "rich get richer" phenomenon where people pay to influence these algorithms in order to point scientists towards their preferred research.

I got an unsolicited email from what amounts to an academic PR firm (which I refuse to link to; no whuffie for shysters!) just today, as it happens. Pretty sure they web-scrape recent article titles and corresponding-author email addresses to spread their spam.
posted by humbug at 7:52 AM on May 24, 2021


I'm as on-board as anyone to raise a fist and shout, "fuck for-profit academic publishers." But this sure looks like complete nonsense. Commercially operated corporations? Informational self-determination? The logic of infrastructure privatisation and the consequences this entails? The tools used can be flawed, resulting in even more detrimental consequences for individual researchers? Go home, dad, you're drunk again.
posted by eotvos at 10:03 AM on May 24, 2021


Some people really would rather not have data gathered about everything they read or search for. Old-fashioned, I know, but... it once was possible. For a while post PATRIOT (sic) Act, libraries posted warrant canaries. If the signs went down, you would know records had been subpoenaed about what books were being checked out by whom. But now big publishing makes patron privacy increasingly one of those nice things we can’t have.
posted by zenzenobia at 10:12 AM on May 24, 2021 [2 favorites]


So, a couple-three threat models, either current-real-world or near-future.

Threat model one: You get caught in a "fraudulent download prevention" dragnet (a la Aaron Swartz) because you read something that is unusual or out-of-discipline for you. This was suggested by a CISO to a consortium of publishers pushing a specific (and hard to deidentify properly) single-sign-on technology standard.

Threat model two: The likes of Academic Analytics are already promising all kinds of "crack down on those lazy faculty, administrators" capacity, and they'll use whatever data they can get their slimy mitts on to do it. Do you trust your administrators not to use these data to decide whether they think you're doing your job, and constrict your academic freedom if they think you're not? I don't. This also came up recently vis-a-vis the Clarivate-ProQuest merger.

Threat model three: Your reading gets you turned in to ICE, or other law-enforcement agency of your not-choice.

Threat model four: Your reading behavior becomes yet another part of your universal online dossier, sold all over hell's half-acre and used to pigeonhole you and deny you opportunity. Totally already happening.
posted by humbug at 3:26 PM on May 24, 2021 [8 favorites]




This thread has been archived and is closed to new comments