the peculiarities of journal citation data
January 4, 2008 7:25 PM

The scholarly literature forms a vast network of academic papers connected to one another by citations in bibliographies and footnotes. The structure of this network reflects millions of decisions by individual scholars about which papers are important and relevant to their own work. Therefore within the structure of this network is a wealth of information about the relative influence of individual journals, and also about the patterns of relations among academic disciplines. Our aim at Eigenfactor.org is to develop ways of extracting this information.

Borrowing methods from network theory, Eigenfactor.org ranks the influence of journals much as Google’s PageRank algorithm ranks the influence of web pages. By this approach, journals are considered to be influential if they are cited often by other influential journals.
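
To make "cited often by other influential journals" concrete, here is a minimal sketch of a PageRank-style ranking over a toy citation table. It is not the site's actual algorithm: the function name `rank_journals`, the damping value of 0.85, the choice to ignore self-citations, and the three made-up journals A, B and C are all assumptions for illustration.

```python
def rank_journals(citations, damping=0.85, iterations=100):
    """Score journals by a PageRank-style random walk.

    citations[a][b] is the number of times journal a cites journal b.
    Every journal must appear as a top-level key, even if it cites nothing.
    """
    journals = sorted(citations)
    n = len(journals)
    # Start the walk with equal weight on every journal.
    score = {j: 1.0 / n for j in journals}

    for _ in range(iterations):
        # With probability (1 - damping) the reader jumps to a random journal.
        new = {j: (1.0 - damping) / n for j in journals}
        for src in journals:
            # Assumption: self-citations are ignored so a journal
            # cannot raise its own score.
            out = {dst: c for dst, c in citations[src].items()
                   if dst != src and c > 0}
            total = sum(out.values())
            if total == 0:
                # A journal that cites nobody spreads its weight evenly.
                for j in journals:
                    new[j] += damping * score[src] / n
            else:
                # Otherwise weight flows in proportion to citation counts.
                for dst, c in out.items():
                    new[dst] += damping * score[src] * c / total
        score = new
    return score


# Hypothetical citation counts, invented for this example.
toy = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 2},
    "C": {"A": 1, "B": 1},
}
scores = rank_journals(toy)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

In this toy table, A comes out on top because B, itself well cited, sends all of its citations to A. That is the "influential if cited by the influential" idea in miniature.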

C.R. Shalizi, care of the University of Michigan's Center for the Study of Complex Systems, has more on the history of why all this is necessary and how it came to be.
posted by zennie (22 comments total) 17 users marked this as a favorite
This kind of random walk is indeed very close to how PageRank works. You can use the same method to rank college football teams.
posted by escabeche at 7:42 PM on January 4, 2008

I liked this. And I suspect that the price effectiveness calculator is going to be very handy.

Thanks for this.
posted by djgh at 8:31 PM on January 4, 2008

Very cool. Footnotes and bibliographies are the original hypertext links, and the material in printed books and journals far exceeds the web in both size and quality. It's odd Google has not done anything like this with its GoogleBooks and GoogleScholar library. Must be harder than it seems.
posted by stbalbach at 9:18 PM on January 4, 2008

I dunno...

Field Name:

Top Ten Journals in the Field:

1. J Econ Hist
2. Am Hist Rev
3. Econ Hist Rev
4. Explor Econ Hist
5. Slavic Rev
6. Environ Hist
7. J Am Hist
8. J Mod Hist
9. Continuity Change
10. Labor Hist

Not correct.
posted by LarryC at 10:00 PM on January 4, 2008

Google was originally designed to do exactly that. Look up Backrub.
posted by amuseDetachment at 10:45 PM on January 4, 2008

This is a different non-interactive map of the sciences.
posted by a snickering nuthatch at 11:55 PM on January 4, 2008

Funding organisations often use Thomson ISI's citation analysis information to determine which people should get funding.

Wikipedia has a decent set of links on the topic.
posted by honest knave at 12:34 AM on January 5, 2008

Thomson ISI Web of Knowledge (citation index database) is like infocrack to me.

It is amazing: you can follow the citations of current work, e.g. RNAi for gene silencing or DNA based nanotechnology, back in time all the way to the original Watson-Crick discovery of base-pairing. You can then walk forward in time from the WC discovery and see what cited them and what cited the citers and so on, all with hyperlinks to the journals and articles, feeling the tendrils of their influence through the decades, seeing interpenetrating idea-nodes and concept-dendrites flourish and die, moving freely through ideaspace and thoughttime, as cross fertilization between disciplines drags us ever forward to consilience of knowledge....

.......I just had an onanistic datagasm. Excuse me.

Also they (Thomson ISI) charge anywhere from $50-$300k per annum for campuswide access for multiple databases. That excludes startup fees which can run into the 100k's. So it's more like high purity infocoke than cheap ghetto freebase. Needless to say, poor universities like my own can't afford this stuff, and I long for Google's smashing of this particular cartel.
posted by lalochezia at 12:56 AM on January 5, 2008 [1 favorite]

Also, I'm not sure how this addresses 'journal salting'. I should explain: if you want to have a high impact factor or eigenfactor, to be a highly cited or highly networked journal, simply publish reviews. Reviews have many citations and are cited A LOT, bumping up these measurements.

Eigenfactor says they exclude 'review journals' but people like Angewandte Chemie (one of the better chemistry journals at the moment, probably 2nd only to Journal of the American Chemical Society in terms of prestige for a General Chemistry journal) found a way to improve their standing. They 'salt' their journal with minireviews: not enough to be classified as a review journal, but enough to raise their ranking. As their ranking was raised more prestigious people vied to publish there and a feedback loop was set up. This is one of the ways Angewandte went from a relatively obscure journal in the 80's to the powerhouse it is today.

In fact, both in terms of ISI and eigenfactor, Angewandte trumps JACS but people in the field are wise to this: they know the trick. Papers in JACS are noticeably better than Angewandte. This is a classic example of "where easy calculation threatens to overwhelm substantive validity" that people in the field are aware of - but beancounters and granting agencies may not know about.
posted by lalochezia at 1:18 AM on January 5, 2008

Here's my theory of how to identify the most influential articles:
1 Go to the University library.
2 Take the bound journals off the shelves and stack them so you can see the pages, not the spines.
3 Look down the books for pages that are a different colour.
These are the articles that have been most often stolen and replaced by the library, and therefore the ones that are most valuable and important.
(Keen-eyed people will observe many methodological flaws in my system, notably a bias towards the articles on the undergraduate reading list...)
posted by alasdair at 4:22 AM on January 5, 2008 [1 favorite]

Not correct.
posted by LarryC

The "top" journals are different in the map than they are in the search feature:

1. Comparative Studies in Society and History
2. Journal of Economic History
4. Explorations in Economic History
5. Journal of Social History
6. Bulletin of the American Museum of Natural History
7. Journal of African History
8. Environmental History
9. Journal of American History
10. Social Science History
(Search: "history", sorted by eigenfactor score)

I wasn't able to compare my fields' results to the map because there are no specific "Eigenfactor subject" categories for them on the map. Maybe the quirks of the map come down to the way things are categorized; each journal belongs to only one category.

From the FAQ: "...we are interested in mapping science according to what researchers do, not what they say that they do or how they self-identify. One interesting consequence of this approach is that the fields vary widely in size according to their citation behavior. Some fields, such as Tribology (the study of friction) are very small and comprise only a few journals; other fields are very large and contain multiple subdisciplines that might typically be considered separate."
posted by zennie at 6:44 AM on January 5, 2008

There's something I've never understood about the academic citation system - namely, the uber-strictness of HOW you're supposed to cite something. In my uni (at least), you could be penalized if you even got a comma wrong. Yet there are about 1290 styles (and variations across faculties) that could apply, and how they're marked is super inconsistent. I remember getting a Fail (or Near Fail that my lecturer said she was being "generous" on, can't recall) on the referencing section of an assignment when I'd used the exact same referencing style on a different assignment and received a Distinction.

The point of citations is to note where you originally got the information from, right? So why does it matter how you cite the thing? If you're internally consistent (so it's always First Name Last Name, Title, Year the whole time), does it matter if it's not Harvard or APA?
posted by divabat at 6:46 AM on January 5, 2008

There is no price effectiveness correlation (?) between "Business and Marketing" and the arts. hmmm.

Fun tool. Like I don't have enough stuff to keep me from working. Is there some way I can "favorite" every single story posted so far today?
posted by nax at 7:09 AM on January 5, 2008

Oops, forgot the link to the original comment.

metafilter: onanistic datagasm
posted by nax at 7:11 AM on January 5, 2008

Seems like a neat idea, but I have the same problem as LarryC: the actual lists of journals in a given field seem off. I looked at linguistics, where they don't seem to be aware of either Language or IJAL, two of the main journals in the field.

The last link is great (as tends to be the case with Cosma Shalizi); I learned a lot about the history of Markov chains and the life of Markov himself from the paper he links to, "The Life and Work of A. A. Markov" (pdf, Google cache) by Gely P. Basharin, Amy N. Langville, and Valeriy A. Naumov (Linear Algebra and its Applications 386 (2004): 3-26). Markov turns out to be a fascinating guy, a rebel all his life and with a mordant sense of humor; check out this anecdote from the last period of his life (at exactly the time of the Kronstadt rebellion and the Tenth Party Congress that suppressed all opposition to the Communists, when all those lucky enough to be still alive after years of civil war and disease were freezing and starving):
On the 5th of March 1921, A. A. Markov communicated that on account of the absence of footwear he is not able to attend meetings of the Academy. A few weeks later the KUBU (Committee for Improvement of the Existence of Scientists), meeting under the chairmanship of A. M. Gorky, fulfilled the prosaic request of the famous mathematician. Time, however, provided a colorful sequel, of sorts, to this. At the meeting of the physico-mathematical section of the Academy of Science on the 25th May, Andrei Andreevich announced: “Finally, I received footwear; not only, however, is it stupidly stitched together, it does not in essence accord with my measurements. Thus, as before, I cannot attend meetings of the Academy. I propose placing the footwear received by me in the Ethnographic Museum as an example of the material culture of the current time, to which end I am ready to sacrifice it.”
The reference to "A.M. Gorky" reminds me that there are several oddities in the paper from a Russianist's point of view—especially odd because two of the three authors are Russian. There is no "A.M. Gorky"; there is Aleksey Maksimovich Peshkov (A.M. Peshkov), who wrote under the name Maxim Gorky (M. Gorky). It's like referring to "Samuel Twain." And this is really weird: "Markov’s protest [at Gorky's exclusion from the Academy] grew to outrage when the nobleman Duke Dundook was unjustifiably, in Markov’s opinion, accepted into the Academy. Markov wrote a distasteful limerick about the situation, which was dubbed unfit for a lady’s ears." This is a reference to a famous epigram of Pushkin's from a half-century earlier mocking the appointment of the nonentity Mikhail A. Dondukov-Korsakov, Pushkin's "Dunduk," to the Academy of Sciences through his intimate relations with the President of the Academy, Sergei Uvarov ("They say Dunduk is unworthy of such an honor; why is he sitting there? Because he has an ass"). It was often quoted in later years on the frequent occasions when similar nonentities got high positions, and presumably Markov quoted it on such an occasion, but how the authors decided he wrote it is beyond me. Also, the article says "For illustrative purposes Markov applied his chains to the distribution of vowels and consonants in A. S. Pushkin’s poem 'Eugeny Onegin'"; you can talk about "Eugene Onegin" or "Evgeny Onegin," but "Eugeny Onegin" makes no sense. Sorry for the mass of irrelevant quibbling, of interest to no one, but I had to get it off my chest.
posted by languagehat at 8:06 AM on January 5, 2008 [2 favorites]


The linguistics listings are grossly off, as LHat noted; they don't include Language (the flagship journal of the Linguistic Society of America) or Linguistic Inquiry (home of all things Chomskyan). The psychology list includes the usual suspects--mostly review journals like Psych Bulletin and reviewish journals like Behavioral and Brain Sciences. What kind of "review" journals do they think they're excluding--Entertainment Weekly?

There's a lot of noise in ranking methods including this one and the Thomson ISI ones. They're interesting enough until promotion committees and funding agencies forget this. Which they do. It's a "we'd rather have something than nothing even if it's flawed" situation.

posted by cogneuro at 10:31 AM on January 5, 2008

reminds me of Middlemarch, and how whatshisface got lost for years in the collecting and sorting of all his notes and references.
posted by amberglow at 2:54 PM on January 5, 2008

This is a field called "bibliometrics." For more information, of course, Wikipedia can get you started.
posted by rachelpapers at 7:31 PM on January 5, 2008

Speaking of Markov....
posted by zennie at 9:14 PM on January 5, 2008

posted by LobsterMitten at 9:23 PM on January 5, 2008

History, check. Anthropology, check. Linguistics, check. Sociology, check. Education, check.

English Literature? Apparently, not a subject anyone publishes on -- who knew?
posted by jrochest at 7:50 PM on January 7, 2008

I think it's meant to be sciences and the other disciplines they inter-refer with. No philosophy category either, except philosophy of science. I'm guessing history et al come in as social sciences, or history of science, etc.
posted by LobsterMitten at 8:03 PM on January 7, 2008

