You may have heard about n-grams, which identify particular strings of text in a large corpus (an n=3 n-gram could be "plate of beans"). You probably have played with Google Ngram search which lets you look through millions of books to see the first use of the phrase, or when it was most popular (though be warned, recent research shows some limitations, such as the false popularity of a certain expletive in the 1700s). The newest is the Reddit ngram search by 538, which lets you chart the rise and fall of things progressive and regressive. I await more insights in the discussion...
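For anyone who wants to see what counting n-grams amounts to, here is a minimal Python sketch. The function name `ngrams` and the naive lowercase, whitespace-only tokenization are assumptions made for illustration; this is not how Google or 538 actually process their corpora.

```python
from collections import Counter

def ngrams(text, n=3):
    """Return every run of n consecutive words in text as a tuple.

    Tokenization is a naive lowercase whitespace split (an assumption);
    real corpora need punctuation and casing rules.
    """
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Count how often each 3-gram appears in a toy "corpus".
counts = Counter(ngrams("a plate of beans is a plate of beans"))
print(counts[("plate", "of", "beans")])
```

Charting a phrase over time would then be a matter of keeping one such counter per year and dividing by that year's total number of n-grams.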
Despite its aging interface and its slightly misleading name, The Old Fulton New York Postcards site is an amazing tool for anyone doing any kind of historical research. It is a huge searchable archive of American and Canadian newspapers.
"If Google sees that you're searching for specific programming terms, they'll ask you to apply for a job. It's wild." "I typed 'request' and half expected to see 'Follow the white rabbit, Max.' Instead, the screen displayed a paragraph outlining a programming challenge and gave instructions on how to submit my solution. I had 48 hours to solve it, and the timer was ticking."
In 1945, Vannevar Bush described a physical storage, search and retrieval system that worked like an early hypertext. He called it a memex. Earlier this year, DARPA released the open-source components for its own project named Memex, a powerful engine for searching the deep, dark web. [more inside]
I listen to one of the two or three key brains behind the Search algorithm itself, Ben Gomes, who speaks 10 to the dozen of “natural language generation” and “deep learning networks” (and, inevitably, of the “holy grail” of answering users’ questions before they have been asked). [more inside]
What you do when apartment hunting online, and what a lot of people do, I imagine, is you plug in your preferred neighborhood/price range/amenities/etc., and then out pops a long list of results that you further refine by imagining a very specific and very fictionalized narrative involving a version of yourself that isn’t necessarily true right now but could be true if you lived in apartment X. No, you’ve never wielded a wrench for any longer than the time it takes to pass it to your dad, but why couldn’t you fix a fixer-upper? Or be the kind of person to share one bathroom with six other roommates? Or live with a Ukrainian family that’s gone for five months out of the year, but whose kids you’re expected to babysit as per your new rental agreement?
The Wall Street Journal reports on how Google favored its own shopping, travel services over rivals, and the U.S. antitrust probe of Google:
The 160-page critique, which was supposed to remain private but was inadvertently disclosed in an open-records request, concluded that Google’s “conduct has resulted—and will result—in real harm to consumers and to innovation in the online search and advertising markets.”
Is Google an unelected superpower? A truly sinister social networking platform could manipulate public opinion even more effectively. (Previously)
Now in open beta, SCOTUS Search allows users to "search the text of 1,424,780 individual statements within 6,683 Supreme Court oral arguments." [more inside]
How A Chicago Man Hampered His Own Rescue From The Columbia Icefield, And What Searchers Learned From Him.
When you ask members of the Jasper Parks Canada visitor safety team if they remember the search for George Joachim, a common response is a deep sigh, and something like: “Ah yes…George.” Four years later, the name still conjures head shaking and wary glances. ... Joachim unintentionally misled searchers by listing his destination incorrectly in the climber’s registry, and then behaved so unlike other people previously have in his circumstance that he was repeatedly missed in the search. Parks Canada’s search and rescue community considers his case a valuable learning experience and have since tweaked search protocols to account for other behavioral outliers.
via BLDGBLOG: Algorithms In The Wild
Search Engine Land (December 27, 2014): "The Yahoo Directory, the core part of how Yahoo itself began in 1994, officially closed today, five days ahead of when Yahoo had said the end would come." The Internet Archive save of Yahoo for October 1996. [more inside]
This morning, the Supreme Court released an opinion (pdf) in Heien vs. North Carolina, finding that because the Fourth Amendment requires government officials to act reasonably, not perfectly, and gives those officials “fair leeway for enforcing the law,” an officer in North Carolina did not act unconstitutionally when they stopped and searched a car driving with a broken brake light, even though North Carolina law requires only one vehicle brake light to be working. [more inside]
In 1961, one dogged black woman took a stand against illegal police tactics. Today the fine folks at The Marshall Project profile one very important American you probably know almost nothing about. [more inside]
Aunt Bertha is a web-based platform that connects Americans in need to locally available government programs, non-profit organizations, and community-based resources that offer free or low-cost assistance with health and dental care, job placement, emergency and long-term shelter, clothing and household goods, child and elder care, legal aid, assistance with navigating the social safety net, and much more. All programs are searchable and sortable by ZIP code, city, or eligibility. Find food, health, housing, job training programs and more, anywhere. [more inside]
Following a record-breaking $750 million syndication deal with parent company Fox, the FXX network most recently made headlines back in August with its twelve-day marathon of Every. Simpsons. Ever. But that was just the prelude to the real deal launching today: Simpsons World, a staggeringly comprehensive multiplatform video database including clips, news, featurettes, curated playlists, a heartbeat tracker of each season's popularity, and (for the intrigued who'd like to subscribe to their newsletter network) on-demand streaming of all 552 episodes and counting. Coming early next year is an even greater expansion of features, bringing full-series dialogue search, real-time script tracking, and "geolocation" of all scenes throughout Springfield -- something very close to Myles McNutt's vision for a shareable Simpsons clip database (previously).
I, for one, welcome our new Simpsons-quoting overlords. [more inside]
Here's how one small company is slowly, surely beating its way into the most monopolized category in technology: Inside DuckDuckGo, Google's Tiniest, Fiercest Competitor.
The Upshot asked: Where are the hardest places to live in the U.S.? (A bit more on the ranking.) Now, given continuing economic divergence (previously): What do the two Americas search for?
The Supreme Court has unanimously reversed (large PDF) the California Court of Appeals in Riley v. California, deciding that police cannot search the contents of a phone without a warrant during an arrest, and that "the fact that technology now allows an individual to carry such information in his hand does not make the information any less worthy of the protection for which the Founders fought." [more inside]
The depth of the problem - this WaPo infographic hints at the immense challenges that Australian and Chinese search teams will face in recovering the Malaysia Airlines Flight 370 black box from its suspected location at the bottom of the Indian Ocean
"After two decades online, I'm perplexed. It's not that I haven't had a gas of a good time on the Internet. I've met great people and even caught a hacker or two. But today, I'm uneasy about this most trendy and oversold community. Visionaries see a future of telecommuting workers, interactive libraries and multimedia classrooms. They speak of electronic town meetings and virtual communities. Commerce and business will shift from offices and malls to networks and modems. And the freedom of digital networks will make government more democratic. Baloney. Do our computer pundits lack all common sense? The truth [is] no online database will replace your daily newspaper, no CD-ROM can take the place of a competent teacher and no computer network will change the way government works." A view of the Internet's future from February 26, 1995 at 7:00 PM
Two weeks ago, 14-year-old Avonte Oquendo was last seen running out the door of his school in Long Island City, New York. Because Avonte has autism and is non-verbal, he was supposed to have one-on-one supervision at all times. Now, an unprecedented citywide search for the boy that includes searching commuter trains and subways and playing his mother's voice out of emergency response vehicles remains underway. [more inside]
Haberdashboard runs an organized eBay search on quality menswear brands in your size(s), and includes some nice search refinement options.
Wedding Crunchers: An n-gram analysis of wedding announcements in the New York Times going back to 1981. See, for example, the decline in elite prep schools, how well the five boroughs are represented, or the rise (and fall) of hedge fund managers among the newly wed. The site's creator offers a more detailed look over at Rap Genius.
Noah Veltman gives us a comparison of Google Search Suggestions By Country for America, Canada, the UK, Australia, and New Zealand.
Stop and Frisk violated the constitutional rights of New Yorkers, federal judge holds. The ruling comes after the two-month trial in Floyd v. City of New York and finds that the NYPD's tactics and policies in conducting stop-and-frisk systemically violate the 4th and 14th Amendment rights of New Yorkers of color. Stopping short of striking down stop-and-frisk more broadly, already upheld numerous times by the Supreme Court, Judge Scheindlin ordered an independent monitor to oversee reforms to the practice.
Enter some text about your interests or research topic into the Serendip-O-Matic, and get an intriguing array of related images and primary sources from the Digital Public Library of America (DPLA), Europeana, and Flickr Commons. A One Week | One Tool project.
DEC - I mean Digital - I mean Compaq - er, CMGI - no, Overture; rather - Yahoo ... will shut down AltaVista for good next week.
The Memory of the Netherlands is an image library making available the online collections of museums, archives and libraries. The library provides access to images from the collections of more than one hundred institutions and includes photographs, sculptures, paintings, bronzes, pottery, modern art, drawings, stamps, posters and newspaper clippings. In addition there are also video and sound recordings to see and listen to. The Memory of the Netherlands offers an historic overview of images from exceptional collections, organized by subject to provide easy access.
Search 833928 objects from 133 collections from 100 institutions.
With the deeply unpopular shutdown of Google Reader less than two weeks away (previously), plenty of would-be replacements have jumped into the mix, including the newly web-based Feedly, Newsblur, Digg, and possibly even Facebook (a particularly bitter irony, as obsession with defeating Facebook has been the alleged impetus behind CEO Larry Page's abandonment of beloved Google hallmarks like 20% Time, Google Labs, and open platforms like Reader). But while there's no shortage of attempts to replicate Reader's look and feel, there's one little-known aspect that none can match, and that will be lost forever come July 1st: the vast cache archive of every article from every website, living and dead, that has ever been subscribed to in Reader. [more inside]
Search for wildflowers by location, color, flower shape, flower size and time of blooming. 3,126 plants indexed. This web site helps those of us with limited knowledge of botany to identify flowering plants that are found outside of gardens. This help is provided by presenting you with small images of plants. You can use a number of search techniques to get to the images that are most likely the plant you are looking for. When you click on a plant image the program shows you links to plant descriptions and more plant images. The site has about 5 ways of searching for a plant. You can use these searches in any combination. Some searches eliminate some plants from consideration. Most searches give a "score" to each plant depending on how well the plant matches the search criteria. The plants with the highest score are displayed at the top of the results. Click here for Instructions. [more inside]
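The score-and-rank behavior described above can be sketched roughly as follows. This is a guess at the general idea, not the site's actual code: the `Plant` fields, the one-point-per-match scoring rule, and the sample records are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Plant:
    # Hypothetical fields; the real site's data model is not known.
    name: str
    color: str
    petals: int
    bloom_months: frozenset

def score(plant, criteria):
    """Give one point for each supplied criterion the plant matches."""
    points = 0
    if "color" in criteria and plant.color == criteria["color"]:
        points += 1
    if "petals" in criteria and plant.petals == criteria["petals"]:
        points += 1
    if "month" in criteria and criteria["month"] in plant.bloom_months:
        points += 1
    return points

def search(plants, criteria):
    """Return matching plants, highest score first; zero scores are dropped."""
    scored = [(score(p, criteria), p) for p in plants]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored]

# Made-up sample records, not real botanical data.
PLANTS = [
    Plant("Sample A", "yellow", 5, frozenset({5, 6})),
    Plant("Sample B", "yellow", 4, frozenset({8, 9})),
    Plant("Sample C", "blue", 5, frozenset({3, 4})),
]

results = search(PLANTS, {"color": "yellow", "month": 6})
print([p.name for p in results])
```

Combining searches "in any combination" then just means passing more keys in `criteria`; a search that eliminates plants outright would be a filter applied before scoring.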
For years, Google Maps has been the map of our world in a historically unprecedented way. The new Google Maps (announcement) will eschew the uniformity of the old Maps and instead customize the map experience based on a user's behavior. Some are concerned about how this artificial narrowing will affect the way we experience places and relate to our urban spaces. Others believe the customization makes the new maps more honest. Most, however, will probably just want to comment on the huge overhaul to the interface.
I turned around to face an approaching figure. It was Larry Page, naked, save for a pair of eyeglasses. “Welcome to Google Island. I hope my nudity doesn’t bother you. We’re completely committed to openness here. Search history. Health data. Your genetic blueprint. One way to express this is by removing clothes to foster experimentation. It’s something I learned at Burning Man,” he said.
An interactive demonstration of different algorithms for finding the shortest path from one point to another on a uniform grid. [more inside]
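As a taste of what such a demo animates, here is a minimal sketch of breadth-first search, one of the simplest shortest-path algorithms for a grid where every step costs the same. It is not necessarily one of the algorithms in the linked demo, and the grid encoding (0 for open, 1 for wall) and four-way movement are assumptions made for this example.

```python
from collections import deque

def shortest_path(grid, start, goal):
    """Breadth-first search over a grid of 0 (open) and 1 (wall) cells.

    Returns the list of (row, col) cells from start to goal inclusive,
    or None if the goal cannot be reached. Moves are up/down/left/right.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the came_from links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = shortest_path(GRID, (0, 0), (2, 0))
print(path)
```

Because every move costs one step, the first time the search reaches the goal it has found a shortest route; weighted grids need something like Dijkstra's algorithm or A* instead.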
OMGif! [Wired] "On Tuesday, Google announced via Google+ that Image Search now has an “Animated” filter. That means that if you’re only searching for animated magic, you need never be bothered with a still image again. Finally that search for Jennifer Lawrence GIFs from the Academy Awards just got a whole lot easier."
We've all seen it. The off-white UAV is seen side on, nose tilted slightly down, a stubby missile caught at the moment of launch beneath it, a blue and grey landscape of treeless mountains behind it. There's no motion blur and none of the markings on the aircraft have been obfuscated. It's a perfect shot. Except for one or two details. [more inside]
Global Internet Porn Habits: An interactive map that lets you see the most commonly searched porn terms by state or country. No porn images, but obviously porn-related language and the word porn in the URL, so whether it is SFW is up to you.
Wired: DHS Watchdog OKs ‘Suspicionless’ Seizure of Electronic Devices Along Border [Source policy document]. Americans may find it useful to note that the definition of 'border' includes up to 100 miles from the nearest actual international border line.
Facebook today announced their Graph Search during a live event at their headquarters. Some say it is Facebook's attempt at taking down Google and taking over web search (they did partner with Bing), but more astute observers see LinkedIn, Yelp, and OKCupid in their crosshairs too based on the live event demos. [more inside]
Starting in the early 1700s and exploding in popularity throughout the 1800s, Japanese woodblock prints depicted the fantastic world of Kabuki actors, courtesans, warriors, and nature. Ever since then keeping track of all of the incredible artwork has been a pain, traipsing between dealer and museum websites, awkwardly shuffling through academic library 'websites', wandering aimlessly through GIS, not to mention all the trouble a patron had to go through to see these before the Internets. Well, The Japanese Woodblock Print Database aggregates prints from a number of museums, dealers, and auction houses into a single resource, searchable by keyword and by image, and thereby provides a shining example of a web-accessible art database interface. Enjoy! [via mefi projects] [more inside]
For shell grumps and net.curmudgeons and people who think Internet search is just too cluttered with bitmaps, DuckDuckGo (previously) offers TTY search. Sadly, there is no telnet interface; you'll need to use a newfangled web browser.
"To the credit of today's social networks, they've brought in hundreds of millions of new participants [...] but they haven't shown the web itself the respect and care it deserves, as a medium which has enabled them to succeed. And they've now narrowed the possibilities of the web for an entire generation of users who don't realize how much more innovative and meaningful their experience could be." Anil Dash laments The Web We Lost, and offers some suggestions for moving forward.
You want us to pay you for directing eyeballs to your sites? Newspaper publishers in France want a law whereby Google (and other search engine services) have to pay for each click made from the search engine to their sites. You click on a link to a French newspaper site from a search engine, the search engine has to pay the newspaper for that click. If the law is passed it's likely Google will no longer include links to French sites that require payment for said links.
The Hathi Trust, a partnership between 66 universities and 3 higher education consortia, is breathing a little easier now that Judge Harold Baer, Jr. of New York's Southern District has found that the Trust was within its fair use rights to allow Google to scan member library holdings and then make the resulting files available for the reading impaired and for use in search indexing and data mining. While this is excellent news for the educational institutions involved, it doesn't completely exonerate Google's role in the scanning project. It's notable that just last week Google abandoned its own fair use claim in settling a different case involving the same book scanning project. Of the four factors used when considering fair use cases, Judge Baer ruled on the side of the Hathi Trust on all four.
The iEconomy: Apple and Technology Manufacturing. Since January, the New York Times has been running a series of articles "examining the challenges posed by increasingly globalized high-tech industries," with a focus on Apple's business practices. The seventh article in the series was published today: In Technology Wars, Using the Patent as a Sword. Related: For Software, Cracks in the Patent System and Fighters in the Patent War. [more inside]
Shrinp.com is a site that does very little and does it well. Stick anything after the domain name (shrinp.com/shrimp! shrinp.com/puggle! shrinp.com/metafilter!) and you'll get a helpfully labeled image of maybe that thing, or maybe not so much that thing, who can tell? The internet, it's very mysterious. Built by our very own 31d1. Approximately as NSFW as you try to make it.