Get Hosed
April 12, 2010 5:16 PM

Yahoo is releasing a new service: Firehose, a real-time, searchable index of social content aggregated from around the web. Accessible via YQL, Yahoo’s SQL-like query language, the Firehose will gather data from status updates, user ratings and reviews, comment threads, Google Buzz, Flickr, Delicious, Twitter, YouTube, Last.fm and a range of other sites and apps. [via]

The firehose will provide a stream of real-time data from Yahoo's index, which will also include Twitter data, as part of a deal the two companies made in February. According to Yahoo, the firehose will provide access to more than 150,000 ratings, 8,000 reviews and 750,000 comments a day.
posted by netbros (34 comments total) 8 users marked this as a favorite
 


Sounds like the biggest thing since Google Wave.
posted by mccarty.tim at 5:22 PM on April 12, 2010 [7 favorites]


I can't imagine trying to filter through that much noise to get anything useful.
posted by Pontius Pilate at 5:25 PM on April 12, 2010


Poor /..
posted by barnacles at 5:28 PM on April 12, 2010


I'd prefer this.
posted by clvrmnky at 5:29 PM on April 12, 2010 [1 favorite]




or relatin' dudes to jazz
posted by pappy at 5:37 PM on April 12, 2010


I can't imagine trying to filter through that much noise to get anything useful.

If only there were machines capable of analyzing large sets of data!
posted by chunking express at 5:41 PM on April 12, 2010 [3 favorites]


If they give it all the attention they gave Fire Eagle (a location service), I give it a week before it's abandoned. They have some good ideas but they're not so good on the follow-through.
posted by tommasz at 5:47 PM on April 12, 2010


I like how when you execute the first test query in the blog post (search for everything that talks about cryptozoology), you get an authentication error because you're executing the query from a console and not an app. I think. The message isn't very clear. In any case, the lack of a functioning console makes it somewhat hostile to dabbling developers, no?
posted by chrominance at 5:58 PM on April 12, 2010


Google indexes 4chan in real time. You can post some unique text in a thread, search google for it, and your thread shows up in the search results immediately. Freaky.
posted by mullingitover at 6:02 PM on April 12, 2010


chrominance: it works if you're logged into a yahoo account (for me anyway, I haven't done anything special to activate my account that I know of).
posted by Skorgu at 6:09 PM on April 12, 2010


And frankly being able to
select * from flickr.photos.search where text="Cat" limit 10
makes all this internet stuff worthwhile. (And get this back.)
posted by Skorgu at 6:16 PM on April 12, 2010 [3 favorites]


mullingitover: Do they? That didn't work for me.
posted by finite at 6:17 PM on April 12, 2010


Interesting. Firehose is the name of Twitter's now mostly discontinued service to download all status updates. Now they have "spritzer" for general use, which gives you 5% of tweets.

They also have a "Gardenhose" option that gives you like 15%. The spritzer was like 300kbps when I tried it, but there is a ton of extra crap in there, for example they give you the complete URLs for the image icon and background image for each user on each tweet.

The ability to get Delicious bookmarks is interesting though. Delicious offers feeds but they don't update very often, certainly not real time.
posted by delmoi at 6:32 PM on April 12, 2010 [1 favorite]


If only there were machines capable of analyzing large sets of data!

Right. Machines are not magic, exactly, though. I'm saying it's going to be pretty impressive if it will actually be possible to get something useful from "150,000 ratings, 8,000 reviews and 750,000 comments a day," because such a huge portion of Twitter, YouTube, etc. comments are just garbage.
posted by Pontius Pilate at 6:39 PM on April 12, 2010


I can't imagine trying to filter through that much noise to get anything useful.

People tell me that about the internet all the time.
posted by krinklyfig at 6:47 PM on April 12, 2010 [2 favorites]


The thing that's tough is the filtering. We already have plenty of content to sort through.
posted by mccarty.tim at 6:50 PM on April 12, 2010


What's to stop anyone from doing this themselves? I'm not sure why Yahoo needs to be involved. I'm not saying anyone can build a website, but anyone with moderate skill and a little time can code a site to draw in this type of information. Ah, I see they have an API for it as well as something they call YQL, which I suppose is like SQL with extra Yahoo. Although I am continually amazed at the ability of aggregated data in social networks to draw so much attention and money, I am once again reminded how Yahoo basically repackages for a living and hasn't been a truly great resource since the days when actual humans reviewed entries into their directory, sadly long dead. In this case, they're repackaging it into a big hose, which they will fire in your general direction while your helpless body flails about, cascading against the walls, a ragdoll to the giant tidal wave of data crashing down upon it.

So, it's free, right?
posted by krinklyfig at 6:59 PM on April 12, 2010 [1 favorite]


finite: It doesn't seem to work if you just throw up a new thread with a hash in it and it goes to page 15 without any comments, but I've seen threads where someone linked to a google search which links back to the very same thread. It may depend on the number of replies.
posted by mullingitover at 7:00 PM on April 12, 2010


achem...
posted by DZack at 7:49 PM on April 12, 2010


Can anyone show me where in the Yahoo APIs I can access the Twitter firehose??? I think this 'firehose' business is misleading.
posted by kuatto at 8:35 PM on April 12, 2010


Man, what a letdown! I was expecting the firehose, instead I get paginated queries??

Am I missing something here?
posted by kuatto at 8:41 PM on April 12, 2010
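
For anyone wondering how paginated queries relate to a "hose": a client can approximate a stream by walking the pages itself. This is only a rough sketch of that idea, not how Yahoo's service works. The names `stream_pages` and `fake_fetch` and the offset/limit paging scheme are assumptions made up for illustration, and the in-memory list stands in for a real query. Note that this is still polling, not a true push stream.

```python
from typing import Callable, Iterator, List


def stream_pages(
    fetch_page: Callable[[int, int], List[dict]], page_size: int = 10
) -> Iterator[dict]:
    """Yield items one at a time by requesting successive pages.

    `fetch_page(offset, limit)` is a hypothetical callable; iteration stops
    when it returns an empty page.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


# Stand-in data source for illustration only; a real client would issue a
# paginated query here instead of slicing a list.
_FAKE_ITEMS = [{"id": i} for i in range(25)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return _FAKE_ITEMS[offset:offset + limit]


items = list(stream_pages(fake_fetch, page_size=10))
print(len(items))
```

The caller just iterates over `stream_pages(...)` and never sees page boundaries, which is about as close to a hose as a paged API gets.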


I'm saying it's going to be pretty impressive if it will actually be possible to get something useful from "150,000 ratings, 8,000 reviews and 750,000 comments a day"...

I would route your Yahoo Firehose through some Yahoo Pipes into useful RSS feeds.
posted by Evilspork at 9:04 PM on April 12, 2010


Ah, I see they have an API for it as well as something they call YQL, which I suppose is like SQL with extra Yahoo.

The best way to describe YQL is SQL that reads data from the internet. Literally. Yahoo! scrapes sites and collects data, then allows you to filter and sort it using SQL-like commands. Say what you will about the rest of it, YQL itself is pretty damn fantastic. Skorgu's comment above is a great illustration of the simplicity and power.
posted by davejay at 9:53 PM on April 12, 2010 [1 favorite]
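
To make the "SQL that reads data from the internet" description concrete, here is a minimal sketch of how Skorgu's query above could be packed into a web request. The endpoint URL and the `q` and `format` parameter names are assumptions based on how the public YQL endpoint was commonly described, and `build_yql_url` is a hypothetical helper. The code only builds the URL and does not contact any service.

```python
from urllib.parse import urlencode

# Assumption: the public YQL endpoint; it may not be reachable.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(query: str, fmt: str = "json") -> str:
    """Return a GET URL carrying the YQL statement in the `q` parameter."""
    return YQL_ENDPOINT + "?" + urlencode({"q": query, "format": fmt})


# The query from Skorgu's comment, unchanged.
url = build_yql_url('select * from flickr.photos.search where text="Cat" limit 10')
print(url)
```

The whole statement travels as one URL-encoded string, so anything that can fetch a URL could, in principle, run a query like this.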


I would route your Yahoo Firehose through some Yahoo Pipes into useful RSS feeds.

Love Yahoo Pipes.
posted by A Terrible Llama at 3:01 AM on April 13, 2010


I want this to grow enough that "I hosed you" can replace "I Googled you" in popular speech.
posted by rokusan at 3:44 AM on April 13, 2010


I've been dicking around with sweetcron for a few weeks now, trying to build my own personal firehose for all the places I do things on the Internet. It is hard work. It'll be nice if this Yahoo offering makes the whole personal firehose building a bit easier for the layperson to do.
posted by sciurus at 6:43 AM on April 13, 2010


Paging Garry Shandling
posted by marienbad at 8:19 AM on April 13, 2010


I'm confused as to how this service is any different from, say, Kikin (except that you don't have to be signed into Yahoo to use it). I found the Kikin app very distracting and uninstalled it after a week.

I get that this is more customizable than Kikin, but other than being branded by Yahoo, what's the fundamental difference here? Sorry if this seems obtuse - I'm hoping someone smarter than me that's looked at both can give me the quick-and-dirty comparison.
posted by Unicorn on the cob at 9:24 AM on April 13, 2010


I'm confused as to how this service is any different from, say, Kikin

Well for one thing, it doesn't play an annoying and loud Flash movie when you visit its home page.

Also, the fact that the main link above is to something called the "Yahoo! Developer Network Blog" should be a hint as to what the intended audience is for this service. Kikin, to me, seems to be a consumer focussed app. This is a data service that Yahoo! are expecting developers to build cool, interesting and more 'consumer obvious' apps and mashups on top of.

The service Yahoo! themselves are building on top of Firehose seems to be very similar to Google Buzz.
posted by robertc at 1:39 PM on April 13, 2010


feh. there is only one true fIREHOSE!!!
posted by Rube R. Nekker at 7:57 PM on April 13, 2010


On the one hand, this is very interesting.

On the other hand, this is Yahoo.
posted by DU at 5:55 AM on April 14, 2010 [1 favorite]


Thanks, robertc! In my mind I just regressed to when Snapfish was new and then Flickr/Kodakgallery/etc. showed up and it was a bloodbath. You just clarified things immensely.
posted by Unicorn on the cob at 10:11 PM on April 14, 2010




This thread has been archived and is closed to new comments