DataSift moved its headquarters from the United Kingdom to San Francisco earlier this month and hired a new CEO, Rob Bailey. Founder Nick Halstead, who also founded TweetMeme, continues to serve as chief technology officer of the firm, which operates TweetMeme as a subsidiary. Where TweetMeme is a consumer website, DataSift describes itself as a platform-as-a-service company--that is, a cloud computing service for building applications.
DataSift has partnered with Lexalytics for sentiment analysis and Klout for influence scores. Meanwhile, the all-important partnership with Twitter was cemented when TweetMeme assisted Twitter with the creation of a Tweet button, modeled on the TweetMeme retweet button.
Because TweetMeme was one of the few companies with access to the full Twitter firehose--the realtime stream of all posts to the service--Halstead said his technology team gained unparalleled experience digesting and processing that feed and categorizing the posts. "We had a lot of companies coming to us, saying, 'We see you doing this extraction--can you do a stream of this data for us?'" he said. That's essentially what DataSift is set up to do: it is a social data stream query service that also pulls in feeds from Facebook, LinkedIn, and other services, and it enriches the stream with demographic, geographic, sentiment, and influence data.
You can get a sense of what the service can do by signing up for an account on the DataSift website and learning the basics of the service's scripting language, called the curated stream definition language (CSDL), which is used to query and categorize posts. A Web-based dashboard allows you to test your queries for free, although to capture or export the data you need to be a paying customer.
For example, to see posts mentioning President Obama from relatively influential sources, you could use this:
interaction.content contains "obama" AND klout.score > 50
This gives you a filter on the live Twitter feed, so you need to identify a topic people are talking about at the moment you run your query. Halstead said DataSift has also been collecting historical data and is working on a service to allow searching of past tweets and posts.
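To make the logic of the example query concrete, here is a minimal Python sketch that applies the same two conditions to a handful of in-memory posts. It is an illustration only, not DataSift's API or implementation: the nested dictionary layout, the function names `matches` and `filter_stream`, and the sample posts are all hypothetical, and the keyword test is a plain case-insensitive substring check, which may not match how CSDL's contains operator actually tokenizes text.

```python
from typing import Iterable, Iterator

# Hypothetical sample posts. The nested keys mirror the dotted names in the
# CSDL query (interaction.content, klout.score); the real payload may differ.
SAMPLE_POSTS = [
    {"id": 1, "interaction": {"content": "Obama speaks tonight"}, "klout": {"score": 72}},
    {"id": 2, "interaction": {"content": "Obama rally photos"}, "klout": {"score": 31}},
    {"id": 3, "interaction": {"content": "Lunch was great"}, "klout": {"score": 88}},
    {"id": 4, "interaction": {"content": "obama on jobs"}},  # no influence score attached
]


def matches(post: dict, keyword: str = "obama", min_score: float = 50) -> bool:
    """Return True when the post mentions the keyword AND its score exceeds min_score."""
    content = post.get("interaction", {}).get("content", "")
    score = post.get("klout", {}).get("score")
    # Assumption: case-insensitive substring matching stands in for "contains".
    mentions_keyword = keyword.lower() in content.lower()
    # A post with no score cannot satisfy "klout.score > 50".
    influential = score is not None and score > min_score
    return mentions_keyword and influential


def filter_stream(posts: Iterable[dict]) -> Iterator[dict]:
    """Lazily yield matching posts, the way a filter would act on a live feed."""
    for post in posts:
        if matches(post):
            yield post


if __name__ == "__main__":
    for post in filter_stream(SAMPLE_POSTS):
        print(post["id"], post["interaction"]["content"])
```

Because `filter_stream` is a generator, it could be fed any iterable of posts, including one that never ends, which is closer to how a filter on a live feed behaves than a one-off search over stored data.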
Halstead said he expects most customers to interact with the service through its application programming interface. "The website is really there to demonstrate what you can get, so you'll know if you want to buy data from us," he said.
Some enterprises may build applications that access DataSift directly, but many others will purchase applications that rely on DataSift as a backend analytic engine, he said. For example, unstructured data analytics specialist Endeca, which was recently purchased by Oracle, is a DataSift application partner.
CEO Bailey said DataSift fills an important niche because "we solve one of the biggest problems facing enterprise, which is how to make money using social data in the enterprise."