Data: first and last names from the US Census
I’ve found myself in need of a name distribution for a few projects recently, so I thought I would post it here so I won’t have to go looking for it again. The data is available from the US Census...
Conference: Web2 Expo SF
I gave a talk called A Data-driven Look at the Realtime Web Ecosystem at the Web2 Expo SF conference in May. I attempted to highlight some of the interesting facets of the bit.ly data...
Web 2.0 Summit: The Secrets of our Data Subconscious
I just got home from the Web 2.0 Summit, a three-day conference that was packed with announcements, interesting ideas, and good conversations. My short talk, The Secrets of our Data Subconscious,...
Hey Yahoo, You’re Optimizing the Wrong Thing
I was visiting my grandparents yesterday, and my grandfather asked for help e-mailing an article to some of his friends. I asked him to show me how he normally writes an e-mail, and taught him the...
Bitly Social Data APIs
We just released a bunch of social data analysis APIs over at bitly. I’m really excited about this, as it’s offering developers the power to use social data in a way that hasn’t been available before....
Need Data? Start Here
Data scientists need data, and good data is hard to find. I put together this bitly bundle of research quality data sets to collect as many useful data sets as possible in one place. The list includes...
Startups: How to Share Data with Academics
This post assumes that you want to share data. If you’re not convinced, don’t worry — that’s next on my list. You and your academic colleagues will benefit from having at least a quick chat about the...
Data Engineering
Data engineering is the work of building a system whose architecture depends on the characteristics of the data flowing through it. It requires a different kind of engineering process than typical...
What Mugshots Mean For Public Data
The New York Times has a story this morning on the growing use of mugshot data for, essentially, extortion. These sites scrape mugshots off of public records databases, use SEO techniques to rank...
Play with your food!
I spent a few minutes this week putting together a quick script to pull data from the Locu API. Locu has done the hard work of gathering and parsing menus from around the US and has a lot of...