Why Is Good Business Information So Scarce?

I recently came across this O’Reilly post about CrunchBase, the open database of information on startup companies, asking whether CrunchBase will remain free in the long term.  While the post itself is interesting, the part that puzzles me is why there is so little business-oriented information freely available out there.  Data as a service has been generating a lot of excitement recently, and I think it’s well warranted.  However, the only prominent sources of open business information are the SEC’s EDGAR database, LinkedIn, and CrunchBase.  After that, the field gets very thin.  Considering how much effort companies put into business intelligence and competitive intelligence, it seems like there should be a great profit motive for someone to provide a deeper business information layer.  So I don’t really understand why we’re in this situation, but it does seem like a big opportunity.

These days, most companies are still creating expensive in-house data sets for purposes like market analysis and competitive intelligence. Data providers like Thomson Reuters and Capital IQ do the same thing and charge through the nose for it.  As a result, big companies end up paying a lot for information that is often not very impressive. Meanwhile, the rest of the market makes do with sources like Google that just aren’t intended for business research.

The open source analogy

The current landscape reminds me of the days when every big company did custom development for its own IT applications, even though each company’s payroll or data processing system did basically the same thing. That’s good for consultants and developers, but it makes process improvement very expensive for corporations. Similar walled gardens still dominate the business information market.  We need to move to a mindset closer to the open source world. In the last ten years, open source software has become a very competitive option compared to commercial software.  While the core movement is still driven by developers writing code based on their personal interests, an important catalyst for the movement has been the rise of companies that add a layer of support and commercialization on top of the core technology. Open source is now a lively market, not just a movement. What’s interesting is that users of open source, from Google and Facebook to NASA, probably contribute more to it now than the companies that sprang up to commercialize it, like Red Hat.  The crowd is contributing to tools that would often be prohibitively time-consuming to build internally or too expensive to license from a vendor.  All the users benefit as a result.

The Wikipedia of business?

Wikipedia and other crowdsourced sites like Wikinvest (a wiki for public company information for investors) are a bit analogous to the open source movement. People make contributions for fun or to burnish their reputations, not for monetary incentives. The results have been amazing – the English-language Wikipedia alone has over 3 million articles. But there are also limits to the phenomenon. Growth in articles is estimated to have peaked in 2006, and you can tell that articles of marginal interest to the community sometimes languish or are even deleted. And as the base of articles becomes more static, it will be interesting to see whether the relatively mundane task of updating all those articles still motivates people.

Wikinvest is another amazing tool, but also one with limits. Interesting companies are updated frequently, but articles on, say, pulp and paper companies tend to be out of date. (As an aside, I get the impression that contributors to Wikinvest are overwhelmingly undergraduate business majors. I wonder what the incentive is. Do corporate recruiters look at the site?)

So there are limits to the open source model for data, particularly commercial data like company information. Somehow writing up a stock analysis doesn’t have the same craft appeal as working on a piece of software.

The commercial data layer

But I do think there’s potential for a real commercial data layer, crowdsourced but with profit and cost savings motives as a driver. There are myriad players who need information on topics like companies and industries (or who want to sell it), and right now they’re all creating expensive, in-house data sets.  Moreover, much of the data is inaccurate or rapidly goes out of date because maintaining that volume of information requires massive scale. It’s a lot like the software world before first off-the-shelf and then open source software became prevalent.  There are huge economies of scale to be realized with a more consolidated data layer.

Facebook owns the social graph. Foursquare or Yelp will likely do the same with local business information (although Groupon could be well positioned to horn in on it). Right now, no one has claimed pole position in creating this data layer for commercial purposes. There are lots of aspirants to that position: Thomson Reuters, Capital IQ, Bloomberg, and on and on. I just don’t think any of them are taking the right approach. We need a platform, not a silo.

Once commodity information on companies becomes more widely available, it will be interesting to see what kinds of applications people can build on that information.  Tools like Google Finance, which is quite amazing for a free service, are just the tip of the iceberg. Imagine if you never had to do another bubble chart yourself again! Freeing people from basic research will make it much easier for them to do deep, thoughtful business analysis instead of spending their time trawling for information online. Basic business information needs to become an open commodity.

  • http://www.mymailmarket.be Geert Roete

    It’s likely that the different silos charge money for distributing data to clients. How would you translate this to creating one platform that gets data from a myriad of services? I like your idea though, good thinking!

  • http://www.brekiri.com/ Greg4

    We’re working on a product that will provide a more integrated view, and it should be out in beta within a couple of months. I think the business information market is still a rather inefficient one, surprisingly. Providers are making tons of money on commodity information, so I think the market is ripe to be disrupted. Clients will benefit, but competition will definitely get a bit steeper.

