29 March 2010

Thoughts on Metaweb business strategy

Metaweb hasn't announced its new strategy yet, but supposedly will soon, so I'm writing down my suggestions in advance so that we can compare and contrast when it appears. To be clear, this is not based on insider knowledge of any kind and does not represent the views of Metaweb Technologies Inc.

The Metaweb (or Freebase) business strategy has always been a bit of an enigma. They said they were building "The World's Database" and would charge for something later, although it hasn't been clear what.

So what would I do? Here are some thoughts (on how to develop the strategy, rather than the strategy itself):
  • Hire (or promote) a Director of Product Management - Not because that's what I do, but because, while they've had good product management in individual areas like their custom app dev environment, they've been hugely stovepiped and don't appear to have an overall product strategy. The product strategy is clearly going to be driven by the executive team and board in a startup, but someone has to be in charge of focusing the discussion in a way that will produce a concrete and implementable strategy, implementing that strategy, and then revising it based on real world customer feedback.
  • Focus - They've done everything from their own database engine and query language (arguably a competitive differentiator), to their own bulletin board system (definitely not!) to a complete development environment with its own version control. A startup can't afford the same expansive vertical integration strategy that an IBM or HP pursues.
    Focus is key. They need to focus only on those things which are absolutely critical to success and survival. The generous initial funding ($57M to date, with a $42M tranche two years ago) may have actually been a curse in this regard.
  • Holistic view - Metaweb appears to consider their various software components, their data integration efforts, the resulting data, their volunteer community, and their (potential) commercial customers as independent things which can be optimized separately, when in fact they're all inextricably linked to each other to one degree or another. It doesn't matter how pretty the widgets are if, when I link to Boston from my family-oriented site, the default page shows it as the filming location for the porno flick Slave Workshop Boston.
  • Customer Engagement - The only place to tell whether you're winning, losing, or standing still is in the marketplace. More customer involvement is critical, both to refine product & service requirements and to generate design wins that can be used for marketing.
  • Developer Ecosystem - A vibrant developer community is critical to success. Building this means not only providing the right libraries and tools, but recruiting the developers, training them, and making them successful. This doesn't mean huge corporate machinery is required, but it needs to be a dedicated, ongoing goal for someone. If you look at successful developer programs, non-code assets and processes are at least as critical as the raw developer tools. The business side can't be ignored either.
  • Evangelism - Most or all of the marketing staff was apparently let go in late 2008/early 2009 and marketing seems to have been an occasional, part-time effort of people with other jobs since then. That doesn't work. Metaweb is, at its core, an engineering company and most engineers have a severe allergy to marketing, but, having done a lot of both marketing and engineering, I know each is critical. They have a technical product set with new concepts in an emerging market, so it's going to be a very technical sell, but it's still marketing. Someone needs to have it as their real job (and get measured on it).
    • Standards strategy - Metaweb has never said anything about what their standards strategy is or how they see their technologies relating to those of the W3C. There's certainly a lot to dislike about some of the W3C choices, but an ugly standard is still a standard. Metaweb did implement RDF publishing support last year, but they need to say more about their long-term strategy.
    • W3C/Semantic web community - Perhaps the W3C is just naturally opposed to any type of commercialism, but establishing a better relationship would be useful to both parties. Having someone of Tim Berners-Lee's visibility diss you at a venue as prominent as TED 2009, where he completely glossed over Freebase's role as one of the largest publishers of linked data, isn't good.
    • Open Source - The company has a number of open source projects, but doesn't talk much about its open source strategy. At the very least, it should claim credit for the things it does and have an easily accessible list of open source projects it contributes to.
  • Brand - They've finally realized just how misguided the choice of Freebase was (it's the only Google Alert where I need to add -c*caine to the search terms) and appear to be backing away from that brand name, as well as its associated garish orange livery and flag waving rhino logo. While there's a good case for using a single brand for both a startup and its products, I'm not sure Metaweb is the right brand since it has generic meanings and usages as well. I'd investigate establishing a new brand for the product family.
  • Human/machine synergy - I put this last, because it's not a short-term thing, but it represents huge potential for the future, in my opinion. It's an area that Metaweb is uniquely positioned to exploit, which makes it all the more frustrating that they haven't made more progress on this front. The synergy between machine-based data reconciliation processes and crowd-sourced processes could create a virtuous feedback loop where machines do the drudge work and humans decide the edge cases, in the process providing training data to refine the classifiers and info extraction algorithms. They've only taken the smallest baby steps so far, but I believe this area has huge potential for those who learn to exploit this synergy effectively.
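
To make the human/machine feedback loop a little more concrete, here is a minimal sketch of the idea in Python. It is not how Metaweb does (or would do) reconciliation; the function names, the thresholds, and the use of plain string similarity as a stand-in "classifier" are all my own assumptions for illustration. The machine auto-decides the easy pairs, queues the uncertain ones for a person, and then uses the human verdicts to loosen its own acceptance threshold.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in for a real classifier)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage(pairs, accept_at=0.9, reject_at=0.5):
    """Split candidate name pairs into auto-match, auto-reject, and human-review queues.

    accept_at and reject_at are hypothetical thresholds chosen for illustration.
    """
    auto_match, auto_reject, needs_human = [], [], []
    for a, b in pairs:
        score = similarity(a, b)
        if score >= accept_at:
            auto_match.append((a, b, score))      # machine does the drudge work
        elif score < reject_at:
            auto_reject.append((a, b, score))
        else:
            needs_human.append((a, b, score))     # edge case: ask a person
    return auto_match, auto_reject, needs_human


def refine_accept_threshold(labeled, current):
    """Lower the accept threshold using human verdicts as training data.

    labeled is a list of (score, is_match) tuples from human review.
    Walking down from the highest score, the threshold drops to the lowest
    score reached before the first pair a human marked as a non-match.
    """
    new_threshold = current
    for score, is_match in sorted(labeled, reverse=True):
        if not is_match:
            break
        new_threshold = min(new_threshold, score)
    return new_threshold


if __name__ == "__main__":
    candidates = [("Boston", "boston"), ("Boston, MA", "Boston"), ("Boston", "Zurich")]
    matched, rejected, queue = triage(candidates)
    print("auto-matched:", matched)
    print("auto-rejected:", rejected)
    print("for human review:", queue)
```

A real system would retrain a proper classifier rather than nudge a single threshold, but the shape of the loop is the same: every human decision on an edge case shrinks the set of cases that need a human next time.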

28 March 2010

Freebase Gridworks data curation and cleanup tool

I've been alpha testing the Freebase Gridworks tool from Metaweb, but haven't been able to talk about it until now. Since they just announced it, I guess it's no longer a secret.

Research scientist David Huynh has been interested in collective data operations since his days at the MIT CSAIL Simile project. You can see collective editing in this 2007 Potluck screencast. Jon Udell called this "stunning." After David moved to Metaweb, his 2008 Parallax demo showed the power of collective operations for browsing Freebase data (and UCG's DERI group forked a SPARQL version called SParallax).

The Gridworks tool is another riff on that same collective operations theme, but this time focused on data cleanup and reconciliation rather than mashups or browsing. There's a lot more to it than what you see in the screencasts (and, naturally, some limitations which are glossed over as well), but while it's still in testing I'll reserve any detailed discussion of features. Suffice it to say though, that the anticipatory buzz in the Twitter-sphere is justified. What remains to be seen is how well they'll follow through on completing the tool, as well as integrating it with the various types of data sources & sinks which are of interest to users.

From a selfish point of view, I'd like to see people use tools like this to contribute to the availability of cleaned up public data sets rather than just using it to clean their private data silos. Of course, convincing people to do that is a much bigger problem -- one which the whole Linked Data / Semantic Web community has yet to come up with a compelling answer for.