11 March 2009

Freebase, open government, and enumerations

I'm preparing a short series of articles about Freebase, but Raymond Yee had a question about something I was working on over the weekend, so here's a quick hint to help him along.

What he calls "keys" are called "enumerated properties" in the Freebase documentation, and there's an article on how to set them up. Unfortunately, the schema editor was broken when I was working on the National Register of Historic Places database schema. I had to reverse engineer things from the Explore view (accessible by pressing F8 on any page and scrolling to the bottom) and then modify the schema's property type by hand using MQL, their query language. You can see the end result in the schema, where item_number is typed as an enumeration.

There's also a good article on how to create a URL template. I used one successfully to link to the original application submissions. For the Congressional Bioguide, the same technique can be used to link back to the original biography.
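To make the URL template idea concrete, here is a minimal sketch in Python of what expanding such a template amounts to. The `{key}` placeholder, the `expand` helper, and the exact Bioguide URL pattern are my assumptions for illustration, not something taken from the Freebase documentation.

```python
from urllib.parse import quote

# Assumed URL pattern for a Bioguide biography page; "{key}" is a
# hypothetical placeholder standing in for the enumerated key value.
BIOGUIDE_TEMPLATE = (
    "http://bioguide.congress.gov/scripts/biodisplay.pl?index={key}"
)


def expand(template, key):
    """Substitute a percent-encoded key into a URL template."""
    return template.replace("{key}", quote(key, safe=""))


if __name__ == "__main__":
    # "K000105" is used here purely as an example Bioguide-style ID.
    print(expand(BIOGUIDE_TEMPLATE, "K000105"))
```

The key is percent-encoded before substitution so that an ID containing a space or slash can't break the resulting link.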

Coincidentally, and independently of Raymond's project, I was working on loading all the Congressional Bioguide IDs last weekend because they are used in the XML form of legislation on THOMAS, which is run by the Library of Congress. I took a slight detour to write a little name parser and Freebase name query tool in Python, so I haven't actually gotten around to loading the IDs yet. One of the biggest problems in working with Freebase is reliably resolving personal names. A topic typically has only its main name, which is whatever was used as the Wikipedia article title. There's really no telling what name form the article's editors will have chosen, and even though the full name and some aliases are often given in the opening sentence of the article, Freebase doesn't import this information from Wikipedia.
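To give a flavor of the name problem, here is a rough sketch of the kind of parsing involved. It is not my actual parser, just a minimal illustration: it splits a name in either "Last, First Middle" or "First Middle Last" order and generates a few plausible forms to try against Freebase. The function names and the short suffix list are made up for this example, and it ignores plenty of real-world cases (nicknames, multi-word surnames, titles).

```python
# Generational suffixes recognized by this sketch (deliberately short).
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def _is_suffix(token):
    return token.rstrip(".").lower() in SUFFIXES


def parse_name(raw):
    """Split a personal name into first, middle, last, and suffix."""
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty name")
    suffix = ""
    # Trailing comma-separated suffix, e.g. "King, Martin Luther, Jr."
    if len(parts) > 1 and _is_suffix(parts[-1]):
        suffix = parts.pop()
    if len(parts) >= 2:
        # "Last, First Middle" order
        last = parts[0]
        given = " ".join(parts[1:]).split()
    else:
        # "First Middle Last" order, possibly with a trailing suffix
        tokens = parts[0].split()
        if len(tokens) > 1 and not suffix and _is_suffix(tokens[-1]):
            suffix = tokens.pop()
        last = tokens.pop()
        given = tokens
    first = given[0] if given else ""
    middle = " ".join(given[1:])
    return {"first": first, "middle": middle, "last": last, "suffix": suffix}


def candidate_names(name):
    """Return likely article-title forms of a parsed name, most common first."""
    first, middle, last, suffix = (
        name[k] for k in ("first", "middle", "last", "suffix")
    )
    forms = [
        " ".join(x for x in (first, last) if x),
        " ".join(x for x in (first, middle, last) if x),
    ]
    if middle:
        initials = " ".join(m[0] + "." for m in middle.split())
        forms.append(" ".join((first, initials, last)))
    if suffix:
        forms += [f + " " + suffix for f in forms] + [
            f + ", " + suffix for f in forms
        ]
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(forms))


if __name__ == "__main__":
    for form in candidate_names(parse_name("Kennedy, Edward Moore")):
        print(form)
```

Each candidate form would then be tried as a name lookup, ideally narrowed by something else known about the person (such as a birth date) to weed out false matches.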
