Looking forward to the day when my data might be able to cuddle with @pandaproject and provide added value to the user
Published 2011-09-19
UPDATE: The questions I need to answer in order to move forward are:
- Is there a method to serve data and geo streams of JSON, XML or JS for tablets and mobile devices?
- Is there a method more efficient than Caspio, which is cumbersome when creating more than a simple searchable database, and is without a cost-effective API to access the data we upload?
- Is there a method to ensure that we won’t run into third-party API restraints, or in the case of Google spreadsheets and Fusion Tables, potential costs or deprecation?
- Is there a method that allows uploaded data to be associated with other data, and updated by staff in an efficient manner?
- Is there a method that allows for the efficient creation of legacy print products?
It seems as though the beta release of the PANDA Project in the coming months may provide a clue to the answers.
Upon joining madison.com on Jan. 31, I had a couple open-ended priorities handed to me. To big-picture me, there could be nothing more exciting. To detail-oriented me, there could be nothing more dangerous.
Because when you are asked to package local datasets with an eye toward creating products, and develop systems and workflow to treat news and advertising as data instead of unstructured text, well, that mission hits all of my buttons: take an underutilized area, develop a plan, process and system, and do the work.
Nearly eight months later, I'm realizing something important. There is so much data.
Data in Word documents, and in spreadsheets -- Excel, Google, OpenOffice, Lotus 1-2-3? -- and InDesign pages. Data having to do with local non-profits and golf courses. Restaurant listings and dining reviews and things to do. Data is stored on hard drives, and flash drives and CD-ROMs and shared drives and MySQL databases and Caspio tables.
All that said, the group of us who have been working together on these initiatives have had success.
Just two months in, I helped the sports department produce its annual Wisconsin Golf Course Directory with an eye toward a central database that can supply a print product, as opposed to the other way around.
And that test case and proof of concept allowed us to use the same mechanism for the annual Answer Book publication.
I feel the publication of the Badgers All-Time Football Scores Database is also a win.
But I look up and I find the Book of Business and the Dining Guide coming down the tracks. And there's a holiday guide, and news applications… Suddenly, like the assignment I was given, the possibilities are endless.
And while I was able to standardize what had been 15 separate databases into one that would power the Answer Book, that is not possible for each of these publications.
No, to pull this off would mean one-off data management tools, one-offs that require separate admins depending on whether a user wants to submit or update the information. Then there is the user side, which requires another view. … Multiplied by 15 to 20 projects…
Well, what would I like? Well, I'm glad you asked.
I'll take a central hub for all the listings and directory information, as well as politicians and public officials, so we can get that data out of docs and Excel files, etc. And perhaps certain data is associated with a publication, and then we lay an interface on top of that so I can pull the data I need for a publication… Maybe even choose table or digest format?
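To make that wish a little more concrete, here is a rough Python sketch of what I have in mind. It is not PANDA and not anything we run today; the names `Listing`, `Hub`, `as_table` and `as_digest` are made up for illustration, and the sample records are placeholders. It only assumes that each listing can be tagged with the publications it belongs to, and that a publication can then pull its listings in either table or digest form.

```python
from dataclasses import dataclass, field


@dataclass
class Listing:
    """One record in the hypothetical central hub."""
    name: str
    category: str
    details: dict = field(default_factory=dict)
    publications: set = field(default_factory=set)


class Hub:
    """Hypothetical central store; a real one would sit on a database."""

    def __init__(self):
        self.listings = []

    def add(self, listing):
        self.listings.append(listing)

    def for_publication(self, publication):
        # Every listing tagged with this publication, sorted by name.
        return sorted(
            (l for l in self.listings if publication in l.publications),
            key=lambda l: l.name,
        )

    def as_table(self, publication, columns):
        # Tab-separated rows: a header line, then one line per listing.
        rows = ["\t".join(["name"] + columns)]
        for l in self.for_publication(publication):
            cells = [str(l.details.get(c, "")) for c in columns]
            rows.append("\t".join([l.name] + cells))
        return "\n".join(rows)

    def as_digest(self, publication):
        # One sentence-style line per listing, closer to print copy.
        lines = []
        for l in self.for_publication(publication):
            facts = "; ".join(f"{k}: {v}" for k, v in l.details.items())
            lines.append(f"{l.name} ({l.category}). {facts}")
        return "\n".join(lines)


# Placeholder records, not real listings.
hub = Hub()
hub.add(Listing("Example Golf Club", "golf course",
                {"holes": 18, "city": "Madison"},
                {"Golf Course Directory", "Answer Book"}))
hub.add(Listing("Sample Diner", "restaurant",
                {"city": "Madison"},
                {"Dining Guide"}))

print(hub.as_table("Golf Course Directory", ["holes", "city"]))
print(hub.as_digest("Dining Guide"))
```

The point of the sketch is that a listing is entered once and tagged, so the same record can feed more than one publication without a separate one-off database for each.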
Well, to get there, I see some obvious solutions.
- Keep building one-offs until I stumble upon a hidden feature that solves all my problems.
- Continue to build off the Django poll tutorial I finally finished, and hope I can learn enough to build the master of all databases. I want to do this anyway, though from experience I know I need to build a one-off and not a masterpiece.
- Install OpenBlock on an AWS machine and learn to simply manipulate the database structures.
Short of those solutions, I'm eager and excited to see what the PANDA Project comes up with. A Knight News Challenge winner led by Brian Boyer and Christopher Groskopf, PANDA "wants to be your newsroom data appliance. It provides a place for you to store data, search it, and share it with the rest of your newsroom."
So, now comes the dilemma. Put in the time to learn how to build a system, or make do until what could be a "Pro" solution comes along…? And then there are the deadlines for niche publications...
Tell you what, if you too have found the need for a data solution, I'd recommend filling out the PANDA Project's survey. Maybe then you and I can chat when all our data is structured and in one central location and we are happy managers of localized structured data sets that are adding value and utility for the user.