Network Computing is part of the Informa Tech Division of Informa PLC

Dirty Data

I'm currently working on a business intelligence review, and the first test ran into issues as soon as it started trying to make sense of the data in NWC Inc.
Storing dates as strings is bad, and so is allowing just any old data into the database. BI tools like nice clean data to work with, and my dirty data is driving the tools crazy.

So I spent the morning cleaning up the data, including adding a new column to handle real dates. Of course, then it was necessary to change the widget ordering code to make certain the new dates were entered into the database. And there's always that moment of "should I really hit enter on this SQL statement?" Nerve-racking, to say the least!
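For anyone facing the same cleanup, here is a minimal sketch of the idea in Python with an in-memory SQLite database. Everything specific in it is an assumption for illustration: the `orders` table, the `order_date_text` and `order_date` columns, and the MM/DD/YYYY string format are hypothetical, not the actual NWC Inc. schema. The backfill runs inside a `with conn:` block, so if anything raises partway through, the updates are rolled back instead of leaving the table half converted.

```python
import sqlite3
from datetime import datetime


def parse_date(text):
    """Return an ISO date string, or None if text is not a valid MM/DD/YYYY date."""
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date().isoformat()
    except (ValueError, AttributeError):
        # ValueError: wrong format or impossible date; AttributeError: text is None
        return None


def add_real_dates(conn):
    """Add an ISO date column, backfill it, and return ids of rows that failed to parse."""
    # Hypothetical schema: orders(id, order_date_text, amount)
    conn.execute("ALTER TABLE orders ADD COLUMN order_date TEXT")
    bad_ids = []
    # The with-block commits on success and rolls back if an exception is raised.
    with conn:
        rows = conn.execute("SELECT id, order_date_text FROM orders").fetchall()
        for row_id, text in rows:
            iso = parse_date(text)
            if iso is None:
                bad_ids.append(row_id)
            else:
                conn.execute(
                    "UPDATE orders SET order_date = ? WHERE id = ?", (iso, row_id)
                )
    return bad_ids


# Small made-up data set to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date_text TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date_text, amount) VALUES (?, ?)",
    [("03/15/2002", 19.95), ("not a date", 5.00), ("12/01/2003", 42.50)],
)
conn.commit()

bad = add_real_dates(conn)
print(bad)  # ids of rows that still need manual cleanup
```

Rows that fail to parse keep a NULL in the new column and come back in the returned list, which gives you something concrete to review before pointing a BI tool at the table.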

Then I added some indexes to some of the commonly used fields for reporting, because churning through more than half a million records can be a pain when you're pulling it across the network and trying to load it into Internet Explorer. I think it's helping... 'cause I'm finally able to pull data that makes some amount of sense:

YEAR    TOTAL REVENUE
1998     4,838,816.55
1999     4,861,384.30
2000     4,783,757.00
2001     4,813,386.75
2002         5,969.20
2003    47,345,416.00
2004    29,382,997.00
2005    19,190,789.25

Wow. Wonder what happened in 2002 that we only made $5,969? This year's looking good though, isn't it? Too bad it isn't real money; I could buy this, fix it up, and still have money left for other things, like gaming books.
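The indexing and the yearly roll-up can be sketched the same way. This is an illustration under assumptions, not the real reporting code: the `orders` table, its `order_date` (ISO text) and `amount` columns, the index name, and the sample rows are all made up. The query also returns a row count per year, which is one quick way to tell whether a suspiciously low year like 2002 comes from missing rows or from tiny amounts.

```python
import sqlite3

# Hypothetical schema and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [
        ("2001-02-10", 100.0),
        ("2001-11-03", 250.5),
        ("2002-06-21", 5.25),
        ("2003-01-15", 900.0),
    ],
)
conn.commit()

# Index the column that reports filter on most often.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_orders_order_date ON orders (order_date)"
)


def revenue_by_year(conn):
    """Return (year, order_count, total_revenue) tuples, one per year."""
    return conn.execute(
        "SELECT substr(order_date, 1, 4) AS year, COUNT(*), SUM(amount) "
        "FROM orders "
        "GROUP BY substr(order_date, 1, 4) "
        "ORDER BY year"
    ).fetchall()


for year, count, total in revenue_by_year(conn):
    print(f"{year}  {count:>6}  {total:>14,.2f}")
```

An index on the date column mainly helps queries that filter on a date range. Whether it speeds up the roll-up itself depends on the database engine, so it is worth checking the query plan rather than assuming.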
