Learn how to Work Smarter

Tuesday, December 8, 2009

Spreadsheets are not databases.

Normalization -- what is it, and why is it important?

What:
A series of rules -- that's what database normalization boils down to.  I have had a terrible time trying to distill the "why" of database normalization, largely because I get all excited about the "what!"  So, in more concise terms than I can muster, enter the Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization

Why:
Databases start small, often as an Excel spreadsheet.  But -- this is important -- they're designed to get big, because many of us manage datasets with more than 60,000 records -- or will at some point.

How do these rules help big data sets?  Three ways:
  • managing data "types"
  • enforcing rules within a row, or record
  • enforcing rules between rows

A data type is a date, a currency amount, etc.; think of Excel's "format" function and you get the idea.  Once set, the rule persists throughout the column.
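
To make those three kinds of rules concrete, here is a rough sketch using Python's built-in sqlite3 module. The "memberships" table, its columns, and the try_insert helper are all made up for illustration; they are not from any real dataset. SQLite is fairly relaxed about column types on its own, so the sketch spells each rule out as a constraint.

```python
import sqlite3

# Hypothetical pattern for dates stored as YYYY-MM-DD text.
ISO_DATE = "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"

conn = sqlite3.connect(":memory:")
conn.execute(f"""
    CREATE TABLE memberships (
        -- rule between rows: no two records may share an email
        email      TEXT    NOT NULL UNIQUE,
        -- data type rules: they hold for every value in the column
        start_date TEXT    NOT NULL CHECK (start_date GLOB '{ISO_DATE}'),
        end_date   TEXT    NOT NULL CHECK (end_date GLOB '{ISO_DATE}'),
        dues_cents INTEGER NOT NULL CHECK (typeof(dues_cents) = 'integer'),
        -- rule within a row: a membership cannot end before it starts
        CHECK (start_date <= end_date)
    )
""")


def try_insert(row):
    """Return True if the database accepts the record, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO memberships VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


demo_rows = [
    ("a@example.org", "2009-01-01", "2009-12-31", 2500),           # follows every rule
    ("b@example.org", "12/8/2009", "2009-12-31", 2500),            # wrong date format
    ("c@example.org", "2009-01-01", "2009-12-31", "twenty-five"),  # not a number
    ("d@example.org", "2009-12-31", "2009-01-01", 2500),           # ends before it starts
    ("a@example.org", "2010-01-01", "2010-12-31", 2500),           # duplicate email
]
results = [try_insert(row) for row in demo_rows]
print(results)
```

Only the first record should get in. A spreadsheet would happily accept all five, which is the point: the database does the checking so you don't have to.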

Three levels of normalization:
http://www.troubleshooters.com/littstip/ltnorm.html

More depth on our target level of normalization:
http://en.wikipedia.org/wiki/Third_normal_form
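
For a feel of what third normal form buys you, here is a small plain-Python sketch. The order data and the split_to_3nf function are invented for illustration. The flat "spreadsheet" repeats each customer's city on every order, even though the city depends on the customer, not on the order. Splitting it out means each fact is stored exactly once.

```python
# A flat, spreadsheet-style layout: the customer's city is repeated on every order.
flat_rows = [
    {"order_id": 1, "customer": "Ada", "customer_city": "Boston", "amount": 40},
    {"order_id": 2, "customer": "Ada", "customer_city": "Boston", "amount": 15},
    {"order_id": 3, "customer": "Ben", "customer_city": "Denver", "amount": 22},
]


def split_to_3nf(rows):
    """Split flat rows into a customers lookup and an orders list.

    The city depends on the customer, not on the order, so it moves
    to its own table keyed by customer.
    """
    customers = {}
    orders = []
    for row in rows:
        name, city = row["customer"], row["customer_city"]
        # setdefault stores the city the first time a customer is seen;
        # a later, different city means the flat data contradicts itself.
        if customers.setdefault(name, city) != city:
            raise ValueError(f"conflicting cities for customer {name!r}")
        orders.append(
            {"order_id": row["order_id"], "customer": name, "amount": row["amount"]}
        )
    return customers, orders


customers, orders = split_to_3nf(flat_rows)

# One edit now updates the city for every one of Ada's orders.
customers["Ada"] = "Chicago"
for order in orders:
    print(order["order_id"], order["customer"], customers[order["customer"]])
```

In the flat version, moving Ada to Chicago means finding and editing every one of her rows, and missing one leaves the data disagreeing with itself.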

Important translations:
Column => Field
Row => Record
So that's where we'll begin.  I'm going to go hop on a plane in a minute.
