Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data... --EF Codd
I've skipped over discussions of data theory -- but there are a couple of principles of data theory that are important. And, since I'm not a DB guru, correct me as we go. I find errors in my previous projects all the time.
E.F. Codd, a mathematician, introduced normalization in 1970. Normalization has a lot of interesting theoretical underpinnings, but, lest I lose the plot, in our frame of reference it's just a way to organize data to make corruption unlikely. Here are two readable, more recent docs:
http://www.troubleshooters.com/littstip/ltnorm.html (wish I'd followed Litt's "Additional normalization tips" first time 'round! Next entry.)
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
We could go into 3NF, or BCNF (Boyce-Codd normal form, aka 3.5NF; C.J. Date has argued it ought to be called Heath normal form), but ... you, in the back! Wake up...
Data integrity is why we're doing this, so set 3NF as a goal. The only thing a database does FOR you is say "no" a lot -- like a good parent, it sets the rules. And those rules should keep you from reporting an 8260 as an 8260B; from reporting a 2009 water elevation as a depth below an outdated 1993 measuring point elevation; stuff like that. Okay, silly example time.
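First, though, a minimal sketch of a database saying "no", using Python's built-in sqlite3 module. Everything named here is made up for illustration: the methods and results tables, the columns, the MW-1 sample ID, and the helper functions build_db and try_insert. The assumption is that the methods table lists only the analytical methods that were actually run, so a result labeled with anything else gets rejected. It's one of the simpler rules a database can enforce, not a full answer to the mislabeling problem.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical methods/results schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE methods (method_code TEXT PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE results (
            sample_id   TEXT NOT NULL,
            method_code TEXT NOT NULL REFERENCES methods(method_code),
            value       REAL NOT NULL
        )
        """
    )
    # Assumption: 8260 is the only method that was run on this project.
    conn.execute("INSERT INTO methods VALUES ('8260')")
    return conn


def try_insert(conn, sample_id, method_code, value):
    """Return True if the row was accepted, False if the database said no."""
    try:
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?)",
            (sample_id, method_code, value),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when the method code is not in the methods table.
        return False


conn = build_db()
print(try_insert(conn, "MW-1", "8260", 1.2))   # accepted: method is in the lookup table
print(try_insert(conn, "MW-1", "8260B", 1.2))  # rejected: 8260B was never listed
```

The foreign key is doing the parenting here: the results table can't hold a method code that the methods table doesn't know about, no matter who is typing.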