Boris Eetgerink

September 22, 2023

Storing calculation results is denormalization too

Database normalization reduces data redundancy in a (relational) database. When done properly, each fact has only one source of truth. Database denormalization, on the other hand, duplicates data. It can be useful for performance, but it makes data harder to keep consistent, because the same fact now lives in more than one place.

Recently I had to enter the data required to calculate the scores students can get for an exam. This data was based on a formula in a spreadsheet. Simplified, it read something like this:

Score = 10 - (NumOfErrors * 0.6)

Here 0.6 is the interesting bit: it is the penalty per error, so it determines the difficulty of the exam. The higher the number, the more difficult the exam; the lower the number, the easier.

The data I had to enter was rather straightforward: for 0 errors the score is 10, for 1 error the score is 9.4, and so on. This makes the application really simple and fast: when the student is done, just count the number of errors and look up the matching formula record to obtain the score for the exam.
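As a rough sketch of what those stored records amount to, the lookup could look something like the Python below. The names `SCORE_TABLE` and `look_up_score` are made up for illustration, and a plain dictionary stands in for the database table, whose real layout is not shown here. Only the first few rows are listed.

```python
from decimal import Decimal

# Hypothetical stand-in for the stored formula records:
# one row per number of errors, each derived from the same 0.6 value.
SCORE_TABLE = {
    0: Decimal("10"),
    1: Decimal("9.4"),
    2: Decimal("8.8"),
    3: Decimal("8.2"),
}


def look_up_score(num_errors: int) -> Decimal:
    """Return the stored score for the given number of errors."""
    return SCORE_TABLE[num_errors]
```

Note that 0.6 itself appears nowhere in the table, yet every row depends on it.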

In my opinion, each formula record is a denormalization of the 0.6 value. Instead of storing every record, the application could have calculated the result of the formula itself. Then only the 0.6 value would have to be stored in the database, which would have made data entry much easier.
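A minimal sketch of that alternative, assuming the simplified formula above, might look like this in Python. The names `MAX_SCORE`, `PENALTY_PER_ERROR` and `calculate_score` are invented for the example, and `Decimal` is used only to avoid floating-point rounding noise in the scores.

```python
from decimal import Decimal

MAX_SCORE = Decimal("10")
# The single value that would need to be stored per exam (assumed name).
PENALTY_PER_ERROR = Decimal("0.6")


def calculate_score(num_errors: int, penalty: Decimal = PENALTY_PER_ERROR) -> Decimal:
    """Apply Score = 10 - (NumOfErrors * penalty) on the fly."""
    return MAX_SCORE - num_errors * penalty
```

Changing the difficulty of the exam then means changing one number, for example `calculate_score(1, Decimal("0.8"))`, instead of re-entering every record.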
