Skip to main content
Blog

Data modelling guidelines

  • 21 January 2019
  • 2 replies
  • 355 views
Data modelling guidelines
Data modeling was originally intended as a tool for database design, but has become a fundamental requirements technique for business analists, as it is relevant for anyone working with data structures, application logic, user interfaces and business processes.



The guidelines in this document aim to ensure that Software Factory developers set up high quality data models. High quality means:




  1. Consistent
  2. Logical
  3. Well structured
  4. Unambiguous

This document has guidelines for the following subjects:


  • Data modeling
  • Diagrams in the Software Factory

Data modeling



To create well structured data models, a distinction is made between:




  • Strong entities
  • Weak entities

An entity is strong when it’s existence does not depend on the existence of any other entity in a database. A weak entity depends on a strong(er) entity for its existence.



In the following section Strong and Weak entities form the basis of most of the Guidelines.



General guidelines


  1. The name must be self-explanatory
  2. Names singular
  3. Names lowercase
  4. No abbreviations

  • Unless platform limits are exceeded

  1. Divide a name into small words (subnames), separated by an underscore.

Good


  • sales_order_line

Not good


  • salesorder_line
  • salesorderline

No meta information in names



Strong Entity


  1. Has 1 primary key column
  2. Does not have foreign key columns in the primary key

Weak Entity


  1. Has more than 1 primary key column
  2. The primary key columns are ordered from strong to weak
  3. The last primary key column is not a foreign key, the other primary key columns are
  4. Has exactly 1 primary column more than his ‘parent’
  5. Has the name of its ‘parent’ plus an addition

One to One Relationship (1 : 0..1)


  1. Target table has his own name. This means that his name is not derived from the ‘parent’ entity
  2. The number of primary key columns is equal to the ‘parent’
  3. All primary columns are also foreign keys

Link Table


  1. Link tables have all primary key columns from both source tables
  2. All primary columns are also foreign keys
  3. Have the name of one table combined with the name of the other table

  • (If another name is more suitable, then this table should probably be a Strong or Weak Entity)

Foreign Key Relation


  1. Each foreign column has its own reference

  • So if a table has 5 foreign key columns, there must also be 5 references
  • A column may be part of multiple references

  1. The database check must be enabled for every reference

  • There are situations where the check is not allowed, for example with a reference to a view

Recursive Relation


  1. Last Foreign key column has a new name

  • The column does not have a derived name

Columns


  1. The name must be self-explanatory
  2. Names lowercase
  3. No meta information in names
  4. Divide a name into small words (subnames). Place between the words an underscore.
  5. No abbreviations

Exceptions:


  • Nr/No
  • Id
With non-key columns no table name in the name



Primary Keys


  1. Name Primary key column is table name + _id
  2. Type of column is preferably an identity with a BIGINT as the data type

  • Only primary key columns that are not foreign keys may be an identity column

Foreign Keys


  1. The referring foreign key column has the same name as the primary key of the parent table

  • If the corresponding reference has an ‘Add’, then this must also be added in front of the column name

Domains


  1. No DTTP in the name
  2. No length in the name
  3. No meta information in names
  4. Primary key columns which are not foreign key columns have the same domain name as the column.

Datatypes


  1. Use DATETIME2, not DATETIME
  2. Use NVARCHAR unless the character set needs to be restricted. Then a VARCHAR can be used.
  3. Identity: use INT or BIGINT
  4. Use NUMERIC for numbers with digits after the decimal point
  5. Don’t use CHAR, FLOAT, NCHAR without a specific reason

Example





Software Factory Diagrams



In this section the Guidelines can help you set up Diagrams and Designs.



Guidelines Diagrams


  1. Name must describe a process

Guidelines Design


  1. Foreign key reference: Place parent more on the left side, and child entities more to the right.
  2. Inheritance: Place parent more to the upper side, child entities more to the bottom.
  3. Strong-Weak entities: Place parent more to the upper side, child entities more to the bottom.

2 replies

Userlevel 4
Badge +11

I would like to spark a discussion regarding the guidelines that describe a weak entity having more than one primary key column. Why does Thinkwise prescribe this guideline? In my opinion, this contradicts the standard guidelines for normalization and primary keys.
 

Weak Entity:

  1. Has more than one primary key column
  2. The primary key columns are ordered from strong to weak
  3. The last primary key column is not a foreign key; the other primary key columns are
  4. Has exactly one primary key column more than its 'parent'
  5. Has the name of its 'parent' plus an addition


In my opinion, a single primary key makes indexing and query optimization simpler and more efficient, which can lead to better performance when executing searches and joins.

 

 

The second select statement JOIN on the single PK is more #low-code

I look forward to your comments.

Hi Dennis,

Thanks for your question; it’s always an interesting topic. The main purpose of using composite keys for weak entities is to introduce structure to your data model. When all tables are given a primary key with just one column, the relationships between tables become much less clear.

This clarity extends to queries as well. Composite keys help make the relationships between joined tables more obvious, allowing you to quickly see if a join is correct and reducing the likelihood of errors.

Furthermore, if you decide to use semantic keys, you’d likely need to use composite keys to enforce uniqueness. This would alter the structure of your data model and require adjustments to existing queries.

While single-column indexes are faster than multi-column indexes, the difference is often negligible. In 99.9% of cases, you won’t notice it. If performance does become an issue, you can always create an additional index.

Regarding normalization: Foreign keys inherently introduce some level of redundancy.

Reply