Security Notes

This section covers a few additional security aspects of the Katalogue application.

Read through the Architecture section first to get an understanding of the topology.

Personally Identifiable Information (PII) in Katalogue is mainly related to users and user accounts. This data is deleted when the user is deleted from Katalogue, with a few exceptions:

  • Externally provisioned users, e.g. from Microsoft Entra ID, that have been deleted at the source are deleted in Katalogue when a sync task runs. However, if such a user is assigned to assets in Katalogue, e.g. as the owner of an asset, the user is only disabled until all of those assets have been transferred to another user.
  • Local users must be deleted manually in Katalogue.
  • The changelog table is a complete log of all data changes ever made in Katalogue. Records of deleted users remain in this table and must be removed from it manually.
  • The username of the user who last changed an asset is stored in the modified_by_user_username column in all asset tables. With the default attribute mapping to Microsoft Entra ID, the username is the email address. This value is not deleted when the user is deleted.
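To check where a deleted user's username still appears, one option is a read-only audit of the modified_by_user_username column. The sketch below only builds the queries and does not connect to a database. The asset table names are hypothetical placeholders, and the `%s` placeholder assumes a psycopg-style PostgreSQL driver, which is an assumption about the repository database rather than something stated here.

```python
import re

# Conservative pattern for plain SQL identifiers; anything else is rejected.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_ident(name):
    """Validate a SQL identifier and wrap it in double quotes."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


def build_username_audit_queries(tables, username,
                                 column="modified_by_user_username"):
    """Return (sql, params) pairs that count rows still referencing `username`."""
    queries = []
    for schema, table in tables:
        sql = (
            f"SELECT count(*) FROM {quote_ident(schema)}.{quote_ident(table)} "
            f"WHERE {quote_ident(column)} = %s"
        )
        # The username is passed as a parameter, never interpolated into the SQL.
        queries.append((sql, (username,)))
    return queries


# Hypothetical asset tables; replace with the real ones in your repository database.
ASSET_TABLES = [("public", "dataset"), ("public", "field")]

for sql, params in build_username_audit_queries(ASSET_TABLES, "jane.doe@example.com"):
    print(sql, params)
```

Each pair can be passed to a database cursor's `execute` method. A non-zero count means the username is still stored in that table.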

These are the main tables in the Katalogue repository database that store data that might be defined as PII in your organization:

| Table | Description | PII data stored |
|---|---|---|
| public.user | Main user table with user information | source id, name, username, email, photo, title, department |
| public.changelog | History table with snapshots of all changes to the user table | source id, name, username, email, photo, title, department |
| stage.raw_user | Stage table to handle user syncing. This table is truncated at the beginning of the sync task, but data is kept until the next sync job starts for debugging purposes. | source id, name, username, email, photo, title, department |
| stage.raw_user_group_member | Stage table to handle user group membership syncing. This table is truncated at the beginning of the sync task, but data is kept until the next sync job starts for debugging purposes. | user source id (or attribute that is mapped to the Katalogue user source id) from Microsoft Entra Id. |
| public.http_request | Log table to log all http requests to the backend api. Logging to this table is disabled by default. | username, IP address |
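When handling a data subject request, it can help to keep this inventory in machine-readable form. The following is a minimal sketch that restates the table above as a Python mapping. The names `PII_TABLES` and `tables_storing` are illustrative and not part of Katalogue, and the field labels are the descriptive names from the table, not confirmed column names.

```python
# Inventory of PII fields per table, transcribed from the table above.
# Field labels are descriptive names, not necessarily the actual column names.
USER_FIELDS = {"source id", "name", "username", "email", "photo", "title", "department"}

PII_TABLES = {
    "public.user": set(USER_FIELDS),
    "public.changelog": set(USER_FIELDS),
    "stage.raw_user": set(USER_FIELDS),
    "stage.raw_user_group_member": {"source id"},
    "public.http_request": {"username", "IP address"},
}


def tables_storing(field):
    """Return the sorted names of all tables that may store the given PII field."""
    return sorted(table for table, fields in PII_TABLES.items() if field in fields)


print(tables_storing("email"))
# → ['public.changelog', 'public.user', 'stage.raw_user']
```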

In addition, logs may contain PII. Katalogue rotates its file logs regularly (every 14 days by default), but logs captured by external logging tools, or console logs stored in the container service, might retain PII for much longer.
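For log copies kept outside Katalogue, a periodic check can flag files that have outlived the intended retention window. The sketch below assumes the external logs are plain files on disk. The directory path is hypothetical, and the 14-day value simply mirrors the default rotation interval mentioned above.

```python
import time
from pathlib import Path

RETENTION_DAYS = 14  # mirrors the default file log rotation interval


def files_older_than(directory, days=RETENTION_DAYS, now=None):
    """Return files under `directory` last modified more than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds per day
    return sorted(
        path
        for path in Path(directory).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    log_dir = Path("/var/log/katalogue-export")  # hypothetical location
    if log_dir.is_dir():
        for path in files_older_than(log_dir):
            print(path)
```

The function only reports candidates. Whether to delete or archive them is a matter for your retention policy.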

Finally, Katalogue may also store PII if such data is present in metadata attributes synced with Katalogue, such as table descriptions, or if someone enters PII in descriptions in Katalogue. Detecting and handling such data is not considered part of the application and should instead be controlled by policy and processes.