Security

This section gives an overview and summary of the security aspects and measures taken to secure the Katalogue application.

Read through the Architecture section first to get an understanding of the topology.

Kayenta Consulting AB, the data & analytics company behind Katalogue, takes security seriously. We have designed Katalogue to be secure from the ground up and have made the following security-related design choices:

  • Katalogue only reads and stores metadata from source systems - never actual business data.
  • Katalogue is a self-hosted application - you have full control and ownership of the data, which never leaves your premises unless you want it to, and Katalogue can easily be integrated into existing security processes.
  • Katalogue customers get full access to the source code - so you can review it yourself.

The following measures should be taken to make Katalogue as secure as possible in a production scenario. Coincidentally, most of these actions also reduce the management burden for Katalogue admins.

Security Hardening Checklist:

  1. Deploy Katalogue behind a VPN/firewall and never expose it to the internet.
  2. Ensure that all communication between services uses HTTPS.
  3. Use externally provisioned users that are added through user groups, not local users or individually added external users. This relieves Katalogue from storing user account passwords and automates user role assignments.
    1. Go through the security hardening steps for the app registration in Azure.
    2. Disable local user authentication (Settings -> Authentication and uncheck Enable Local Authentication). This disables the feature to create and use local user accounts in Katalogue.
    3. If local user authentication cannot be disabled, make sure to update the default admin user’s password.
  4. Restrict the accounts used in datasource connections (the accounts that ingest metadata from source systems) to read-only access on the required resources/tables.
  5. Enable password manager integration for datasource connections and store all datasource connection passwords there. This relieves Katalogue from storing user account passwords.
  6. Inject all configuration parameters that are secrets as environment variables/secrets in the startup phase of the Docker containers. Do not store secrets in the appsettings.json config file, and do not inject them as environment variables during the Docker build stage, as this leaves the secrets exposed in the built Docker image.
  7. If the REST API service is enabled, consider using an externally provided signing key for access token signing. This relieves Katalogue from storing the private key.
  8. Set the ENCRYPTION_KEY config variable to a long, cryptographically random string.
  9. Configure CORS properly.
  10. Limit the number of admin users to as few as possible.
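
For step 8, any cryptographically secure random source can produce the key. The sketch below uses the built-in Node.js crypto module. The function name generateEncryptionKey, the 48-byte default and the base64url encoding are illustrative assumptions, not Katalogue requirements.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper: draws `byteLength` bytes from the OS CSPRNG and
// encodes them as base64url, so the value is safe to paste into a secret store.
function generateEncryptionKey(byteLength: number = 48): string {
  if (!Number.isInteger(byteLength) || byteLength < 32) {
    throw new RangeError("byteLength must be an integer of at least 32");
  }
  return randomBytes(byteLength).toString("base64url");
}

// Print one key; store it as the ENCRYPTION_KEY secret, never in the image.
console.log(generateEncryptionKey());
```

The generated value should go straight into the secret store that feeds the container at startup (step 6), not into a file that is committed or baked into the image.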

If all of the steps above are followed, the only secrets Katalogue needs to handle are the following:

  • ENCRYPTION_KEY - The encryption key used to encrypt JWT cookies for authenticating requests from the frontend service.
  • REPOSITORY_PASSWORD - The repository database user password.
  • OIDC_CLIENT_SECRET - The Microsoft Entra Id app registration’s client secret.
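
As an illustration of handling these secrets at container startup, the sketch below reads them from an environment-like object and fails fast if any are missing. Only the three variable names come from the list above. The loadSecrets function and its behavior are hypothetical and do not describe Katalogue's actual startup code.

```typescript
// Variable names come from the list above; everything else is illustrative.
const REQUIRED_SECRETS = [
  "ENCRYPTION_KEY",
  "REPOSITORY_PASSWORD",
  "OIDC_CLIENT_SECRET",
] as const;

type SecretName = (typeof REQUIRED_SECRETS)[number];

// Hypothetical helper: returns the secrets, or throws listing the missing names.
function loadSecrets(
  env: Record<string, string | undefined>
): Record<SecretName, string> {
  const missing = REQUIRED_SECRETS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report only the variable names, never any values.
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }
  const secrets = {} as Record<SecretName, string>;
  for (const name of REQUIRED_SECRETS) {
    secrets[name] = env[name] as string;
  }
  return secrets;
}
```

At startup this would be called as loadSecrets(process.env), so that the values come from the container runtime rather than from the image or from appsettings.json.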

The rest of this section covers a number of technical aspects to give better insight into what has been done to secure the application. All authentication, authorization and user input validation are done on the backend, and in some cases also on the frontend for better UX.

The backend service uses access tokens and refresh tokens to authenticate requests from the frontend service.

  • Access token and refresh token TTL is configurable.
  • The tokens are JWTs (JSON Web Tokens) in encrypted cookies.
  • CSRF (Cross-Site Request Forgery) protection is included.

The REST API service uses the OAuth2 client credentials flow with encrypted JWT access tokens.

For requests from the frontend service, the backend service authorizes users to access resources by means of user roles.

For requests via the REST API service, the backend service authorizes access to resources by means of OAuth2 scopes.

  • Secrets are never stored in plain text anywhere.
  • Secrets are never logged anywhere, not even in encrypted/hashed state.
  • Secrets are never sent anywhere; they are only handled internally in the backend service. The only exception is when they are sent from the frontend to the backend on input.
  • Local user passwords are one-way hashed with the Node.js bcryptjs library.
  • Datasource connection passwords and other secrets that need to be retrievable are encrypted with the built-in Node.js crypto library. It uses the aes-256-cbc algorithm in combination with an encryption key that needs to be provided as a configuration parameter/secret to Katalogue.
  • All random strings used in a security context are generated with cryptographically safe algorithms.
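
To make the retrievable-secret case concrete, here is a minimal sketch of aes-256-cbc encryption with the built-in Node.js crypto module. It is not Katalogue's implementation: the function names, the scrypt key derivation and the "salt.iv.ciphertext" payload layout are assumptions made for illustration.

```typescript
import {
  createCipheriv,
  createDecipheriv,
  randomBytes,
  scryptSync,
} from "node:crypto";

const ALGORITHM = "aes-256-cbc";

// Assumption: the 32-byte AES key is derived from the configured key string
// with scrypt and a per-secret random salt.
function deriveKey(encryptionKey: string, salt: Buffer): Buffer {
  return scryptSync(encryptionKey, salt, 32);
}

// Hypothetical helper: returns "salt.iv.ciphertext", each part base64-encoded.
function encryptSecret(plaintext: string, encryptionKey: string): string {
  const salt = randomBytes(16);
  const iv = randomBytes(16); // CBC needs a fresh, unpredictable 16-byte IV
  const cipher = createCipheriv(ALGORITHM, deriveKey(encryptionKey, salt), iv);
  const ciphertext = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  return [salt, iv, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Hypothetical helper: reverses encryptSecret for the same encryption key.
function decryptSecret(payload: string, encryptionKey: string): string {
  const parts = payload.split(".");
  if (parts.length !== 3) {
    throw new Error("Malformed payload");
  }
  const [salt, iv, ciphertext] = parts.map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv(ALGORITHM, deriveKey(encryptionKey, salt), iv);
  return Buffer.concat([
    decipher.update(ciphertext),
    decipher.final(),
  ]).toString("utf8");
}
```

Because the salt and IV are random, encrypting the same password twice gives different payloads. Note that CBC mode on its own provides confidentiality only and does not detect tampering with the stored ciphertext.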

Security headers for all HTTP requests to the backend service (both requests from the frontend service and requests via the REST API service) are set by the Node.js helmet package. Katalogue uses the default settings.

Rate limiting is in the backlog, but is currently not implemented.

Logs are an important source of information for discovering and possibly preventing a security breach, but they can also be a source of security leaks themselves.

On the former note, see the Logging section for configuration options and for what is logged and where.

On the latter note, Katalogue never logs secrets. They are omitted and replaced with a placeholder string.
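
One common way to implement this kind of omission is to redact values by key name before an object is written to the log. The sketch below illustrates that idea only. The redact function, the key pattern and the "[REDACTED]" placeholder are hypothetical and are not taken from Katalogue's source code.

```typescript
// Hypothetical placeholder and key pattern, chosen for this example only.
const PLACEHOLDER = "[REDACTED]";
const SENSITIVE_KEY = /(password|secret|token|key|authorization)/i;

// Returns a copy of plain JSON-like data in which every value stored under a
// sensitive-looking key is replaced by the placeholder. Not cycle-safe.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value !== null && typeof value === "object") {
    const copy: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      copy[key] = SENSITIVE_KEY.test(key) ? PLACEHOLDER : redact(inner);
    }
    return copy;
  }
  return value;
}
```

A logger wrapper would call redact on every structured payload before serializing it, so the original object is never modified and the secret never reaches the log sink.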

Error messages presented to users are designed to reveal as little information about the internals of the application as possible. Error messages related to authentication and authorization errors are brief and do not give details on exactly what went wrong (e.g. the message is the same if the user does not exist as if the user is not properly authenticated to access a resource).
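
The principle can be sketched as a mapping from a detailed internal failure reason to one generic client-facing message. The type, function name and message wording below are hypothetical and only illustrate the pattern.

```typescript
// Hypothetical internal failure reasons; never sent to the client.
type AuthFailureReason = "unknown_user" | "bad_credentials" | "forbidden";

// Hypothetical wording for the single client-facing message.
const GENERIC_AUTH_MESSAGE = "Access denied.";

interface AuthError {
  response: { message: string }; // what the user sees
  logDetail: string; // what stays on the server
}

// Every reason maps to the same response body; only the log detail differs.
function buildAuthError(reason: AuthFailureReason): AuthError {
  return {
    response: { message: GENERIC_AUTH_MESSAGE },
    logDetail: `auth failure: ${reason}`,
  };
}
```

Since the response body is identical for every reason, a caller cannot use it to tell whether a given username exists.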

All user input is sanitized to prevent injection attacks.

Katalogue does not collect telemetry data.

Personally Identifiable Information in Katalogue is mainly related to users and user accounts. This data is deleted when the user is deleted from Katalogue, with a few exceptions:

  • Externally provisioned users that have been deleted in e.g. Microsoft Entra Id are deleted in Katalogue when a sync task is run. However, if the user is assigned to assets in Katalogue, e.g. as owner of an asset, the user is only disabled until all assets have been transferred to another user.
  • Local users must be deleted manually in Katalogue.
  • The changelog table is a complete log of all data changes that have ever been made in Katalogue. Deleted users remain in this table and must be removed from it manually.
  • The username (with the default attribute mapping to Microsoft Entra Id, this will be the email address) of the user that last made a change to an asset is stored in the modified_by_user_username column in all asset tables. This is not deleted when the user is deleted.

These are the main tables in the Katalogue repository database that store data that might be defined as PII in your organization:

| Table | Description | PII data stored |
| --- | --- | --- |
| public.user | Main user table with user information | source id, name, username, email, photo, title, department |
| public.changelog | History table with snapshots of all changes to the user table | source id, name, username, email, photo, title, department |
| stage.raw_user | Stage table to handle user syncing. This table is truncated at the beginning of the sync task, but data is kept until the next sync job starts for debugging purposes. | source id, name, username, email, photo, title, department |
| stage.raw_user_group_member | Stage table to handle user group membership syncing. This table is truncated at the beginning of the sync task, but data is kept until the next sync job starts for debugging purposes. | user source id (or attribute that is mapped to the Katalogue user source id) from Microsoft Entra Id |
| public.http_request | Log table to log all HTTP requests to the backend API. Logging to this table is disabled by default. | username, IP address |

In addition to this, logs may contain PII data. Katalogue rotates the file logs regularly (every 14 days by default) but logs captured by external logging tools or console logs stored in the container service might store PII data for a long time.

Finally, Katalogue may also store PII data if such data has been entered in any of the metadata attributes, like table descriptions, synced with Katalogue, or if someone enters PII data in descriptions in Katalogue. Detecting and handling such data is not considered to be part of the application but should rather be controlled by policy and processes.