Helm configuration#

Gafaelfawr is configured as a Phalanx application, using the Helm chart in the Phalanx repository. You will need to provide a values-environment.yaml file for your Phalanx environment. For examples, see the other values-environment.yaml files in that directory.

In the examples below, the full key hierarchy is shown for each setting. For example:

config:
  cilogon:
    test: true

When writing a values-environment.yaml file, you should coalesce all settings so that each level of the hierarchy appears only once. For example, there should be one top-level config: key and all parameters that start with config. should go under that key.
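
As a rough illustration of coalescing, two settings that appear separately on this page, config.cilogon.test and config.allowSubdomains, would be merged under one config: key rather than written as two blocks. The values are placeholders.

config:
  # Single top-level key; every config.* setting nests beneath it.
  allowSubdomains: true
  cilogon:
    test: true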

You should also read the Gafaelfawr application documentation. In particular, when bootstrapping a new Phalanx environment, see the Gafaelfawr bootstrapping instructions.

Subdomains#

By default, Gafaelfawr only supports cookie-based authentication to a single domain. The base hostname of the Rubin Science Platform environment is the only host to which any authentication cookies will be sent by user browsers, and therefore only ingresses for that hostname can use cookie authentication.

Anonymous ingresses and ingresses that only allow token authentication via the Authorization header can use any hostname, at least from Gafaelfawr’s perspective.

Optionally, Gafaelfawr can extend cookie authentication to subdomains of the base hostname of the Science Platform environment. To do this, set the config.allowSubdomains configuration option:

config:
  allowSubdomains: true

With this configuration, cookies will also be sent to any subdomain of the base hostname.

For example, if the base hostname of the Science Platform environment is rsp.example.com, by default an ingress at portal.rsp.example.com cannot use cookie authentication. If config.allowSubdomains is set to true, cookie authentication will work for portal.rsp.example.com. It will continue to not work for portal.example.com or any other hostname that is not a subdomain of rsp.example.com.

Warning

This option is only safe to enable if every web service hosted below the base hostname of the Science Platform environment is managed by Gafaelfawr. When this option is enabled, any web server at any subdomain of the base hostname will receive the user’s authentication cookies by default, and can use those cookies to impersonate any user who visits that web server. Gafaelfawr filters those cookies out of the requests it passes on (see Header filtering), so services behind Gafaelfawr cannot steal cookies in this way, but Gafaelfawr cannot do anything about web services that are not protected by it with at least an anonymous ingress.

Therefore, before enabling this setting, ensure that you have tight control over creation of new DNS entries in any subdomain (even multiple levels down) from the base hostname of the Science Platform environment, and ensure that all of those hostnames point to Phalanx-managed Kubernetes services for that Science Platform environment.

Environment URLs and storage#

Base internal URL#

Gafaelfawr needs to know the internal cluster DNS domain when creating Ingress resources from GafaelfawrIngress resources. By default, Gafaelfawr assumes that the cluster DNS domain is svc.cluster.local and the address to Gafaelfawr can be constructed by adding the name of the service and the name of the Gafaelfawr deployment namespace to the front of that domain. If your cluster sets it to something else (by using the --cluster-domain flag, for example), or if you are running Gafaelfawr in a vCluster but running the ingress outside of that vCluster, you will need to override the internal URL to Gafaelfawr by setting config.baseInternalUrl.

config:
  baseInternalUrl: "http://gafaelfawr.gafaelfawr.svc.example.com:8080"

The first component of the host name is the name of the Service resource and therefore must be gafaelfawr. Always use a port of 8080.

Database URL#

Set the URL to the PostgreSQL database that Gafaelfawr will use:

config:
  databaseUrl: "postgresql://gafaelfawr@example.com/gafaelfawr"

Do not include the password in the URL; instead, put the password in the database-password key in the Vault secret. If you are using Cloud SQL with the Cloud SQL Auth Proxy (see Cloud SQL), use localhost for the hostname portion.

Alternatively, if Gafaelfawr should use the cluster-internal PostgreSQL service, omit the config.databaseUrl setting and instead add:

config:
  internalDatabase: true

This option is primarily for test and development deployments and is not recommended for production use.

To enable database schema creation or upgrades, add:

config:
  upgradeSchema: true

This will enable a Helm pre-install and pre-upgrade hook that will initialize or update the database schema before the rest of Gafaelfawr is installed or updated. This setting should be left off by default and only enabled when you know you want to initialize the database from scratch or update the schema. When updating the schema of an existing installation, all Gafaelfawr components should be stopped before syncing Gafaelfawr. See the Phalanx documentation for step-by-step instructions.

Cloud SQL#

If the PostgreSQL database that Gafaelfawr should use is a Google Cloud SQL database, Gafaelfawr supports using the Cloud SQL Auth Proxy via Workload Identity.

First, follow the normal setup instructions for Cloud SQL Auth Proxy using Workload Identity. You do not need to create the Kubernetes service account; two service accounts will be created by the Gafaelfawr Helm chart. The names of those service accounts are gafaelfawr and gafaelfawr-operator, both in Gafaelfawr’s Kubernetes namespace (by default, gafaelfawr).

Then, once you have the name of the Google service account for the Cloud SQL Auth Proxy (created in the above instructions), enable the Cloud SQL Auth Proxy sidecar in the Gafaelfawr Helm chart. An example configuration:

cloudsql:
  enabled: true
  instanceConnectionName: "dev-7696:us-central1:dev-e9e11de2"
  serviceAccount: "gafaelfawr@dev-7696.iam.gserviceaccount.com"

Replace instanceConnectionName and serviceAccount with the values for your environment. You will still need to set config.databaseUrl and the database-password key in the Vault secret with appropriate values, but use localhost for the hostname in config.databaseUrl.
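
Putting these pieces together, a Cloud SQL deployment might look roughly like the following sketch. The instance connection name and service account are the example values from above, and the database user and database name are assumed to both be gafaelfawr, as in the earlier config.databaseUrl example. Substitute your own values.

cloudsql:
  enabled: true
  instanceConnectionName: "dev-7696:us-central1:dev-e9e11de2"
  serviceAccount: "gafaelfawr@dev-7696.iam.gserviceaccount.com"
config:
  # Connections go through the Auth Proxy sidecar, so the host is localhost.
  databaseUrl: "postgresql://gafaelfawr@localhost/gafaelfawr"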

As mentioned in the Google documentation, the Cloud SQL Auth Proxy does not support IAM authentication to the database, only password authentication, and IAM authentication is not recommended for connection pools for long-lived processes. Gafaelfawr therefore doesn’t support IAM authentication to the database.

Redis storage#

For any Gafaelfawr deployment other than a test instance, you will want to configure persistent storage for Redis. Otherwise, each upgrade of Gafaelfawr’s Redis component will invalidate all of the tokens.

By default, the Gafaelfawr Helm chart uses auto-provisioning to create a PersistentVolumeClaim with the default storage class, requesting 1GiB of storage with the ReadWriteOnce access mode. If this is suitable for your deployment, you can leave the configuration as is. Otherwise, you can adjust the size (you probably won’t need to make it larger; Gafaelfawr’s storage needs are modest), storage class, or access mode by setting redis.persistence.size, redis.persistence.storageClass, and redis.persistence.accessMode.
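
For instance, a deployment that needs a specific storage class might override the three settings as follows. The storage class name and the size are placeholders chosen for illustration, and the size is written as a Kubernetes quantity, which is an assumption about the format the chart expects.

redis:
  persistence:
    size: "2Gi"                  # placeholder size
    storageClass: "standard"     # hypothetical storage class name
    accessMode: "ReadWriteOnce"  # same as the documented default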

If you instead want to manage the persistent volume directly rather than using auto-provisioning, use a configuration such as:

redis:
  persistence:
    volumeClaimName: "gafaelfawr-pvc"

to point to an existing PersistentVolumeClaim. You can then create that PersistentVolumeClaim and its associated PersistentVolume via any mechanism you choose, and the volume pointed to by that claim will be mounted as the Redis volume. Gafaelfawr uses the standard Redis Docker image, so the volume must be writable by UID 999, GID 999 (which the StatefulSet will attempt to ensure using the Kubernetes fsGroup setting).
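
If you manage the claim yourself, a minimal PersistentVolumeClaim manifest could look like the sketch below. The name matches the volumeClaimName above and the namespace assumes the default gafaelfawr namespace. The size is an assumption, and the storage class is omitted so that the cluster default applies; adapt both to your environment.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gafaelfawr-pvc
  namespace: gafaelfawr
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi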

Finally, if you do have a test installation where you don’t mind invalidating all tokens whenever Redis is restarted, you can use:

redis:
  persistence:
    enabled: false

This will use an ephemeral emptyDir volume for Redis storage.

Token lifetime#

Change the token lifetime by setting config.tokenLifetime. The default is 30 days.

config:
  tokenLifetime: 23h

Supported interval suffixes are w (weeks), d (days), h (hours), m (minutes), and s (seconds). Several values can be specified together. For example, 1d6h23m specifies a token lifetime of one day, six hours, and 23 minutes.
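
As a worked example of a combined interval, 1d6h23m is 86,400 + 21,600 + 1,380 = 109,380 seconds. In the values file it might be written as follows. Quoting the value is a stylistic choice here, made so that YAML always reads it as a string.

config:
  tokenLifetime: "1d6h23m"  # one day, six hours, 23 minutes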

Authentication provider#

Configure GitHub, CILogon, or OpenID Connect as the upstream provider.

GitHub#

config:
  github:
    clientId: "<github-client-id>"

Set clientId to the client ID that GitHub issued for this deployment.

When GitHub is used as the provider, group membership will be synthesized from GitHub team membership. See Groups from GitHub for more information.

CILogon#

config:
  cilogon:
    clientId: "<cilogon-client-id>"

Set clientId to the client ID that CILogon issued for this deployment.

CILogon support assumes that COmanage is being used as the identity management system. Additional information about the authenticated user will be obtained from LDAP (see LDAP).

CILogon has some additional options under config.cilogon that you may want to set:

config.cilogon.loginParams: A mapping of additional parameters to send to the CILogon authorize route. Can be used to set parameters like skin or selected_idp. See the CILogon OIDC documentation for more information.
config.cilogon.enrollmentUrl: If a username was not found for the CILogon unique identifier, redirect the user to this URL. This is intended for deployments using CILogon with COmanage for identity management. The enrollment URL will normally be the initial URL for a COmanage user-initiated enrollment flow.
config.cilogon.usernameClaim: The claim of the OpenID Connect ID token from which to take the username. The default is username.
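
As an illustration, the options above might be combined as follows. The skin name and enrollment URL are hypothetical placeholders, not values from a real deployment.

config:
  cilogon:
    clientId: "<cilogon-client-id>"
    loginParams:
      skin: "<cilogon-skin-name>"
    enrollmentUrl: "https://id.example.com/enroll"  # hypothetical enrollment flow URL
    usernameClaim: "username"  # the documented default, shown explicitly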

Generic OpenID Connect#

Gafaelfawr should be able to support most OpenID Connect servers as sources of authentication. This support has primarily been tested with Keycloak.

config:
  oidc:
    clientId: "<oidc-client-id>"
    audience: "<oidc-client-audience>"
    loginUrl: "<oidc-login-url>"
    tokenUrl: "<oidc-token-url>"
    issuer: "<oidc-issuer>"
    scopes:
      - "<scope-to-request>"
      - "<scope-to-request>"

Additional information for the user must come from LDAP (see LDAP).

There are some additional options under config.oidc that you may want to set:

config.oidc.loginParams: A mapping of additional parameters to send to the login route. Can be used to set additional configuration options for some OpenID Connect providers.
config.oidc.enrollmentUrl: If a username was not found for the unique identifier in the sub claim of the OpenID Connect ID token, redirect the user to this URL. This could, for example, be a form where the user can register for access to the deployment, or a page explaining how a user can get access.
config.oidc.usernameClaim: The claim of the OpenID Connect ID token from which to take the username. The default is uid.
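
Since this support has mostly been tested with Keycloak, the following sketch shows what a Keycloak-backed configuration might look like. The hostname, realm name (example), client ID, audience, enrollment URL, and the choice of preferred_username are all assumptions. The URL paths follow the layout used by recent Keycloak releases, so check your server’s discovery document for the real values.

config:
  oidc:
    clientId: "rsp-example"  # hypothetical client ID
    audience: "rsp-example"  # assumed to match the client ID
    loginUrl: "https://keycloak.example.com/realms/example/protocol/openid-connect/auth"
    tokenUrl: "https://keycloak.example.com/realms/example/protocol/openid-connect/token"
    issuer: "https://keycloak.example.com/realms/example"
    scopes:
      - "openid"
    enrollmentUrl: "https://rsp.example.com/register"  # hypothetical
    usernameClaim: "preferred_username"  # assumption; the default is uid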

Error pages#

To add additional information to the error page from a failed login, set config.errorFooter to a string. This string will be embedded verbatim, inside a <p> tag, in all login error messages. It may include HTML and will not be escaped. This is a suitable place to direct the user to support information or bug reporting instructions.
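
For example, a footer pointing users at a support address might look like this. The address is a placeholder. Because the string is inserted without escaping, it should contain only trusted HTML.

config:
  # Embedded verbatim inside a <p> tag on login error pages.
  errorFooter: 'Need help? Contact <a href="mailto:support@example.com">support</a>.'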

Administrators#

You may want to define the initial set of administrators:

config:
  initialAdmins:
    - "username"
    - "otheruser"

This makes the users username and otheruser (as authenticated by the upstream authentication provider; see Authentication provider) admins, meaning that they can create, delete, and modify any authentication tokens. This value is only used when initializing a new Gafaelfawr database that does not contain any admins. Setting this is optional; you can instead use the bootstrap token (see Bootstrapping) to perform any administrative actions through the API.

LDAP#

When using OpenID Connect (either CILogon or generic), metadata about users (full name, email address, group membership, UID and GID, etc.) must come from an LDAP server. If the GitHub authentication provider is used, this information instead comes from GitHub and LDAP is not supported.

LDAP authentication#

Note

This section describes how the Gafaelfawr service itself authenticates to the LDAP server. Users are never authenticated using LDAP. User authentication always uses OpenID Connect or GitHub.

Gafaelfawr supports anonymous binds, simple binds (username and password), or Kerberos GSSAPI binds.

To use anonymous binds (the default), just specify the URL of the LDAP server with no additional bind configuration.

config:
  ldap:
    url: "ldaps://<ldap-server>"

To use simple binds, also specify the DN of the user to bind as. If this is set, ldap-password must be set in the Gafaelfawr Vault secret to the password to use with the simple bind.

config:
  ldap:
    url: "ldaps://<ldap-server>"
    userDn: "<bind-dn-of-user>"

To use Kerberos GSSAPI binds, provide a krb5.conf file that contains the necessary information to connect to your Kerberos server. Normally at least default_realm should be set. Including a full copy of your standard /etc/krb5.conf file should work. If this is set, ldap-keytab must be set in the Gafaelfawr Vault secret to the contents of a Kerberos keytab file to use for authentication to the LDAP server.

config:
  ldap:
    url: "ldaps://<ldap-server>"
    kerberosConfig: |
      [libdefaults]
        default_realm = EXAMPLE.ORG

      [realms]
        EXAMPLE.ORG = {
          kdc = kerberos.example.org
          kdc = kerberos-1.example.org
          kdc = kerberos-2.example.org
          default_domain = example.org
        }

LDAP groups#

Gafaelfawr must be told what the base DN of the group tree in LDAP is so that it can find a user’s group membership.

config:
  ldap:
    groupBaseDn: "<base-dn-for-search>"

By default, the GID number of the group is taken from the gidNumber attribute of the group. If Firestore support is enabled, the GIDs in LDAP are ignored and Gafaelfawr allocates GIDs from Firestore instead.

You may need to set the following additional options under config.ldap depending on your LDAP schema:

config.ldap.groupObjectClass: The object class from which group information should be looked up. Default: posixGroup.
config.ldap.groupMemberAttr: The member attribute of that object class. The values must match the username returned in the token from the OpenID Connect authentication server, or (if config.ldap.groupSearchByDn is set) the user DN formed from that username and the configuration options described in LDAP user information. Default: member.
config.ldap.groupSearchByDn: By default, Gafaelfawr searches the config.ldap.groupMemberAttr attribute for the user’s DN, formed by combining the username with config.ldap.userSearchAttr (as the attribute name for the first DN component containing the username) and config.ldap.userBaseDn (for the rest of the DN). This is the configuration used by most LDAP servers. If this option is set to false, the group tree is searched for the bare username instead. Default: true.
config.ldap.addUserGroup: If set to true, add an additional group to the user’s group membership with a name equal to their username and a GID equal to their UID (provided they have a UID; if not, no group is added). Use this in environments with user private groups that do not appear in LDAP. In order to safely use this option, the GIDs of regular groups must be disjoint from user UIDs so that the user’s UID can safely be used as the GID of this synthetic group. Default: false.

The name of each group will be taken from the cn attribute and the GID will be taken from the gidNumber attribute.
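
As a sketch of how these options fit together, consider a hypothetical directory whose posixGroup entries list bare usernames in memberUid rather than full DNs in member. The base DN is a placeholder and the schema is an assumption made for illustration.

config:
  ldap:
    groupBaseDn: "ou=groups,dc=example,dc=org"
    groupObjectClass: "posixGroup"
    groupMemberAttr: "memberUid"  # holds bare usernames in this assumed schema
    groupSearchByDn: false        # so search for the username, not the user DN
    addUserGroup: true            # only safe if group GIDs and user UIDs are disjoint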

LDAP user information#

For any authentication mechanism other than GitHub, Gafaelfawr looks up the user’s name, email, and, optionally, the numeric UID and GID in LDAP. Name and email are optional and allowed to be missing. To do this, Gafaelfawr must be told the base DN of the user tree in LDAP:

config:
  ldap:
    userBaseDn: "<base-dn-for-search>"

By default, this will get the name from the displayName attribute, the email from the mail attribute, the UID from the uidNumber attribute, and the primary GID from the gidNumber attribute. These attribute names can be overridden; see below. If any have multiple values, the first one will be used.

If this GID does not match the GID of any of the user’s groups, the corresponding group will be looked up in LDAP by GID and added to the user’s group list. This handles LDAP configurations where only supplemental group memberships are recorded in LDAP, and the primary group membership is recorded only via the user’s GID.

If config.ldap.gidAttr is set to null or the primary GID is missing from LDAP, but user private groups are enabled with addUserGroup: true, the primary GID will be set to the user’s UID, which is also the GID of the synthetic user private group. Otherwise, the primary GID will be left unset, which may break applications that require a primary GID.

If Firestore support is enabled, the UID and GID in LDAP are ignored and Gafaelfawr allocates UIDs and GIDs from Firestore instead.

You may need to set the following additional options under config.ldap depending on your LDAP schema:

config.ldap.emailAttr: The attribute from which to get the user’s email address. Set to null to not look up email addresses. Default: mail.
config.ldap.gidAttr: The attribute holding the user’s primary GID number. Set to null to not look up primary GID numbers from LDAP, although be aware that some services may require a primary GID. This attribute is only used if Firestore is not used for UID and GID assignment and config.ldap.addUserGroup is not set. Default: gidNumber.
config.ldap.nameAttr: The attribute from which to get the user’s full name. This attribute should hold the whole name that should be used, not just a surname or family name (which are not universally valid concepts anyway). Set to null to not look up full names. Default: displayName.
config.ldap.uidAttr: The attribute holding the user’s UID number. This can be set to null if UIDs should instead come from Firestore. Default: uidNumber.
config.ldap.userSearchAttr: The attribute holding the username, used to find the user’s entry. If config.ldap.groupSearchByDn is true (the default), this should also be the attribute used to construct the user DN. Default: uid.
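
To make the DN construction concrete, assume a hypothetical user tree rooted at ou=people,dc=example,dc=org. With the settings below, the user someuser would be found by searching for uid=someuser and, when groups are searched by DN, would be matched as uid=someuser,ou=people,dc=example,dc=org. The attribute choices are illustrative rather than recommendations.

config:
  ldap:
    userBaseDn: "ou=people,dc=example,dc=org"
    userSearchAttr: "uid"
    nameAttr: "cn"        # assumed: full name stored in cn instead of displayName
    emailAttr: "mail"
    uidAttr: "uidNumber"
    gidAttr: null         # do not look up a primary GID in LDAP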

Firestore UID/GID assignment#

Gafaelfawr can manage UID and GID assignment internally, using Google Firestore as the storage mechanism. Cloud SQL must also be enabled. The same service account used for Cloud SQL must have read/write permissions to Firestore.

When this support is enabled, Gafaelfawr ignores any UID and GID information from GitHub or LDAP, and instead assigns UIDs and GIDs to users and groups by name the first time that a given username or group name is seen. UIDs and GIDs are never reused. They are assigned from the ranges documented in DMTN-225.

To enable use of Firestore for UID/GID assignment, add the following configuration:

config:
  firestore:
    project: "<google-project-id>"

Set <google-project-id> to the name of the Google project for the Firestore data store. (Best practice is to make a dedicated project solely for Firestore, since there can only be one Firestore instance per Google project.)

Scopes#

Gafaelfawr takes group information from the upstream authentication provider or from LDAP and maps it to scopes. Scopes are then used to restrict access to protected services (see Configuring ingress with GafaelfawrIngress).

For a list of scopes used by the Rubin Science Platform, which may also be useful as an example for other deployments, see DMTN-235.

The list of scopes is configured via config.knownScopes, which is an object mapping scope names to human-readable descriptions. Every scope that you want to use must be listed in config.knownScopes. The default includes:

config:
  knownScopes:
    "admin:token": "Can create and modify tokens for any user"
    "admin:userinfo": "Can see user information for any user"
    "user:token": "Can create and modify user tokens"

which are used internally by Gafaelfawr, plus the scopes that are used by the Rubin Science Platform. You can add additional scopes by adding more key/value pairs to the config.knownScopes object in values-<environment>.yaml.
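
For example, a deployment with its own service might register one more scope. The scope name and description here are invented for illustration. Helm normally merges mappings from a values file with the chart defaults, so the default scopes are expected to remain available.

config:
  knownScopes:
    "exec:example": "Can use the hypothetical example service"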

Once the scopes are configured, you will need to set up a mapping from groups to scope names using the config.groupMapping setting. This is a mapping from scope names to lists of groups that provide that scope.

The group can be given in one of two ways: either a simple string giving the name of the group (used for CILogon and OpenID Connect authentication providers), or the GitHub organization and team specified with the following syntax:

github:
  organization: "lsst-sqre"
  team: "friends"

Both organization and team must be given. It is not possible to do access control based only on organizational membership.

The value of organization must be the login attribute of the organization, and the value of team must be the slug attribute of the team. (Generally the latter is the name of the team converted to lowercase with spaces and other special characters replaced with -.)

A complete setting for GitHub might look something like this:

config:
  groupMapping:
    "admin:token":
      - github:
          organization: "lsst-sqre"
          team: "square"
    "admin:userinfo":
      - github:
          organization: "lsst-sqre"
          team: "square"
    "exec:notebook":
      - github:
          organization: "lsst-sqre"
          team: "square"
      - github:
          organization: "lsst-sqre"
          team: "friends"
    "exec:portal":
      - github:
          organization: "lsst-sqre"
          team: "square"
      - github:
          organization: "lsst-sqre"
          team: "friends"
    "read:tap":
      - github:
          organization: "lsst-sqre"
          team: "square"
      - github:
          organization: "lsst-sqre"
          team: "friends"

Be aware that Gafaelfawr will convert these organization and team pairs to group names internally, and applications will see only the converted group names. See Groups from GitHub for more information.

When CILogon or generic OpenID Connect are used as the providers, the group information comes from LDAP. That group membership will then be used to determine scopes via the groupMapping configuration. For those authentication providers, the group names are simple strings. For example, suppose the Gafaelfawr configuration reads:

config:
  groupMapping:
    "exec:admin": ["foo", "bar"]

A user who is a member of the bar and other groups will have the exec:admin scope added to their token when it is issued.

Regardless of the config.groupMapping configuration, the user:token scope will be automatically added to the session token of any user authenticating via OpenID Connect or GitHub. The admin:token scope will be automatically added to any user marked as an admin in Gafaelfawr.

You will probably want to grant admin:userinfo scope to platform administrators so that they can use the /auth/api/v1/users/username route to retrieve the user information for a user from LDAP while debugging problems.
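
A minimal sketch of such a grant for an LDAP-backed deployment, assuming a hypothetical administrators group named g_admins:

config:
  groupMapping:
    "admin:userinfo":
      - "g_admins"  # hypothetical group name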

Quotas#

Gafaelfawr supports calculating user quotas based on group membership and providing quota information through its API. API quotas are also enforced directly by Gafaelfawr. See Quota management for more information.

Setting quotas is optional. Omit this setting if you don’t want to set user quotas.

Default quota#

The default quota setting controls the quotas that all users get when there are no more specific rules (discussed below). All of these keys are optional and may be omitted to not set that type of quota.

The api key should contain a mapping of service names to the number of requests allowed per minute. Each service name must be the same name the service uses in the config.service key of its GafaelfawrIngress resource (see Configuring ingress with GafaelfawrIngress). If a service name has no corresponding quota setting, access to that service will be unrestricted.

The notebook key should contain cpu and memory keys specifying the default CPU and memory limits. The memory limit is given as a floating-point number of GiB.

The tap key should contain a mapping of TAP service names to quotas for that service. Currently, the per-TAP-service quota only supports one key: concurrent, which specifies the number of concurrent TAP queries that a user may have in progress.

For example:

config:
  quota:
    default:
      api:
        datalinker: 100
      notebook:
        cpu: 2.0
        memory: 4.0
      tap:
        sso:
          concurrent: 10

This sets a quota of 100 requests per minute for the datalinker service, no quotas for any other API service, a default limit of 2.0 CPU equivalents and 4.0 GiB of memory for notebooks, and allows 10 concurrent TAP queries per user to the sso TAP service.

The default quota for all API services not listed is unlimited. To set a default quota of 0, explicitly list the API service with a quota of 0.

Group quota#

The groups key sets additional quota granted to members of specific groups. The quota for every group of which the user is a member is added to the default quota. For example:

config:
  quota:
    groups:
      g_developers:
        api:
          datalinker: 50
        notebook:
          cpu: 0.0
          memory: 4.0

If this were combined with the above default quota, members of the g_developers group would receive a total of 150 requests per minute for datalinker, and a total of 8.0 GiB of memory for notebooks. The CPU quota for notebooks and the quota for TAP queries would be unchanged.
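
The arithmetic for a g_developers member can be laid out explicitly. The block below is not a configuration file or Gafaelfawr output; it is only the default and group values from the two examples added together and written in the same shape.

api:
  datalinker: 150   # 100 (default) + 50 (g_developers)
notebook:
  cpu: 2.0          # 2.0 (default) + 0.0 (g_developers)
  memory: 8.0       # 4.0 (default) + 4.0 (g_developers)
tap:
  sso:
    concurrent: 10  # default only; the group sets no TAP quota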

Members of specific groups cannot be granted unrestricted access to an API service since a missing key for a service instead means that this group contributes no additional quota for that service. Instead, grant effectively unlimited access by granting a very large quota number.

Normally, the group quota can only add to the individual quota. There are two exceptions: the spawn flag for notebooks, and any API quotas for services that have no default quotas. Consider the following additional configuration:

config:
  quota:
    groups:
      g_limited:
        api:
          tap: 100
        notebook:
          cpu: 0.0
          memory: 0.0
          spawn: false

If combined with the previous default configuration, members of the g_limited group will have a quota of 100 requests per minute to the tap service. Users who are not members of that group will continue to have unlimited access to the tap service. Also, members of the g_limited group will not be allowed to spawn new notebooks, because their spawn flag is set to false instead of the default of true. Note that cpu and memory are also set because they are required fields, but are set to 0.0 so they don’t add anything to the quota.

Bypass groups#

Finally, some groups can be allowed to bypass all quota limits. This is done with the bypass key.

config:
  quota:
    bypass:
      - "g_admins"

All members of any group listed under bypass will ignore all quota restrictions, including the spawn flag for notebook quotas.

Scaling#

Consider increasing the number of Gafaelfawr processes to run. This improves robustness and performance scaling. Production deployments should use at least two replicas.

replicaCount: 2

Gafaelfawr does not (yet) support Kubernetes auto-scaling.

Resource requests and limits#

Every component of Gafaelfawr defines Kubernetes resource requests and limits. Look for the resources key at the top level of the chart and in the portions of the chart for the underlying Gafaelfawr components.

The default limits and requests were set based on a fairly lightly loaded deployment that uses OpenID Connect as the authentication provider and LDAP for user metadata. For a heavily loaded environment, you may need to increase the resource requests to reflect the expected resource consumption of your instance of Gafaelfawr and allow Kubernetes to do better scheduling. You will hopefully not need to increase the limits, which are generous.
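
An override for the top-level resources key might look like the following sketch. The numbers are arbitrary placeholders, not the chart defaults or a sizing recommendation. The resources keys for the underlying components live elsewhere in the chart and are not shown.

resources:
  requests:
    cpu: "500m"      # placeholder
    memory: "512Mi"  # placeholder
  limits:
    cpu: "1"         # placeholder
    memory: "1Gi"    # placeholder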

Logging and proxies#

The default logging level of Gafaelfawr is info, which will log a message for every action it takes. To change this, set config.logLevel:

config:
  logLevel: "warning"

Valid values are debug (to increase the logging), info (the default), warning, or error. These values can be specified in any case.

Gafaelfawr is deployed behind a proxy server. In order to accurately log the IP address of the client, instead of the IP address of the proxy server, it must know what IP ranges correspond to possible proxy servers rather than clients. Set this with config.proxies:

config:
  proxies:
    - "192.0.2.0/24"

If not set, defaults to the RFC 1918 private address spaces. See Client IP addresses for more details.
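
This page does not say whether a configured list replaces or extends the default. If you still want private addresses treated as proxies, it is safest to assume replacement and list the RFC 1918 ranges explicitly alongside your own:

config:
  proxies:
    - "10.0.0.0/8"      # RFC 1918
    - "172.16.0.0/12"   # RFC 1918
    - "192.168.0.0/16"  # RFC 1918
    - "192.0.2.0/24"    # example proxy range from above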

Alerts, metrics, and tracing#

Metrics#

Gafaelfawr can export events and metrics to Sasquatch, the metrics system for Rubin Observatory. Metrics reporting is disabled by default. To enable it, set config.metrics.enabled to true:

config:
  metrics:
    enabled: true

Gafaelfawr will then use the Kafka user gafaelfawr to authenticate to Kafka and push various events. For a list of all of the events Gafaelfawr exports, see Metrics.

There are some additional configuration settings, which normally will not need to be changed:

config.metrics.application: Name of the application under which to log metrics. Default: gafaelfawr
config.metrics.events.topicPrefix: The prefix for events topics. Generally the only reason to change this is if you’re experimenting with new events in a development environment. Default: lsst.square.metrics.events
config.metrics.schemaManager.registryUrl: URL to the Confluent-compatible Kafka schema registry, used to register the schemas for events during startup. Default: Use the Sasquatch schema registry in the local cluster.
config.metrics.schemaManager.suffix: Suffix to add to all registered subjects. This avoids conflicts with existing registered schemas and may be useful when experimenting with possible event schema changes that are not backwards-compatible. Default: no suffix
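
For a development environment experimenting with event schemas, these options might be combined as in the sketch below. The topic prefix and suffix are made-up values.

config:
  metrics:
    enabled: true
    application: "gafaelfawr"  # the documented default, shown explicitly
    events:
      topicPrefix: "lsst.square.metrics.events.dev"  # hypothetical prefix
    schemaManager:
      suffix: "_dev1"  # hypothetical suffix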

Slack alerts#

Gafaelfawr can optionally report uncaught exceptions to Slack. To enable this, set config.slackAlerts:

config:
  slackAlerts: true

You will also have to set the slack-webhook key in the Gafaelfawr secret to the URL of the incoming webhook to use to post these alerts.

Sentry#

Gafaelfawr can optionally report uncaught exceptions, traces, and performance information to Sentry. To enable this, set config.enableSentry:

config:
  enableSentry: true

You will also have to set the sentry-dsn key in the Gafaelfawr secret to the URL to which the telemetry will be sent.

Maintenance#

Timing#

Gafaelfawr uses two Kubernetes CronJob resources to perform periodic maintenance and consistency checks on its data stores.

The maintenance job records history and deletes active entries for expired tokens, and truncates history tables as needed. By default, it is run hourly at five minutes past the hour. Its schedule can be set with config.maintenance.maintenanceSchedule (a cron schedule expression).

The audit job looks for data inconsistencies and reports them to Slack. Slack alerts must be configured. By default, it runs once a day at 03:00 in the time zone of the Kubernetes cluster. Its schedule can be set with config.maintenance.auditSchedule (a cron schedule expression).
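
In cron syntax, the default timings described above correspond to 5 * * * * (five minutes past every hour) and 0 3 * * * (03:00 daily). A sketch that moves both jobs, using arbitrary example times:

config:
  maintenance:
    maintenanceSchedule: "35 * * * *"  # hourly at 35 minutes past the hour
    auditSchedule: "30 4 * * *"        # daily at 04:30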

Time limits#

By default, Gafaelfawr allows its maintenance and audit jobs five minutes to run, and cleans up any completed jobs older than one day. Kubernetes also deletes completed and failed jobs as necessary to maintain a cap on the number retained, which normally overrides the cleanup timing for the maintenance job that runs hourly.

To change the time limit for maintenance jobs (if, for instance, you have a huge user database or your database is very slow), set config.maintenance.deadlineSeconds to the length of time jobs are allowed to run for. To change the retention time for completed jobs, set config.maintenance.cleanupSeconds to the maximum lifetime of a completed job.
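
For example, doubling both limits from their described defaults (five minutes is 300 seconds, one day is 86,400 seconds) would look like this. The specific numbers are only an illustration.

config:
  maintenance:
    deadlineSeconds: 600    # 10 minutes
    cleanupSeconds: 172800  # 2 days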

OpenID Connect server#

Gafaelfawr can act as an OpenID Connect identity provider for relying parties inside the Kubernetes cluster. To enable this, set config.oidcServer.enabled to true. If this is set, oidc-server-secrets and signing-key must be set in the Gafaelfawr Vault secret.

Gafaelfawr can provide an OpenID Connect ID token claim listing the data releases to which the user has access. To do so, it must be configured with a mapping of group names to data releases to which membership in that group grants access. This is done via the config.oidcServer.dataRightsMapping setting. For example:

config:
  oidcServer:
    dataRightsMapping:
      g_users:
        - "dp0.1"
        - "dp0.2"
        - "dp0.3"
      g_preview:
        - "dp0.1"

This configuration indicates members of the g_preview group have access to the dp0.1 release and members of the g_users group have access to all of dp0.1, dp0.2, and dp0.3. Users have access to the union of data releases across all of their group memberships.

See Configuring OpenID Connect for more information. See DMTN-253 for how this OpenID Connect support can be used by International Data Access Centers.

The following additional options customize the behavior of the OpenID Connect server:

config.oidcServer.issuer: The issuer identity (the iss claim in JWTs). Default: The base URL of the Phalanx environment.
config.oidcServer.keyId: The key ID of the signing key (the kid claim in JWTs). Default: gafaelfawr
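
A sketch that enables the server and overrides both options. The issuer URL and key ID are placeholders.

config:
  oidcServer:
    enabled: true
    issuer: "https://rsp.example.com"  # placeholder issuer
    keyId: "example-key-1"             # placeholder key ID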