Helm configuration¶
Gafaelfawr is configured as a Phalanx application, using the Helm chart in the Phalanx repository.
You will need to provide a values-environment.yaml
file for your Phalanx environment.
For examples, see the other values-environment.yaml
files in that directory.
In the below examples, the full key hierarchy is shown for each setting. For example:
config:
cilogon:
test: true
When writing a values-environment.yaml
chart, you should coalesce all settings so that each level of the hierarchy appears only once.
For example, there should be one top-level config:
key and all parameters that start with config.
should go under that key.
You should also read the Gafaelfawr application documentation. In particular, when bootstrapping a new Phalanx environment, see the Gafaelfawr bootstrapping instructions.
Basic settings¶
Database¶
Set the URL to the PostgreSQL database that Gafaelfawr will use:
config:
databaseUrl: "postgresql://gafaelfawr@example.com/gafaelfawr"
Do not include the password in the URL; instead, put the password in the database-password
key in the Vault secret.
If you are using Cloud SQL with the Cloud SQL Auth Proxy (see Cloud SQL), use localhost
for the hostname portion.
Alternately, if Gafaelfawr should use the cluster-internal PostgreSQL service, omit the config.databaseUrl
setting and instead add:
config:
internalDatabase: true
This option is primarily for test and development deployments and is not recommended for production use.
To enable database schema creation or upgrades, add:
config:
upgradeSchema: true
This will enable a Helm pre-install and pre-upgrade hook that will initialize or update the database schema before the rest of Gafaelfawr is installed or updated. This setting should be left off by default and only enabled when you know you want to initialize the database from scratch or update the schema. When updating the schema of an existing installation, all Gafaelfawr components should be stopped before syncing Gafaelfawr. See the Phalanx documentation for step-by-step instructions.
Error pages¶
To add additional information to the error page from a failed login, set config.errorFooter
to a string.
This string will be embedded verbatim, inside a <p>
tag, in all login error messages.
It may include HTML and will not be escaped.
This is a suitable place to direct the user to support information or bug reporting instructions.
Scaling¶
Consider increasing the number of Gafaelfawr processes to run. This improves robustness and performance scaling. Production deployments should use at least two replicas.
replicaCount: 2
Token lifetime¶
Change the token lifetime by setting config.tokenLifetime
.
The default is 30 days.
config:
tokenLifetime: 23h
Supported interval suffixes are w
(weeks), d
(days), h
(hours), m
(minutes), and s
(seconds).
Several values can be specified together.
For example, 1d6h23m
specifies a token lifetime of one day, six hours, and 23 minutes.
Administrators¶
You may want to define the initial set of administrators:
config:
initialAdmins:
- "username"
- "otheruser"
This makes the users username
and otheruser
(as authenticated by the upstream authentication provider configured below) admins, meaning that they can create, delete, and modify any authentication tokens.
This value is only used when initializing a new Gafaelfawr database that does not contain any admins.
Setting this is optional; you can instead use the bootstrap token (see Bootstrapping) to perform any administrative actions through the API.
Resource requests and limits¶
Every component of Gafaelfawr defines Kubernetes resource requests and limits.
Look for the resources
key at the top level of the chart and in the portions of the chart for the underlying Gafaelfawr components.
The default limits and requests were set based on a fairly lightly loaded deployment that uses OpenID Connect as the authentication provider and LDAP for user metadata. For a heavily-loaded environment, you may need to increase the resource requests to reflect the expected resource consumption of your instance of Gafaelfawr and allow Kubernetes to do better scheduling. You will hopefully not need to increase the limits, which are generous.
Authentication realm¶
The default authentication realm for WWW-Authenticate
headers, which is displayed as part of the HTTP Basic Authentication prompt in browsers, is the hostname of the Phalanx environment in which Gafaelfawr is installed.
This default can be overridden by setting config.realm
.
Base internal URL¶
Gafaelfawr needs to know the internal cluster DNS domain when creating Ingress
resources from GafaelfawrIngress
resources.
By default, Gafaelfawr assumes that the cluster DNS domain is svc.cluster.local
and the address to Gafaelfawr can be constructed by adding the name of the service and the name of the Gafaelfawr deployment namespace to the front of that domain.
If your cluster sets it to something else (by using the --cluster-domain
flag, for example), or if you are running Gafaelfawr in a vCluster but running the ingress outside of that vCluster, you will need to override the internal URL to Gafaelfawr by setting config.baseInternalUrl
.
config:
baseInternalUrl: "http://gafaelfawr.gafaelfawr.svc.example.com:8080"
The first component of the host name is the name of the Service
resource and therefore must be gafaelfawr
.
Always use a port of 8080.
Authentication provider¶
Configure GitHub, CILogon, or OpenID Connect as the upstream provider.
GitHub¶
config:
github:
clientId: "<github-client-id>"
using the GitHub client ID from GitHub.
When GitHub is used as the provider, group membership will be synthesized from GitHub team membership. See Groups from GitHub for more information.
CILogon¶
config:
cilogon:
clientId: "<cilogon-client-id>"
using the CILogon client ID from CILogon.
CILogon support assumes that COmanage is being used as the identity management system. Additional information about the authenticated user will be obtained from LDAP (see LDAP).
CILogon has some additional options under config.cilogon
that you may want to set:
config.cilogon.loginParams
A mapping of additional parameters to send to the CILogon authorize route. Can be used to set parameters like
skin
orselected_idp
. See the CILogon OIDC documentation for more information.config.cilogon.enrollmentUrl
If a username was not found for the CILogon unique identifier, redirect the user to this URL. This is intended for deployments using CILogon with COmanage for identity management. The enrollment URL will normally be the initial URL for a COmanage user-initiated enrollment flow.
config.cilogon.usernameClaim
The claim of the OpenID Connect ID token from which to take the username. The default is
username
.
Generic OpenID Connect¶
Gafaelfawr should be able to support most OpenID Connect servers as sources of authentication. This support has primarily been tested with Keycloak.
config:
oidc:
clientId: "<oidc-client-id>"
audience: "<oidc-client-audience>"
loginUrl: "<oidc-login-url>"
tokenUrl: "<oidc-token-url>"
issuer: "<oidc-issuer>"
scopes:
- "<scope-to-request>"
- "<scope-to-request>"
Additional information for the user must come from LDAP (see LDAP).
There are some additional options under config.oidc
that you may want to set:
config.oidc.loginParams
A mapping of additional parameters to send to the login route. Can be used to set additional configuration options for some OpenID Connect providers.
config.oidc.enrollmentUrl
If a username was not found for the unique identifier in the
sub
claim of the OpenID Connect ID token, redirect the user to this URL. This could, for example, be a form where the user can register for access to the deployment, or a page explaining how a user can get access.config.oidc.usernameClaim
The claim of the OpenID Connect ID token from which to take the username. The default is
uid
.
LDAP¶
When using OpenID Connect (either CILogon or generic), metadata about users (full name, email address, group membership, UID and GID, etc.) must come from an LDAP server. If the GitHub authentication provider is used, this information instead comes from GitHub and LDAP is not supported.
LDAP authentication¶
Note
This section describes how the Gafaelfawr service itself authenticates to the LDAP server. Users are never authenticated using LDAP. User authentication always uses OpenID Connect or GitHub.
Gafaelfawr supports anonymous binds, simple binds (username and password), or Kerberos GSSAPI binds.
To use anonymous binds (the default), just specify the URL of the LDAP server with no additional bind configuration.
config:
ldap:
url: "ldaps://<ldap-server>"
To use simple binds, also specify the DN of the user to bind as.
If this is set, ldap-password
must be set in the Gafaelfawr Vault secret to the password to use with the simple bind.
config:
ldap:
url: "ldaps://<ldap-server>"
userDn: "<bind-dn-of-user>"
To use Kerberos GSSAPI binds, provide a krb5.conf
file that contains the necessary information to connect to your Kerberos server.
Normally at least default_realm
should be set.
Including a full copy of your standard /etc/krb5.conf
file should work.
If this is set, ldap-keytab
must be set in the Gafaelfawr Vault secret to the contents of a Kerberos keytab file to use for authentication to the LDAP server.
config:
ldap:
url: "ldaps://<ldap-server>"
kerberosConfig: |
[libdefaults]
default_realm = EXAMPLE.ORG
[realms]
EXAMPLE.ORG = {
kdc = kerberos.example.org
kdc = kerberos-1.example.org
kdc = kerberos-2.example.org
default_domain = example.org
}
LDAP groups¶
Gafaelfawr must be told what the base DN of the group tree in LDAP is so that it can find a user’s group membership.
config:
ldap:
groupBaseDn: "<base-dn-for-search>"
You may need to set the following additional options under config.ldap
depending on your LDAP schema:
By default, the GID number of the group is taken from the gidNumber
attribute of the group.
If Firestore support is enabled, the GIDs in LDAP are ignored and Gafaelfawr allocates GIDs from Firestore instead.
config.ldap.groupObjectClass
The object class from which group information should be looked up. Default:
posixGroup
.config.ldap.groupMemberAttr
The member attribute of that object class. The values must match the username returned in the token from the OpenID Connect authentication server, or (if
config.ldap.groupSearchByDn
is set) the user DN formed from that username and the configuration options described in LDAP user information. Default:member
.config.ldap.groupSearchByDn
By default, Gafaelfawr searches the
config.ldap.groupMemberAttr
attribute for the user’s DN (formed by combining the username withconfig.ldap.userSearchAttr
(as the attribute name for the first DN component containing the username) andconfig.ldap.userBaseDn
(for the rest of the DN). This is the configuration used by most LDAP servers. If this option is set to false, the group tree is searched for the bare username instead.config.ldap.addUserGroup
If set to true, add an additional group to the user’s group membership with a name equal to their username and a GID equal to their UID (provided they have a UID; if not, no group is added). Use this in environments with user private groups that do not appear in LDAP. In order to safely use this option, the GIDs of regular groups must be disjoint from user UIDs so that the user’s UID can safely be used as the GID of this synthetic group. Default: false.
The name of each group will be taken from the cn
attribute and the GID will be taken from the gidNumber
attribute.
LDAP user information¶
For any authentication mechanism other than GitHub, Gafaelfawr looks up the user’s name, email, and, optionally, the numeric UID and GID in LDAP. Name and email are optional and allowed to be missing. To do this, Gafaelfawr must be told the base DN of the user tree in LDAP:
config:
ldap:
userBaseDn: "<base-dn-for-search>"
By default, this will get the name from the displayName
attribute, the email (from the mail
attribute, the UID from the uidNumber
attribute, and the primary GID from the gidNumber
attribute.
These attribute names be overridden; see below.
If any have multiple values, the first one will be used.
If this GID does not match the GID of any of the user’s groups, the corresponding group will be looked up in LDAP by GID and added to the user’s group list. This handles LDAP configurations where only supplemental group memberships are recorded in LDAP, and the primary group membership is recorded only via the user’s GID.
If config.ldap.gidAttr
is set to null or the primary GID is missing from LDAP, but user private groups is enabled with addUserGroup: true
, the primary GID will be set to the same as the UID.
This is the same as the GID of the synthetic user private group.
Otherwise, the primary GID will be left unset, which may break applications that require a primary GID.
If Firestore support is enabled, the UID and GID in LDAP are ignored and Gafaelfawr allocates UIDs and GIDs from Firestore instead.
You may need to set the following additional options under config.ldap
depending on your LDAP schema:
config.ldap.emailAttr
The attribute from which to get the user’s email address. Set to null to not look up email addresses. Default:
mail
.config.ldap.gidAttr
The attribute holding the user’s primary GID number. Set to null to not look up primary GID numbers from LDAP, although be aware that some services may require a primary GID. This attribute is only used if Firestore is not used for UID and GID assignment and
config.ldap.addUserGroup
is not set. Default:gidNumber
.config.ldap.nameAttr
The attribute from which to get the user’s full name. This attribute should hold the whole name that should be used, not just a surname or family name (which are not universally valid concepts anyway). Set to null to not look up full names. Default:
displayName
.config.ldap.uidAttr
The attribute holding the user’s UID number. This can be set to null if UIDs should instead come from Firestore. Default:
uidNumber
.config.ldap.userSearchAttr
The attribute holding the username, used to find the user’s entry. If
config.ldap.groupSearchByDn
is true (the default), this should also be the attribute used to construct the user DN. Default:uid
.
Firestore UID/GID assignment¶
Gafaelfawr can manage UID and GID assignment internally, using Google Firestore as the storage mechanism. Cloud SQL must also be enabled. The same service account used for Cloud SQL must have read/write permissions to Firestore.
When this support is enabled, Gafaelfawr ignores any UID and GID information from GitHub or LDAP, and instead assigns UIDs and GIDs to users and groups by name the first time that a given username or group name is seen. UIDs and GIDs are never reused. They are assigned from the ranges documented in DMTN-225.
To enable use of Firestore for UID/GID assignment, add the following configuration:
config:
firestore:
project: "<google-project-id>"
Set <google-project-id>
to the name of the Google project for the Firestore data store.
(Best practice is to make a dedicated project solely for Firestore, since there can only be one Firestore instance per Google project.)
Scopes¶
Gafaelfawr takes group information from the upstream authentication provider or from LDAP and maps it to scopes. Scopes are then used to restrict access to protected services (see Configuring ingress with GafaelfawrIngress).
For a list of scopes used by the Rubin Science Platform, which may also be useful as an example for other deployments, see DMTN-235.
The list of scopes is configured via config.knownScopes
, which is an object mapping scope names to human-readable descriptions.
Every scope that you want to use must be listed in config.knownScopes
.
The default includes:
config:
knownScopes:
"admin:token": "Can create and modify tokens for any user"
"user:token": "Can create and modify user tokens"
which are used internally by Gafaelfawr, plus the scopes that are used by the Rubin Science Platform.
You can add additional scopes by adding more key/value pairs to the config.knownScopes
object in values-<environment>.yaml
.
Once the scopes are configured, you will need to set up a mapping from groups to scope names using the groupMapping
setting.
This is a dictionary of scope names to lists of groups that provide that scope.
The group can be given in one of two ways: either a simple string giving the name of the group (used for CILogon and OpenID Connect authentication providers), or the GitHub organization and team specified with the following syntax:
github:
organization: "lsst-sqre"
team: "friends"
Both organization
and team
must be given.
It is not possible to do access control based only on organizational membership.
The value of organization
must be the login
attribute of the organization, and the value of team
must be the slug
attribute of the team.
(Generally the latter is the name of the team converted to lowercase with spaces and other special characters replaced with -
.)
A complete setting for GitHub might look something like this:
config:
groupMapping:
"admin:token":
- github:
organization: "lsst-sqre"
team: "square"
"exec:notebook":
- github:
organization: "lsst-sqre"
team: "square"
- github:
organization: "lsst-sqre"
team: "friends"
"exec:portal":
- github:
organization: "lsst-sqre"
team: "square"
- github:
organization: "lsst-sqre"
team: "friends"
"read:tap":
- github:
organization: "lsst-sqre"
team: "square"
- github:
organization: "lsst-sqre"
team: "friends"
Be aware that Gafaelfawr will convert these organization and team pairs to group names internally, and applications will see only the converted group names. See Groups from GitHub for more information.
When CILogon or generic OpenID Connect are used as the providers, the group information comes from LDAP.
That group membership will then be used to determine scopes via the groupMapping
configuration.
For those authentication providers, the group names are simple strings.
For example, suppose the Gafaelfawr configuration reads:
config:
groupMapping:
"exec:admin": ["foo", "bar"]
A user who is a member of the bar
and other
groups will have the exec:admin
scope added to their token when it is issued.
Regardless of the config.groupMapping
configuration, the user:token
scope will be automatically added to the session token of any user authenticating via OpenID Connect or GitHub.
The admin:token
scope will be automatically added to any user marked as an admin in Gafaelfawr.
Quotas¶
Gafaelfawr supports calculating user quotas based on group membership and providing quota information through its API. These quotas are not enforced by Gafaelfawr.
To configure quotas, set a base quota for all users, and then optionally add additional quota for members of specific groups. Here is an example:
config:
quota:
default:
api:
datalinker: 1000
notebook:
cpu: 2.0
memory: 4.0
groups:
g_developers:
notebook:
cpu: 8.0
memory: 4.0
API quotas are in requests per 15 minutes. Notebook quotas are in CPU equivalents and GiB of memory.
The above example sets an API quota for the datalinker
service of 1000 requests per 15 minutes, and a default quota for user notebooks of 2.0 CPU equivalents and 4.0GiB of memory.
Users who are members of the g_developers
group get an additional 4.0GiB of memory for their notebooks.
The keys for API quotas are names of services.
This is the same name the service should use in the config.service
key of a GafaelfawrIngress
resource (see Configuring ingress with GafaelfawrIngress).
If a service name has no corresponding quota setting, access to that service will be unrestricted.
All group stanzas matching the group membership of a user are added to the default
quota, and the results are reported as the quota for that user by the user information API.
Members of specific groups cannot be granted unrestricted access to an API service since a missing key for a service instead means that this group contributes no additional quota for that service. Instead, grant effectively unlimited access by granting a very large quota number.
Redis storage¶
For any Gafaelfawr deployment other than a test instance, you will want to configure persistent storage for Redis. Otherwise, each upgrade of Gafaelfawr’s Redis component will invalidate all of the tokens.
By default, the Gafaelfawr Helm chart uses auto-provisioning to create a PersistentVolumeClaim
with the default storage class, requesting 1GiB of storage with the ReadWriteOnce
access mode.
If this is suitable for your deployment, you can leave the configuration as is.
Otherwise, you can adjust the size (you probably won’t need to make it larger; Gafaelfawr’s storage needs are modest), storage class, or access mode by setting redis.persistence.size
, redis.persistence.storageClass
, and redis.persistence.accessMode
.
If you instead want to manage the persistent volume directly rather than using auto-provisioning, use a configuration such as:
redis:
persistence:
volumeClaimName: "gafaelfawr-pvc"
to point to an existing PersistentVolumeClaim
.
You can then create that PersistentVolumeClaim
and its associated PersistentVolume
via any mechanism you choose, and the volume pointed to by that claim will be mounted as the Redis volume.
Gafaelfawr uses the standard Redis Docker image, so the volume must be writable by UID 999, GID 999 (which the StatefulSet
will attempt to ensure using the Kubernetes fsGroup
setting).
Finally, if you do have a test installation where you don’t mind invalidating all tokens whenever Redis is restarted, you can use:
redis:
persistence:
enabled: false
This will use an ephemeral emptyDir
volume for Redis storage.
Cloud SQL¶
If the PostgreSQL database that Gafaelfawr should use is a Google Cloud SQL database, Gafaelfawr supports using the Cloud SQL Auth Proxy via Workload Identity.
First, follow the normal setup instructions for Cloud SQL Auth Proxy using Workload Identity.
You do not need to create the Kubernetes service account; two service accounts will be created by the Gafaelfawr Helm chart.
The names of those service accounts are gafaelfawr
and gafaelfawr-operator
, both in Gafaelfawr’s Kubernetes namespace (by default, gafaelfawr
).
Then, once you have the name of the Google service account for the Cloud SQL Auth Proxy (created in the above instructions), enable the Cloud SQL Auth Proxy sidecar in the Gafaelfawr Helm chart. An example configuration:
cloudsql:
enabled: true
instanceConnectionName: "dev-7696:us-central1:dev-e9e11de2"
serviceAccount: "gafaelfawr@dev-7696.iam.gserviceaccount.com"
Replace instanceConnectionName
and serviceAccount
with the values for your environment.
You will still need to set config.databaseUrl
and the database-password
key in the Vault secret with appropriate values, but use localhost
for the hostname in config.databaseUrl
.
As mentioned in the Google documentation, the Cloud SQL Auth Proxy does not support IAM authentication to the database, only password authentication, and IAM authentication is not recommended for connection pools for long-lived processes. Gafaelfawr therefore doesn’t support IAM authentication to the database.
Logging and proxies¶
The default logging level of Gafaelfawr is info
, which will log a message for every action it takes.
To change this, set config.logLevel
:
config:
logLevel: "warning"
Valid values are debug
(to increase the logging), info
(the default), warning
, or error
.
These values can be specified in any case.
Gafaelfawr is deployed behind a proxy server.
In order to accurately log the IP address of the client, instead of the IP address of the proxy server, it must know what IP ranges correspond to possible proxy servers rather than clients.
Set this with config.proxies
:
config:
proxies:
- "192.0.2.0/24"
If not set, defaults to the RFC 1918 private address spaces. See Client IP addresses for more details.
Alerts, metrics, and tracing¶
Metrics¶
Gafaelfawr can export events and metrics to Sasquatch, the metrics system for Rubin Observatory.
Metrics reporting is disabled by default.
To enable it, set config.metrics.enabled
to true:
config:
metrics:
enabled: true
Gafaelfawr will then use the Kafka user gafaelfawr
to authenticate to Kafka and push various events.
For a list of all of the events Gafaelfawr exports, see Metrics.
There are some additional configuration settings, which normally will not need to be changed:
config.metrics.application
Name of the application under which to log metrics. Default:
gafaelfawr
config.metrics.events.topicPrefix
The prefix for events topics. Generally the only reason to change this is if you’re experimenting with new events in a development environment. Default:
lsst.square.metrics.events
config.metrics.schemaManager.registryUrl
URL to the Confluent-compatible Kafka schema registry, used to register the schemas for events during startup. Default: Use the Sasquatch schema registry in the local cluster.
config.metrics.schemaManager.suffix
Suffix to add to all registered subjects. This avoids conflicts with existing registered schemas and may be useful when experimenting with possible event schema changes that are not backwards-compatible. Default: no suffix
Slack alerts¶
Gafaelfawr can optionally report uncaught exceptions to Slack.
To enable this, set config.slackAlerts
:
config:
slackAlerts: true
You will also have to set the slack-webhook
key in the Gafaelfawr secret to the URL of the incoming webhook to use to post these alerts.
Sentry¶
Gafaelfawr can optionally report uncaught exceptions, traces, and performance information to Sentry.
To enable this, set config.enableSentry
:
config:
enableSentry: true
You will also have to set the sentry-dsn
key in the Gafaelfawr secret to the URL to which the telemetry will be sent.
Maintenance¶
Timing¶
Gafaelfawr uses two Kubernetes CronJob
resources to perform periodic maintenance and consistency checks on its data stores.
The maintenance job records history and deletes active entries for expired tokens, and truncates history tables as needed.
By default, it is run hourly at five minutes past the hour.
Its schedule can be set with config.maintenance.maintenanceSchedule
(a cron schedule expression).
The audit job looks for data inconsistencies and reports them to Slack.
Slack alerts must be configured.
By default, it runs once a day at 03:00 in the time zone of the Kubernetes cluster.
Its schedule can be set with config.maintenance.auditSchedule
(a cron schedule expression).
Time limits¶
By default, Gafaelfawr allows its maintenance and audit jobs five minutes to run, and cleans up any completed jobs older than one day. Kubernetes also deletes completed and failed jobs as necessary to maintain a cap on the number retained, which normally overrides the cleanup timing for the maintenance job that runs hourly.
To change the time limit for maintenance jobs (if, for instance, you have a huge user database or your database is very slow), set config.maintenance.deadlineSeconds
to the length of time jobs are allowed to run for.
To change the retention time for completed jobs, set config.maintenance.cleanupSeconds
to the maximum lifetime of a completed job.
OpenID Connect server¶
Gafaelfawr can act as an OpenID Connect identity provider for relying parties inside the Kubernetes cluster.
To enable this, set config.oidcServer.enabled
to true.
If this is set, oidc-server-secrets
and signing-key
must be set in the Gafaelfawr Vault secret.
Gafaelfawr can provide an OpenID Connect ID token claim listing the data releases to which the user has access.
To do so, it must be configured with a mapping of group names to data releases to which membership in that group grants access.
This is done via the config.oidcServer.dataRightsMapping
setting.
For example:
config:
oidcServer:
dataRightsMapping:
g_users:
- "dp0.1"
- "dp0.2"
- "dp0.3"
g_preview:
- "dp0.1"
This configuration indicates members of the g_preview
group have access to the dp0.1
release and members of the g_users
group have access to all of dp0.1
, dp0.2
, and dp0.3
.
Users have access to the union of data releases across all of their group memberships.
See Configuring OpenID Connect for more information. See DMTN-253 for how this OpenID Connect support can be used by International Data Access Centers.
The following additional options customize the behavior of the OpenID Connect server:
config.oidcServer.issuer
The issuer identity (the
iss
claim in JWTs). Default: The base URL of the Phalanx environment.config.oidcServer.keyId
The key ID of the signing key (the
kid
claim in JWTs). Default:gafaelfawr