4 changes: 0 additions & 4 deletions .editorconfig
@@ -16,10 +16,6 @@ indent_size = 2
indent_style = space
indent_size = 2

[config.ru]
indent_style = tab
indent_size = 4

[*.rb]
indent_style = space
indent_size = 2
1 change: 0 additions & 1 deletion README.md
@@ -16,7 +16,6 @@ Concepts:
Tasks:

* [Deployment guide](docs/deploy.md)
* [Editing diagrams](docs/editing-diagrams.md)

Organizational (for team members):

53 changes: 45 additions & 8 deletions apiserver/README.md
@@ -1,17 +1,54 @@
# API server

The API server is a service that allows performing limited management operations on the infrastructure. It mainly exists to securely allow the Server Edition's CI to tell Caddy about the fact that new packages have been deployed.
The API server is a small service that exposes a couple of privileged management
operations to the project's GitHub Actions workflows — primarily so that CI can
trigger restarts of Caddy and self-upgrades of the API server itself, without
needing SSH access to the backend host.

Presently, there is only one endpoint: `POST /admin/restart_web_server`. This endpoint initiates restarting of Caddy, but does not wait for it to finish.
It runs as a systemd-managed Puma process (`apiserver.service`) on the Hetzner
backend host, listening on a Unix socket at `/run/apiserver/server.sock`. Caddy
reverse-proxies the `/admin/*` paths from `apt.fullstaqruby.org` and
`yum.fullstaqruby.org` to that socket — there is no dedicated `apiserver.*`
hostname.

## Calling the production instance
## Endpoints

The production instance is deployed at https://apiserver-f7awo4fcoa-uk.a.run.app/. To call it, you need to include your Google Cloud identity token in the Authorization header.
- `GET /` — health check, returns `ok`.
- `POST /admin/upgrade_apiserver` — kicks off `apiserver-deployer` (which fetches
the latest API server release from GitHub and activates it), then restarts
`apiserver` itself. Callable only from `fullstaq-ruby/infra`'s `deploy`
GitHub Actions environment.
- `POST /admin/restart_web_server` — restarts Caddy. Callable only from
`fullstaq-ruby/server-edition`'s `deploy` GitHub Actions environment.

```bash
curl -v -H "Authorization: Bearer $(gcloud auth print-identity-token)" https://apiserver-f7awo4fcoa-uk.a.run.app/
```
## Authentication

The API server authenticates callers using **GitHub Actions OIDC**. Every
request must carry an `Authorization: Bearer <token>` header where the token
is an ID token minted by GitHub's OIDC provider with audience claim
`backend.fullstaqruby.org`. The server verifies the JWT signature against
GitHub's JWKS and rejects the request unless the token's `repository`, `sub`,
`runner_environment`, and `environment` claims match the calling repo's
expected `deploy` environment for that endpoint.

Because the audience and claim shape are tied to GitHub-hosted runners, the
endpoints are not callable directly by a human or from a local machine.
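
The per-endpoint claim check described above could look roughly like the following. This is a minimal sketch, not the API server's actual code: `ENDPOINT_RULES` and `claims_allowed?` are hypothetical names, the `github-hosted` value and the `repo:<owner>/<repo>:environment:<name>` subject format are assumptions based on GitHub's documented OIDC claims, and the sketch expects claims that were already decoded from a token whose signature was verified against GitHub's JWKS.

```ruby
# Hypothetical sketch; names and expected values are illustrative assumptions.
EXPECTED_AUDIENCE = "backend.fullstaqruby.org"

# Which repository and GitHub Actions environment may call which endpoint.
ENDPOINT_RULES = {
  "/admin/upgrade_apiserver"  => { repository: "fullstaq-ruby/infra", environment: "deploy" },
  "/admin/restart_web_server" => { repository: "fullstaq-ruby/server-edition", environment: "deploy" },
}.freeze

# `claims` is the decoded JWT payload (a Hash with String keys).
# Signature verification is assumed to have happened before this call.
def claims_allowed?(path, claims)
  rule = ENDPOINT_RULES[path]
  return false if rule.nil?

  # The `aud` claim may be a single string or a list of strings.
  Array(claims["aud"]).include?(EXPECTED_AUDIENCE) &&
    claims["repository"] == rule[:repository] &&
    claims["environment"] == rule[:environment] &&
    claims["runner_environment"] == "github-hosted" &&
    claims["sub"] == "repo:#{rule[:repository]}:environment:#{rule[:environment]}"
end
```

Keeping the rules in one table makes it easy to see that a token minted for one repository's `deploy` environment cannot be replayed against the other endpoint.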

## Continuous deployment

New API server code changes, when pushed to master, are automatically deployed by the Infrastructure project's CI.
`.github/workflows/apiserver.yml` builds and deploys the API server. Pushes
that touch `apiserver/**` (or the workflow file itself) trigger a build, and
pushes to `main` additionally trigger the `deploy` job — which tags the commit,
publishes a GitHub release with the build artifact, and calls
`POST https://apt.fullstaqruby.org/admin/upgrade_apiserver` with a freshly
minted OIDC token.

On the host, `apiserver-deployer` (a oneshot systemd unit running
`/usr/local/bin/apiserver-deployer`) handles the actual install: it fetches
the latest release metadata from the GitHub API, downloads the asset matching
the host's distribution and architecture, extracts it into
`/opt/apiserver/versions/<tag>-<dist>-<version>-<arch>`, installs any runtime
dependencies declared in `dpkg-dependencies.txt`, prunes all but the last five
versions, and atomically swaps the `/opt/apiserver/versions/latest` symlink.
The API server's working directory points at that symlink, so the subsequent
`systemctl restart apiserver` brings the new version online.
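
The last two steps above (pruning old versions and swapping the `latest` symlink) can be sketched as follows. This is an illustration under stated assumptions, not the contents of `/usr/local/bin/apiserver-deployer`: the helper names are hypothetical, and ordering versions by directory modification time is an assumption about what "the last five" means.

```ruby
require "fileutils"

# Hypothetical helper: removes all but the `keep` most recently modified
# version directories and returns the paths it deleted.
def prune_versions(versions_dir, keep: 5)
  dirs = Dir.children(versions_dir)
            .map { |name| File.join(versions_dir, name) }
            .select { |path| File.directory?(path) && !File.symlink?(path) } # skip the `latest` symlink
            .sort_by { |path| File.mtime(path) }                             # oldest first
  stale = dirs.first([dirs.length - keep, 0].max)
  stale.each { |path| FileUtils.rm_rf(path) }
  stale
end

# Hypothetical helper: repoints `link_path` at `target` without a window
# in which the link is missing. The new link is created under a temporary
# name in the same directory, then renamed over the old one.
def swap_symlink(link_path, target)
  tmp = "#{link_path}.tmp-#{Process.pid}"
  FileUtils.rm_f(tmp)
  File.symlink(target, tmp)
  File.rename(tmp, link_path)
end
```

Renaming a freshly created link over the old one, rather than deleting and recreating it, means a process that resolves `latest` at any moment sees either the old version or the new one.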
18 changes: 5 additions & 13 deletions docs/deploy.md
@@ -56,20 +56,12 @@ This guide explains how to deploy infrastructure updates. This guide is not for
cd ..
~~~

4. Get the credentials for the Kubernetes cluster:
4. Apply Ansible to the backend VM:

~~~bash
gcloud container clusters get-credentials fullstaq-ruby-autopilot --configuration fullstaq-ruby --region us-east4
cd ansible
ansible-playbook -i hosts.ini -v main.yml
cd ..
~~~

5. Set the default namespace:

~~~bash
kubectl config set-context --current --namespace=fullstaq-ruby
~~~

6. Apply the Kustomization:

~~~bash
kubectl apply --context=gke_fullstaq-ruby_us-east4_fullstaq-ruby-autopilot -k ../kubernetes
~~~
> The API server itself is not deployed by this playbook. Code changes under `apiserver/` are released by the `.github/workflows/apiserver.yml` workflow, which packages a tarball, attaches it to a GitHub Release, and triggers `POST /admin/upgrade_apiserver` on the live host.
Review comment (Member):

Nowadays it's the entire .github/workflows/ folder (multiple workflows that together do the build and deployment)

12 changes: 0 additions & 12 deletions docs/editing-diagrams.md

This file was deleted.

12 changes: 6 additions & 6 deletions docs/infrastructure-as-code.md
@@ -3,19 +3,19 @@
We define as much infrastructure as possible in the form of code, using:

* [Terraform](https://terraform.io)
* Kubernetes YAML, managed with [Kustomize](https://kustomize.io/)
* [Ansible](https://www.ansible.com/)
* GitHub Actions

The infrastructure-as-code is stored in the following directories:

* `terraform/` — Infrastructure administered by [Infra Maintainers](roles.md), except for resources inside Kubernetes. Most of the infrastructure is defined here.
* `terraform/` — Infrastructure administered by [Infra Maintainers](roles.md). Most of the cloud-side infrastructure is defined here.

* `terraform-hisec/` — Infrastructure administered by [Infra Owners](roles.md). This covers for example resources in the `fullstaq-ruby-hisec` Google Cloud project.
* `terraform-hisec/` — Infrastructure administered by [Infra Owners](roles.md). This covers for example sensitive resources such as the GPG signing key in Azure Key Vault, and the high-security Terraform state backend.

Because we don't expect the infrastructure in this directory to change very often, we've chosen — for security reasons — not to run Terraform in a CI/CD pipeline. This way we don't have to worry about the security of the CI/CD pipeline's service account. Instead, an [Infra Owner](roles.md) runs Terraform manually, using that person's personal Google Cloud credentials.
Because we don't expect the infrastructure in this directory to change very often, we've chosen — for security reasons — not to run Terraform in a CI/CD pipeline. This way we don't have to worry about the security of any CI/CD pipeline credentials. Instead, an [Infra Owner](roles.md) runs Terraform manually, using their personal cloud credentials.

* `kubernetes/` — Kubernetes resources administered by [Infra Maintainers](roles.md).
* `ansible/` — Configuration of the backend VM (Caddy, the API server, Prometheus, and OS hardening). Administered by [Infra Maintainers](roles.md) and applied manually; see [Deployment guide](deploy.md).

* `.github/workflows/apiserver.yml` — Deploys the API server.
* `.github/workflows/apiserver.yml` — Builds and deploys the API server.
Review comment (Member):

Nowadays it's .github/workflows/ (multiple workflows that together do the build and deployment).


Note that not all infrastructure can, or (for security reasons) should, be managed via code. Learn more at [Infrastructure bootstrapping](infrastructure-bootstrapping.md).
2 changes: 1 addition & 1 deletion docs/infrastructure-bootstrapping.md
@@ -1,6 +1,6 @@
# Infrastructure bootstrapping

We try to codify infrastructure as much as possible using Terraform and Kubernetes YAML. However:
We try to codify infrastructure as much as possible using Terraform and Ansible. However:
Review comment (Member):

There should be an instruction step in this document for deploying the API server.


- Not everything _can_ be automated. For example, we need to set up Azure Blob Storage for storing Terraform state before we can use Terraform.
- Not everything _should_ be automated. For example, the `fullstaq-ruby-hisec` project contains such sensitive data, that giving access to CI/CD systems would pose a security risk.