Introduction
This is the central place for all information about the website and the .ComCom.
Do you want to know something about using the website? Click here.
Do you want to make a suggestion for the website or ask the .ComCom something (for example, to have us change a piece of content)? Click here.
Do you want to know how the .ComCom works, or what you can learn at the .ComCom? Click here.
The rest of this document is dedicated to explaining technical details, the choices we made, how to set up a development environment, and how to deploy to production (and what that even means). We use English for the sake of future compatibility and so that it is easy to share anything written here online (and because many terms don't have Dutch equivalents).
Changelog
Here we keep track of all changes. This specifically concerns code, not content. The frontend is updated more often than shown here (and has no specific versions), but the important changes are summarized below.
- Frontend (0232c32) - 2024-04-05
- 2.2.1 - 2024-01-10
- 2.2.0 - 2024-01-10
- 2.1.1 - 2023-12-05
- 2.1.0 - 2023-12-05
- 2.0.1 - 2023-10-25
- 2.0.0 - 2023-10-17
- 1.1.0 - 2023-03-18
- 1.0.0 - 2023-01-11
- Pre-1.0.0
Frontend (0232c32) - 2024-04-05
Changed (frontend)
- (February/March) Jesper has updated a lot of content, including most of the commission photos
- (March) Theme has been updated from winter to spring
- (March) Records have been updated
Added (frontend)
- A new "Pas aan" feature has been added for admins to manage the klassementen. This makes use of the 2.2 backend update.
- It allows you to start new classifications
- You can select classifications and modify their start, end and freeze date
- You can make the database recompute the points based on whether or not you want to show the points added after freezing
- Members can see when an event was last added and when it was frozen
- (February) An Oud Leden Dodeka (OLD) page has been added
2.2.1 - 2024-01-10
Tip contributed code to this release.
Released into production on 2024-01-11.
Deploy
- Fix production image version
2.2.0 - 2024-01-10
Note: this version was not released into production.
This is the first full tidploy release on the backend, with the mailserver moved to a dedicated provider. This should make all required features for the classifications fully functional.
Senne and Tip contributed code to this release. We are now at 591 commits for the backend (including the 150+ commits that came from the deployment repository).
Added (backend)
- <server>/admin/class/get_meta/{recent_number}/ (this returns the most recent classifications of each type, recent_number/2 for each type)
- <server>/admin/class/new/ (creates a new points and training classification)
- <server>/admin/class/modify/ (modifies a classification)
- <server>/admin/class/remove/{class_id} (removes the classification with id class_id)
- <server>/member/get_with_info/{rank_type}/ (will replace <server>/members/class/get/{rank_type}/, but is added under a different name to avoid a breaking change; this also returns last_updated and whether the classification is frozen)
- The last_updated column is now updated when new events are added.
- (Senne) <server>/admin/update/training has been added (not yet in use by the frontend) as the start of the training registration update.
- New update_by_unique store method for updating an entire row based on a simple where condition.
Internal (backend)
- The backend and deployment repository have been merged together, making deployment and versioning a lot easier. We now use a set of Nu scripts instead of shell scripts, and deployment now uses tidploy. This was done in a series of commits from 4602abd to around 414bc23.
- Logging has been much improved throughout the backend, so errors can now be properly seen. Error handling has also been improved (especially for the store module). Some common startup errors now have better messages that indicate what's wrong. Documentation has been improved and all database queries now properly catch some specific errors. See #31.
2.1.1 - 2023-12-05
Tip contributed code to this release.
Released into production on 2023-12-05.
Fixed (backend)
- Remove debug eduinstitution value during registration
2.1.0 - 2023-12-05
Note: this version was not released into production.
Matthijs and Tip contributed code to this release.
Stats
We are now at 359 commits on the backend and 922 commits on the frontend.
Added (frontend)
- An explanation on the classification has been added to the classification page.
- There is now an NSK Meerkamp role.
Fixed (frontend)
- Last names are no longer all caps on the classification page and are shown in full.
Added (authpage)
- Add Dodeka logo to all pages, using the new Title component.
Changed (authpage)
- Use a flex layout to improve alignment when pressing "forgot password", so the layout no longer jumps slightly
- Update node version to v20
- Update dependencies
Fixed (authpage)
- Educational institution is now recorded if not changed from default (TU Delft)
Added (backend)
- Admin: Synchronization of the total points per user and events, as well as a more consistent naming scheme for the endpoints. All old endpoints are retained for backwards compatibility. Furthermore, admins can now request additional information about events on a user, event or class basis (see the PR).
- <server>/admin/class/sync/ (Force synchronization of the total points per user, without having to add a new event)
- <server>/admin/class/update/ (Same as the previous <server>/admin/ranking/update, which still exists for backwards compatibility)
- <server>/admin/class/get/{rank_type}/ (Same as the previous <server>/admin/classification/{rank_type}/, which still exists for backwards compatibility)
- <server>/admin/class/events/user/{user_id}/ (Get all events for a specific user_id, with the class_id and rank_type as query parameters)
- <server>/admin/class/events/all/ (Get all events, with the specific class_id and rank_type as query parameters)
- <server>/admin/class/users/event/{event_id}/ (Get all users for a specific event_id)
- Member: Only renames, as described above.
- <server>/members/class/get/{rank_type}/ (Same as the previous <server>/members/classification/{rank_type}/, which still exists for backwards compatibility)
- <server>/members/profile/ (Same as the previous <server>/res/profile, which still exists for backwards compatibility)
Changed (backend)
- Types: The entire backend now passes mypy's type checker (see the PR)!
- Better context/dependency injection: The previous system was not perfect and it was still not easy to write tests. Lots of improvements have been made, utilizing FastAPI Depends and making it possible to easily wrap a single function call to make the caller testable. See #64, #65, #70 and #71.
- Better logging: Logging had been lackluster while waiting for a better solution. This has now arrived with the adoption of loguru. Logging is now much more nicely formatted and it will be easily possible in the future to collect and show the logs in a central place, although that is not yet implemented. Some of the startup code has also been refactored as part of the logging effort.
- Check for role on router basis: For certain routers, we now check whether they are requested by admins or members for all routes inside the router, making it harder to forget to add a check. The header checking logic has also been refactored and some tests have been added. This is much better than the manual if check we did before. This also includes some minor refactoring and fixes for access token verification.
- There are now different router tags, which makes it easier to find all the different API endpoints in the OpenAPI docs view.
Fixed (backend)
- An error is no longer thrown on the backend when a password reset is requested for a user that does not exist.
Internal (backend)
- Live query tests: in the GitHub Actions CI we now actually run some tests against a live database using Actions service containers. This means we can be much more sure that we did not completely break database functionality after passing the tests (see the PR).
- Add request_id to logger using loguru's contextualize
- Added logging to all major user flows (signup, onboard, change email/password), also allowing the display of reset URLs etc., so email doesn't have to be turned on during local development
2.0.1 - 2023-10-25
Tip contributed code to this release.
Released into production on 2023-10-25.
Fixed (backend)
- Fix update email: If you requested an email change twice, but only confirmed after both had been sent, it is no longer possible to change it twice. After changing it using either one, the other is invalidated.
- Changed package structure so it is possible to extract the schema package and load it on the production server to run database schema migrations.
2.0.0 - 2023-10-17
Note: this version was not released into production.
Leander, Matthijs and Tip contributed code to this release.
Added (backend)
- Admin: Roles, using OAuth scope mechanism, as well as classifications stored in the database, computed based on each event.
- <server>/admin/scopes/all/ (Get scopes for all users)
- <server>/admin/scopes/add/ (Add a scope for a user)
- <server>/admin/scopes/remove/ (Remove a scope for a user)
- <server>/admin/users/ids/ (Get all user ids)
- <server>/admin/users/names/ (Get all user names, to match for rankings)
- <server>/admin/ranking/update (Add an event for classifications)
- <server>/admin/classification/{rank_type}/ (See current points for a specific classification)
- Member:
- <server>/members/classification/{rank_type}/ (See current points for a specific classification, changes hidden after a certain point)
Added (frontend)
- Member -> Classification page
- Admin -> Classification page and Add new event
- Roles can be changed in user overview
Changed (backend)
- Major refactor of backend code, which separates auth code from app-specific code
- Updated some major dependencies, including Pydantic to v2
- Database schema update:
- classifications table added to store a classification, which lasts for half a year and can be either "points" or "training".
- class_events table added, which stores all events that have been held (borrel, NSK, training, ...). Possibly related to a specific classification.
- class_event_points table added, which stores how many points a specific user has received for a specific event. In general, users will have the same amount of points per event, but this flexibility allows us to change that later.
- class_points table added, which stores the aggregated total points of a user for a specific classification. When an event is added, this table should be updated using the correct update query.
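As a rough illustration of how these four tables relate, here is a SQLAlchemy Core sketch. It is hypothetical: the actual definitions live in the backend's schema package, and the real column names, types and constraints differ.

```python
# Illustrative sketch of the classification schema; not the real table definitions.
from sqlalchemy import MetaData, Table, Column, Integer, String, Date, ForeignKey

metadata = MetaData()

# A classification lasts about half a year and is either "points" or "training".
classifications = Table(
    "classifications", metadata,
    Column("id", Integer, primary_key=True),
    Column("type", String),  # "points" or "training"
    Column("start_date", Date),
    Column("end_date", Date),
)

# Every event that has been held (borrel, NSK, training, ...), possibly tied to a classification.
class_events = Table(
    "class_events", metadata,
    Column("id", Integer, primary_key=True),
    Column("classification_id", ForeignKey("classifications.id"), nullable=True),
    Column("description", String),
    Column("event_date", Date),
)

# Points a specific user received for a specific event.
class_event_points = Table(
    "class_event_points", metadata,
    Column("event_id", ForeignKey("class_events.id"), primary_key=True),
    Column("user_id", Integer, primary_key=True),
    Column("points", Integer),
)

# Aggregated total per user per classification, updated whenever an event is added.
class_points = Table(
    "class_points", metadata,
    Column("classification_id", ForeignKey("classifications.id"), primary_key=True),
    Column("user_id", Integer, primary_key=True),
    Column("total_points", Integer),
)
```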
Changed (postgres)
- Updated from Postgres 14 to 16
Changed (redis)
- Updated from Redis 6.2 to 7.2
1.1.0 - 2023-03-18
Released into production.
Changed (backend)
- Update dependencies, including updating Python to 3.11 and SQLAlchemy 2
Fixed (server)
- Docker container no longer accumulates core dumps, crashing the server after 1-2 weeks
1.0.0 - 2023-01-11
Initial release of the FastAPI backend server, PostgreSQL database and Redis key-value store. Released into production on 2023-01-11.
Added (backend)
- Login: Mostly OAuth 2.1-compliant authentication/authorization system, using the authorization code flow. Authentication is done using OPAQUE:
- <server>/oauth/authorize/ (Authorization Endpoint, initialize)
- <server>/oauth/callback/ (Authorization Endpoint, after authentication)
- <server>/oauth/token/ (Token Endpoint)
- <server>/login/start/ (Start OPAQUE password authentication)
- <server>/login/finish/ (Finish OPAQUE authentication)
- Registration: Registration/onboarding flow, which requires confirmation of AV'40 signup.
- <server>/onboard/signup/ (Initiate signup, board will confirm)
- <server>/onboard/email/ (Confirm email)
- <server>/onboard/confirm/ (Board confirms signup)
- <server>/onboard/register/ (Start OPAQUE registration)
- <server>/onboard/finish/ (Finish OPAQUE registration and send registration info)
- Update: Some information needs to be updated or changed.
- <server>/update/password/reset/ (Initiate password reset)
- <server>/update/password/start/ (Start OPAQUE set new password)
- <server>/update/password/finish/ (Finish OPAQUE set new password)
- <server>/update/email/send/ (Initiate email change)
- <server>/update/email/check/ (Finish email change after authentication)
- <server>/update/delete/url/ (Delete account start, create url)
- <server>/update/delete/check/ (Confirm deletion after authentication)
- Admin: Get information only with admin account.
- <server>/admin/users/ (Get all user data)
- Members: Get information for members.
- <server>/members/birthdays/ (Get member birthdays)
Added (authpage)
- Login page
- Registration page
- Other confirmation pages necessary for backend functionality
Added (frontend)
- Profile page
- Leden page -> Verjaardagen
- Admin page -> Ledenoverzicht
- Use React Query for getting data from backend
- Use Context from React for authentication state
- Redirect pages necessary for OAuth
- Confirmation pages for info update
Added (server)
- Docker container that contains the FastAPI backend server, as well as static files for serving the authpage.
Added (postgres)
- Docker container with PostgreSQL server
Added (redis)
- Docker container with Redis server
Pre-1.0.0
The frontend went live in June 2021 and, before the release of the backend, was regularly updated on a rolling release schedule. The frontend is not versioned.
How do you use the website?
Are you a board member and do you want to know something about the admin functionality? Click here.
Are you a member of D.S.A.V. Dodeka and do you want to know more about what you can do on the website? Then have a look at the following pages:
Are you a member of a commission and do you want to know how you can use the website? Then you can check the following topics. If what you're looking for isn't listed, contact us. A lot is possible!
Creating an account
Creating an account starts at the Word lid! page on the website.
Press the "Schrijf je in!" button. This opens a screen where you can pass some basic information on to us. You can only sign up if you agree to the privacy policy, because we have to store information when you sign up with Dodeka.
After filling in your details, press "Schrijf je in via AV'40". AV'40 is our parent association. When you become a member of Dodeka, you also become a member of AV'40. The official member administration also runs through AV'40, which is why you are redirected to their registration page.
It is important that you tick the option "Ik wil lid worden van DSAV Dodeka (de studentenatletiekvereniging van AV'40 Delft)" at the bottom.
After signing up, you have probably already received an e-mail from comcom@dsavdodeka.nl asking you to confirm your e-mail address.
Click that button to let our system know that your e-mail address is correct. After this you will have to wait a bit, because the board has to link your application on our website to the member administration of AV'40. Once they have done this, you will receive a message asking you to officially register on our website. It will be sent to the e-mail address you provided on our website (so not the one at AV'40).
Registering
In the e-mail you will find a link to officially register on our website. This is required if you want to train.
You can give permission to show your birthday and age to other members. Who knows, you might receive a whole lot of congratulations!
After pressing "Registreer" you will be redirected to the website, where you can log in by clicking the icon in the top right.
Nice pages
Klassementen
Records
Commission e-mail
We recently switched to a new provider, https://junda.nl.
Change your password
Via webmail.dsavdodeka.nl you can change the e-mail password. We strongly recommend changing the password given to you by the .ComCom! You can do this under "Edit Your Settings -> Password & Security". Once you have changed it, you have to go through the instructions below again (so you have to set it up again both under 'Send mail as' and 'Check mail from other accounts').
Step-by-step plan
Does your commission already use an @dsavdodeka.nl address?
Do you use it via Gmail?
Request the new password by sending the .ComCom a message. Set it up again, see the instructions here.
Do you use it via mail.dsavdodeka.nl?
Request the new password by sending the .ComCom a message. Then go to webmail.dsavdodeka.nl, which is the new place where you can always reach your mail! We recommend using it via Gmail, which generally works a bit more smoothly. To do so, create a Gmail account (e.g. dies.dodeka@gmail.com) and follow the instructions below.
Are you currently using a Gmail account?
Then it is time to switch to an @dsavdodeka.nl address! You can keep using the same Gmail inbox.
Open Gmail on your PC or laptop.
- Click the gear icon (Settings) in the top right, next to your profile picture.
- Click "See all settings".
- Choose the "Accounts and Import" tab. This may have changed; in any case you want to find two things: "Send mail as:" and "Check mail from other accounts" (or something similar).
First we change "Send mail as":
Send mail as:
- Click "Add another email address"
- Choose a name. Fill in the correct @dsavdodeka.nl address, for example dies@dsavdodeka.nl. This must be the address you received from the .ComCom. "Treat as an alias" can stay checked.
- For the SMTP server, fill in mail.dsavdodeka.nl with port 465.
- Username: again your e-mail address, so e.g. dies@dsavdodeka.nl.
- Password: fill in the password you received from the .ComCom (this may be different from your Gmail password)
- Save changes. (Don't change anything else)
- After this there may be two options under "Send mail as". Click "make default" next to your new @dsavdodeka.nl address (e.g. dies@dsavdodeka.nl).
- Change (if it wasn't already set that way) "When replying to a message" to "Always reply from my default address (currently dies@dsavdodeka.nl)"
Now we make sure you also receive everything in your Gmail.
Check mail from other accounts:
- Click "Add a mail account"
- Then "Import emails from my other account (POP3)"
- Username: your full e-mail address (e.g. dies@dsavdodeka.nl)
- Password: fill in the password you received from the .ComCom (this may be different from your Gmail password)
- For the POP server, fill in mail.dsavdodeka.nl with port 995.
- Tick "Leave a copy of retrieved messages on the server"
- Tick "Always use a secure connection (SSL) when retrieving mail" (this is important!)
- -> Add Account
Test whether it works by sending an e-mail from your @dsavdodeka.nl address to another address, and by sending one to your @dsavdodeka.nl account. If it doesn't work, contact the .ComCom.
Board
A number of administrative tasks can be handled via the website. The most important ones are:
The records cannot be changed via the admin tool yet. We are still working on that. Contact us to have them changed manually.
Accepting a new member
Updating the classification
Roles
Questions and suggestions
.ComCom
Commission
The current commission is:
- Matthijs Arnoldus
- Tip ten Brink
- Liam Timmerman
- Jesper van der Marel
- Tijmen Hoedjes (QQ B6)
History
Former members:
- Laura Geurtsen
- Donne Gerlich (QQ B2)
- Nathan Douenburg
- Aniek Sips (QQ B3)
- Pien Abbink
- Jefry el Bhwash (o.a. QQ B4)
- Senne Drent
- Leander Bindt
- Sanne van Beek (QQ B5)
The commission was founded in board year 2 (2020/2021) by Matthijs, Jefry, Laura and Donne. With Nathan as designer, they put together a very nice website in a short time, which went live in 2021. That year the first 24-hour meeting was also held. The website was mainly a source of information and a showcase for the association, and of course the home of the Spike. The .ComCom was also responsible for e-mail.
The website was static, meaning there was no server that could store data, so you could not log in either. Tip became a member of Dodeka, and immediately of the .ComCom, in 2021/2022 and got to work on building a "backend" that would make logging in possible. Pien also joined and helped build even more nice pages.
This project went live in January 2023. That year we also got a new member, Leander, who built new features for the backend. Meanwhile, Matthijs and Pien were working hard to maintain the content, which became an ever bigger task.
2023/2024 saw many members come and go, while the content was maintained better than ever by, among others, Jesper. Behind the scenes, work was done on training registration, while, for example, a system for the classifications also went live. Jefry, a member from the very beginning, unfortunately said goodbye. With the knowledge he had gained, he managed to become chair of the Lustrumcommissie.
What can you do and learn?
There is an enormous amount to do at the .ComCom, and therefore of course also an enormous amount to learn. The tasks can be split into the following four main areas:
- Design: the website obviously has to look good. But that is not all, because the website also has to be user-friendly. The user interface (UI), i.e. how the user interacts with the website, and the user experience (UX), i.e. what the user experiences, also have to be excellent. UI/UX design is therefore perhaps even more important, and there is an enormous amount to learn about it.
- Content: besides news, the website contains important information about, among other things, how to become a member and what the training times are, as well as information about current and upcoming competitions and activities. This has to be updated constantly. The website is a showcase for the association and plays an important role in attracting new members. That makes the "content" (the text and images) very important. You will become an expert in digging through the FOCUS archives and writing short pieces of text.
- Programming: a large part of the work done at the .ComCom is indeed simply programming. It is quite different from what you may be used to in your university courses, though. It is not Python plots or algorithms in Java; it is about building and maintaining a large application (by now 3+ years old) with multiple components. We go into more detail below.
- System administration: not the sexiest job, but very important. We now have a login system, so we store e-mail addresses, birth dates and other private information. The systems also must not simply go down, and we have to make the right choices between server providers.
Programming
Programming, coding, developing: there are many words for what we do. You could even call it software engineering or architecture. We build all of the following:
- A website, the "frontend" (JavaScript, HTML, CSS), built with the React framework. Besides static pages, there are more and more dynamic pages that fetch up-to-date information from a server, which then has to be manipulated. We also have to keep track of whether you are logged in and which parts of the website you are allowed to see. The board has to be able to see a table with information about all members and edit it. When data changes on the server, you should see that immediately.
- A server that is accessible via an API, the "backend" (written in Python with FastAPI), which responds to requests from the website. It stores all passwords in a form that we cannot read and can prove that you really are who you say you are. This requires cryptography. The backend also has to be able to fetch information from a database, for which we use SQL. How we log in and check whether someone has access (authentication and authorization) follows a standard called OAuth 2.
System administration
- Scripts and other tools to run everything, such as Dockerfiles and Docker Compose files that describe how we start the databases (PostgreSQL and Redis).
- Managing the e-mail provider (we currently run this ourselves on our server).
- Managing the server itself (Ubuntu Linux), performing updates, keeping access secure.
Overview
The technical information is divided into four parts:
- Setup: which only teaches how to get everything running on your local machine, so you can dive into the code and start developing right away.
- Architecture: a deeper look at the architecture of the entire application, detailing why we made certain decisions and explaining things at a higher level than you would find by looking at the comments and documentation in the source code.
- Developing: details what kind of things you need to modify if you want to make changes. Contains tips on how to easily do actual 'development', what files matter most, among other things.
- Deployment: once you have developed something, it needs to actually go live. This section details all the steps you have to go through to deploy the code into the real world, how to administer the servers and related tasks.
Definitely look at Setup, and also at Deployment if you actually do an update. Otherwise, Developing should be what you look at next. Architecture is only necessary when you want to make big changes or understand things better.
Setup
Here you can find information on how to set everything up (the frontend, the backend and the databases), so you can start developing right away.
No matter what you do, you'll need to install Git, so check out the instructions for that.
If you are only doing things on the frontend, all you need to know is how to set up the frontend.
If you are developing the backend, you will probably also want to test things on the frontend, but you will definitely need to first set up the databases locally using a tool called Docker, after which you can set up the backend application itself.
Git
To share code with each other, we use a program called Git. We use Git to download the "online" version of the source code (which is hosted on GitHub) so that we can work on it locally on our own machines. We also use Git to upload it back to the server.
For an in-depth guide to Git, check out the Pro Git book. For an overview of the most important commands, check out GitLab's cheat sheet and this other one. Want to perform some specific action? Check out Git Flight Rules. Also, ChatGPT is pretty good at Git nowadays.
You can also use a GUI client instead of the command line (although I nowadays recommend against that), such as the one integrated into Visual Studio Code (with an extension such as GitLens), GitKraken or GitHub Desktop.
But first, you'll need to install Git itself. On Windows, use the link at the top here (the 64-bit standalone installer). On Linux, use your system package manager (instructions here). On macOS, use Homebrew.
For the Windows installer, I recommend against adding the links to your context menu (so disable Windows Explorer integration). Git Bash Profile for Windows Terminal can be useful (if you've installed it). As default editor I recommend something like Notepad++ or Notepad. For the rest the default options should be okay. Be sure to use the recommended option for "Adjusting your PATH environment".
GUI
If you're using a GUI program (so GitKraken, VS Code or GitHub Desktop), use their documentation to login to GitHub (make sure you have an account). Go to the frontend or backend for further instructions.
Command line (recommended)
Why do I recommend using the command line? Because a lot of developer tools work with it exclusively. Because command-line tools are simple to develop, they usually have the most features and offer you the most control. At the same time, they also usually allow you to make more mistakes and can have a steeper learning curve, although ChatGPT has made things a lot easier nowadays.
To be able to download and upload repositories, you'll still need to log in to GitHub. For that, I recommend using the gh CLI. For Windows, I recommend just using the installer. The current version (as of December 2024) can be found here. To find newer versions, go to the releases page and download what's most similar to "GitHub CLI 2.63.0 windows amd64 installer" (amd64 means x86-64; if you're using an ARM chip you should install using WinGet, see here). For general installation instructions, see here.
If you haven't already, I recommend installing "Windows Terminal", which is much, much better than the standard Command Prompt. You can find it on the Microsoft Store (if the link doesn't work, just search for Windows Terminal). You might also want to install PowerShell 7.
Once you have gh installed (you might need to restart your terminal), run gh auth login and follow the steps to log in to your GitHub account (make sure you have one!). Finally, go to the frontend or backend for further instructions.
Frontend setup
For more information on developing the frontend, see the section on developing the frontend.
Setting up the frontend is the easiest. The frontend is entirely developed and deployed from the DSAV-Dodeka/DSAV-Dodeka.github.io repository. Look at the instructions for Git if you haven't already.
Steps
- Open the command line (PowerShell/Windows Terminal/Terminal) in a folder of your choice, where you want the code to be installed
- Run git clone https://github.com/DSAV-Dodeka/DSAV-Dodeka.github.io.git dodekafrontend --filter=blob:none, this creates a folder called "dodekafrontend" with all the code
- Install a code editor (also called an IDE, Integrated Development Environment), VS Code is recommended
- Install npm from NodeJS, this is used to install and run the website locally.
Windows-specific
- Make sure WinGet is installed (run winget in the terminal and see if it shows an error). See the instructions below if it is not installed.
- Run the commands from the NodeJS website to install npm/node
- TODO finish
Other
Steps, explained in detail
Download the code
The first step is to "clone" (download) the repository to your computer. Because we store all our images inside the repository, the full history contains a lot of large images. In the past, we didn't properly optimize them so sometimes there were multiple versions of very large images. Thankfully, you don't have to download the full history. Instead, when cloning, run the following command (run it in some folder where you want to store it, the result will be a folder called 'dodekafrontend'):
git clone https://github.com/DSAV-Dodeka/DSAV-Dodeka.github.io.git dodekafrontend --filter=blob:none
The --filter=blob:none option executes a "partial clone", in which all blobs (so the actual file contents) of old commits are not downloaded. They are only downloaded once you actually switch to a commit. This might cause some occasional issues with GUI clients, so check out a branch first with the command line (git checkout <branch name>).
Node/npm
The next step is to install npm, which is the standard package manager for JavaScript. We use it to download all our dependencies. npm is included when you install NodeJS, which is a JavaScript runtime (a program that runs JavaScript code) based on the same internal engine as Google Chrome. We use that runtime to develop and build our project.
Download and install it from the NodeJS website. I recommend picking an LTS version (so v22 right now). v20 is also fine if you still have that.
Note, if you don't care about the installation taking a while or are not comfortable with the command line, you can use the installer instead. Otherwise, I highly recommend (if you're on Windows), to use the 'fnm' package manager option. To use the instructions, you should use a PowerShell terminal (best to use from Windows Terminal, download it from the Microsoft Store). You also need to install WinGet, which you can also get from the Microsoft Store through the App Installer (it's probably already installed, be sure to NOT download "APK Installer" or other non-Microsoft apps! See if it exists in the Microsoft Store, otherwise you can ignore the step. See here for details about the App Installer and here for WinGet).
If you're not on Windows, I recommend using the nvm option from the NodeJS website. Otherwise, you could also use pnpm instead.
Next, open the command line in the root folder of the project. This can easily be done by opening it in an IDE (integrated development environment, I recommend using VS Code) and then opening a terminal there. Then, to install all dependencies, run:
npm install
Running the website
Once this is done, we can actually run the website using the command:
npm run dev
Under the hood, this will use Vite to bundle and build our project into actual HTML, CSS and JavaScript that our browser can run. The command will start a dev server, which will allow you to access the website in your browser using something like localhost:3000 (the port, 3000 in this case, might be different).
Docker/container setup
We use the deploy folder in the dodeka repository for the setup of the relational SQL database and the key-value store (a special database that is not relational, it's basically a big dictionary/map). For this, we use a technology called containers. Specifically, we use Docker.
In order to run the scripts, there are a few requirements. If you're on macOS or Linux, you only need to install Docker Engine as both are Unix-like systems. But you can also install Docker Desktop.
First of all, you need to have a Unix-like command line with a bash-compatible shell: i.e. Linux or macOS. See the next section for instructions on Windows Subsystem for Linux (WSL), which allows you to install Linux inside Windows.
Then, you need a number of tools installed, which you can install from the links below if you're not on Windows:
If you're on Windows, installing Docker Desktop after you've installed WSL will make these available inside WSL if Docker Desktop is running.
WSL
If you're on Windows, the OS simply has too many differences to be able to run Linux containers directly. It therefore needs an additional virtualization layer. Thankfully, there now exists a technology called WSL (Windows Subsystem for Linux).
You can find installation instructions here. I recommend installing either Ubuntu or Debian (Ubuntu is based on Debian, and all our containers run on Debian), which are two 'flavors/distributions' of Linux.
For a better experience with the command line, I recommend getting the Terminal application (not the built-in Command Prompt), which you can install from the Microsoft Store.
Local development
To be able to run everything, you need to have configured access to the containers. To do that, run:
docker login ghcr.io
Enter your GitHub username. For the password, don't use your GitHub password, but a Personal Access Token (Settings -> Developer settings) with at least repo, workflow, read:packages, write:packages, read:org and write:org permissions. Be sure to save the token somewhere safe, you'll probably have to reuse it and you can't view it in GitHub after creation!
Now, you will need to be able to access the scripts in this repository. If you're using Windows, do not copy the files from Windows to Linux; this leads to weird formatting problems in the scripts that cause them to fail. Instead, clone this repository directly from WSL by running:
git clone https://github.com/DSAV-Dodeka/dodeka.git
You will again need to enter your GitHub username and the Personal Access Token.
You will now have a dodeka folder containing all the necessary folders.
Now, we will use Docker Compose to start everything we want for development:
NOTE: Instead of the commands below, take a look at the shortcuts section if you're okay with installing an extra program (Nushell). Also, if you're on WSL, please look at the last paragraph of the current section (use dev_port.env instead of dev.env).
First, we pull:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data pull
We start:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data up -d
We shutdown:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data down
(To understand this command: we are basically indicating three options. -f is which compose file we are running, --env-file is which environment variables we are setting and --profile which services. We choose data because we only want to run the databases.)
Note that the databases will only be accessible on localhost from the environment they run in. So if you are using WSL, they are only available from inside WSL. If the backend project is stored in Windows, you need to change the --env-file option to use/dev/dev_port.env. This will open the databases on the 0.0.0.0 host, which makes them accessible even from outside WSL.
Shortcuts
Those commands are quite long and hard to remember. To make things easier, we use Nu shell scripts to provide shorter commands that are easier to remember. First, follow the instructions on installing nu below.
On Windows, you need to prepend nu to every command. For convenience, the commands below all contain nu. However, on Linux/macOS this is not necessary; you can just run the script directly once you've installed Nushell (because they recognize shebangs).
The scripts are all in the root directory, but you can call them using ../ if you are in a subdirectory and they'll still work. Make sure you've installed Nu.
Running the development databases
Start (this will also pull the images, so make sure you're logged in with docker login ghcr.io):
Note: this opens the ports on host 0.0.0.0, so only do this when Docker doesn't run on your main OS, e.g. when it runs inside WSL
↓ the extra 'p' is on purpose
nu dev.nu upp
Stop:
nu dev.nu down
If running Docker on your main OS:
nu dev.nu up
Testing and checking the backend
(You can only run this after having followed the steps in the backend setup).
nu test.nu backend
Other commands can be found in the various .nu files in the root directory.
Installing Nushell
Install nu (you don't have to set it as your default shell, just make sure you have nu on your path).
If you have Rust installed, you can install it using cargo install nu. This is especially recommended if you're on Linux. To install it from a binary instead (which is much faster!), first run cargo install cargo-binstall and then cargo binstall nu. On macOS, install it using Homebrew (brew install nushell).
Why Nushell? In short, because it is a modern shell scripting language that lets you very easily call external programs. Its syntax is also readable by people who haven't used it before, but still quite powerful. It also can replace all kinds of different tools we might need otherwise.
Test database
Syncing the test database
A number of test databases are stored inside the DSAV-Dodeka/backend repository. Running the commands above creates an empty database. To populate it with the latest test values, run:
poetry run python -c "from use.data_sync.cli import run; run()"
You will probably need to set GHMOTEQLYNC_DODEKA_GH_TOKEN as an environment variable for access. The safest way to set this is to add it to a file like sync.env:
export GHMOTEQLYNC_DODEKA_GH_TOKEN="GitHub Personal Access Token"
Here you should replace "GitHub Personal Access Token" with the value of your token, which will need the repo scope. You can then run . sync.env before running the script to sync the database.
Simple backup
Ensure you are in the main dodeka directory, not in a subfolder.
To create a backup, run:
poetry run psqlsync --config data/test.toml --action backup
If the database requires a password, use the --prompt-pass option. You can then type or copy-paste the database password.
poetry run psqlsync --config data/test.toml --action backup --prompt-pass
From your own computer, you can then use scp (secure copy over SSH) to transfer the file.
scp backend@<ip address>:/home/backend/dodeka/data/backups/backup-20230430-154532-dodeka.dump.gz <destination>
Backend setup
NOTE: the backend is now developed from the dodeka repository, in the backend subdirectory!
Before you can run the backend locally, you must have a Postgres and Redis database setup. Take a look at the database setup page for that.
Run all commands from the dodeka/backend folder!
- Install uv. uv is like npm, but for Python. I recommend using the standalone installer.
- Then, set up a Python environment. Use uv python install inside the ./backend directory, which should then install the right version of Python, or use one you already have installed.
- Next, sync the project using uv sync --group dev (to also install dev dependencies). This will also set up a virtual environment at ./.venv.
- Then I recommend connecting your IDE to the environment. In the previous step uv will have created a virtual environment in a .venv directory. Point your IDE to that executable (the file named python or python.exe in .venv/bin) to make it work.
- Currently, the apiserver package is in a /src folder, which is nice for test isolation, but it might confuse your IDE (if you use PyCharm). In that case, find something like the 'project structure' configuration and set the /src folder as a 'sources folder' (or similar, it might be different in your IDE).
- You might want some different configuration options. Maybe you want to test sending emails or have the database connected somewhere else. In that case, you probably want to edit your devenv.toml, which contains important config options. However, this means that when you push your changes to Git, everyone else will get your version. If there are secret values included, those will be publicly available on Git as well! Instead, create a copy of devenv.toml called devenv.toml.local and make changes there. Git knows to ignore this file.
- Now you can run the server either by just running dev.py in src/apiserver (easiest if you use PyCharm) or by running uv run backend (easiest if you use VS Code). The server will automatically reload if you change any files. It will tell you at which address you can access it.
Running for the first time
If you are running the server for the first time and/or the database is empty, be sure to set RECREATE="yes" in the env config file (i.e. devenv.toml.local). Be sure to set it back to "no" after doing this once, otherwise it will recreate the database every time.
Architecture
This section describes all the different components that make up the website and how they interact. It describes why we made certain choices, which tools and frameworks we use, and more.
Frontend vs Backend
What is a "frontend", what is a "backend", why do we have this separation?
First, what is the difference? There is no perfect definition, but in general the "frontend" is the part that is exposed to the end user, so what they actually see and interact with. So the "frontend" is about the user, it's what actually runs in the browser.
The "backend" is the part that cannot run in the browser, for example because it needs to have dynamic access to the database. It runs on a server, away from the user. It's job is to store things that need to be secure not visible to everyone, like passwords or personal information. For that, it needs a database.
To use the backend, it exposes a so-called "API", which is basically a list of functions that can be called over the internet. In particular, it mostly adheres to the principles of a RESTful API (there are many resources on the internet about this).
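To make that concrete, here is a minimal, hypothetical FastAPI route (not the actual backend code; the path happens to match the birthdays endpoint listed in the changelog, but the model and logic are made up). The frontend simply performs an HTTP GET request and receives the JSON-encoded list back.

```python
# Minimal, hypothetical example of a REST-style endpoint; not the actual backend code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Birthday(BaseModel):
    name: str
    date: str  # e.g. "2000-04-05"


@app.get("/members/birthdays/")
async def get_birthdays() -> list[Birthday]:
    # The real backend would query the database here; we just return static data.
    return [Birthday(name="Example Member", date="2000-04-05")]
```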
In general, when the frontend wants data, it sends a JSON request (a specific format used for structuring data) ... #TODO
There are two main reasons. The first, more technical one, is that by separating the frontend from the backend, you can develop them separately. So someone who wants to update how something looks doesn't have to worry about any logic that should happen on the server. This allows teams to work more independently. It's also a "separation of concerns", which ensures that there isn't too much tangling of functionality. It also allows fully replacing one of the two components without worrying about the other. This might be useful in the future.
However, the more important ... #TODO
Frontend
AuthContext
Context is a way to share values throughout a React application without having to explicitly pass a prop throughout the whole tree. According to the official guide on context, 'the current authenticated user' is a good use-case.
The 'context' is an object that behaves much like a global useState.
The createContext function from React creates the AuthContext object using a default value. Since we do not yet want to set its value, we define AuthUse (a type) containing authState and setAuthState and use an empty version of it as the default.
... see the code
The authState attribute of the AuthContext is an AuthState object, which we defined ourselves. This is a simple object, containing some basic information on the user and authentication status. We try to keep the objects immutable as much as possible.
We export the AuthContext.Provider, which is the component that will wrap our entire app, allowing each subcomponent to access the context.
We only want to set the default value once, when the application starts. Furthermore, the initialization requires asynchronous calls that can only be made at runtime. This can be tricky, so to prevent any problems we use the useEffect hook to initialize everything on the first render. The AuthProvider is initially initialized with an empty AuthState, which is then populated asynchronously.
The initialization uses our custom useAuth function, which contains most of the logic. We should write tests for this.
There are 3 tokens, ID token (from OpenID Connect, not actually used for authorization), access token (used for all authorized requests) and refresh token (used to refresh access and ID token). The ID token is transparent to the front end, meaning it is guaranteed we can read its data. It is used to see the username and other useful profile information to personalize the website. The "expiry" returned by a token request relates to the expiry time of the ID token (10 hours, 3600 s).
The useAuth function will check if it has stored tokens and parse the ID token value to populate the user attribute of the context's authState. If the token is expired, it will automatically request a new one using the refresh token. If it is missing, it simply assumes the user is not logged in.
Database
Everything is deployed from the dodeka repository. It also contains the "source" for the database (DB, PostgreSQL) and key-value store (KV, Redis).
The most important file is config.toml, which contains all practical configuration. In the build folder you can find the source for all deploy scripts (build/deploy) and container build files (build/container). Using the confspawn tool, the actual scripts are built from these templates. The results can be found in the various folders in the use directory.
Note on confspawn
The total configuration was spread over a lot of different files in different places (PostgreSQL config, various Docker Compose files...). Some configuration is also very project-dependent (names like 'dodeka'). To keep things more reusable and have a single source of truth for the configuration, Tip developed a Python tool called confspawn, which can take a configuration "template" (where certain values are not filled in yet) and fill in those values using a secondary configuration source. It uses Jinja2 templates for this.
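To illustrate the underlying idea, here is a small sketch that uses plain Jinja2 directly. It is only an illustration: confspawn has its own CLI and template layout, and the template content below is made up.

```python
# Sketch of the templating idea behind confspawn: one config source fills in many templates.
# This uses plain Jinja2 directly; confspawn's actual API and templates differ.
from jinja2 import Template

compose_template = Template(
    "services:\n"
    "  {{ project }}-db:\n"
    "    image: postgres:{{ postgres_version }}\n"
)

config = {"project": "dodeka", "postgres_version": "16"}
print(compose_template.render(**config))
```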
Backend Server and authpage
Backend framework (Server): Python FastAPI server running on uvicorn (managed by gunicorn in production), which uses uvloop as its async event loop.
Frontend framework (authpage): React, built using Vite as a multi-page app (MPA) and served statically by FastAPI.
Persistent database (DB): PostgreSQL relational database.
In-memory key-value store (KV): Redis.
We use the async engine of SQLAlchemy (only Core, no ORM, we write SQL manually) as a frontend for asyncpg for all DB operations. Alembic is used as a migration tool.
The async component of the redis-py library is used as our KV client.
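As a small sketch of what the async Core approach with hand-written SQL looks like (the connection URL, table and query here are hypothetical, not the project's actual store code):

```python
# Sketch of async SQLAlchemy Core on top of asyncpg; URL, table and query are made up.
import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine("postgresql+asyncpg://user:password@localhost/dodeka")


async def get_user_points(user_id: int) -> list[dict]:
    # We write the SQL by hand and only use Core for execution and parameter binding.
    query = text(
        "SELECT classification_id, total_points FROM class_points WHERE user_id = :user_id"
    )
    async with engine.connect() as conn:
        result = await conn.execute(query, {"user_id": user_id})
        return [dict(row) for row in result.mappings()]


if __name__ == "__main__":
    print(asyncio.run(get_user_points(1)))
```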
This is an authorization server, authentication server and web app backend API in one. This model is not recommended for large-scale setups but works well for our purposes. It has been designed in a way that makes the components easy to separate.
Client authentication uses the OPAQUE protocol (a password-authenticated key exchange), which protects against pre-computation attacks upon server compromise, and does not rely on PKI (public key infrastructure) for protecting the password when it is sent over the network. This makes passwords extra safe, in the sense that they never leave the client.
Authorization is performed using OAuth 2.1, with much of OpenID Connect Core 1.0 also implemented to make it appropriate for authenticating users. While not technically required, OAuth tokens are generally in the form of JSON Web Tokens (JWTs) and OpenID Connect does require it, so we use them here. Good 3rd-party resources can be found for OAuth and JWTs.
In addition to this, we rely heavily on the following libraries:
- PyJWT for signing and parsing JSON web tokens.
- cryptography for many cryptographic primitives, primarily for encrypting refresh tokens and handling the keys used for signing the JWTs.
- pydantic for modeling and parsing all data throughout the application.
The backend relies on some basic cryptography (see the cryptography section). It is nice to know something about secret key cryptography (AES), public key cryptography and hashing.
Why did we choose <x>?
FastAPI
FastAPI was selected because of its modern features reliant on Python typing, which greatly simplify development. FastAPI is built on Starlette, a lightweight async web server framework. We wanted a lightweight framework that is not too opinionated, as we wanted full control over as many components as possible. Flask would have been another option, but the heavy integration of typing in FastAPI made us choose it instead. Of course, there are also many other options outside the Python ecosystem. We chose to use Python simply because it is very well-known among university students.
Redis and PostgreSQL
PostgreSQL and Redis were selected simply by their popularity and open-source status. They have the most libraries built for them, have a large feature set and are widely supported. We chose a relational database because we do not need massive scaling and having relational constraints simplifies keeping all data in sync. For Redis, we use the RedisJSON extension module to greatly simplify temporarily storing dictionary-like datastructures for storing state across requests. Since there are a great many specific data types that need to be persisted, and they do not have any interdependency, this is much easier to do in an unstructured key-value store like Redis. It is also much faster than having to do this all in a structured, relational database. Note that all DB and KV accesses are heavily abstracted, the underlying queries could easily be re-implemented in other database systems if necessary.
We went all-in on async, expecting database and IO calls to make up the majority of response times. Using async, other waiting requests can be handled in the meantime.
OAuth
Implementing good authentication/authorization for a website is hard. There are many mistakes to be made. However, many available libraries are very opinionated and hard to hook into. Furthermore, the options become quite limited when there is approximately no budget. There are some self-hosted solutions, but getting the configuration right can be very tricky and none were found that served our needs. As a result, we went for our own solution, but built it using well-regarded web standards to ensure there are no security holes. OAuth is used by every major website nowadays, so the choice was easy.
OPAQUE
OPAQUE is an in-development protocol that seeks to provide a permanent solution to the question of how to best store passwords and authenticate users using them. A simple hash-based solution would have been good enough, but there are many (good and bad) ways to implement this, while OPAQUE makes it much more straightforward to implement it the right way. It also provides tangible security benefits. It has also been used by big companies (for example by WhatsApp for their end-to-end encrypted backups), so it is mature enough for production use.
Our implementation relies on opaque-ke, a library written in Rust. As there is no Python library, a simple wrapper for the Rust library, opaquepy, was written for this project. It exposes the necessary functions for using OPAQUE and consists of very little code, making it easy to maintain. The wrapper is part of a library that also includes a WebAssembly wrapper, which allows it to be called from JavaScript in the browser.
Maybe implement in future
- https://datatracker.ietf.org/doc/html/rfc8959 secret-token
- https://datatracker.ietf.org/doc/html/rfc7009 token revocation request
- https://auth0.com/docs/secure/tokens/json-web-tokens/json-web-key-sets JSON web key sets
- https://datatracker.ietf.org/doc/html/rfc8414 OAuth 2 discovery
- https://www.rfc-editor.org/rfc/rfc9068 Access token standard (also proper OpenID scope)
- https://datatracker.ietf.org/doc/html/rfc7662 token metadata (introspection)
Performance
What is fast:
- The actual HTTP server, which runs on uvloop. This won't ever be a likely bottleneck
- The direct interface with the database: asyncpg is one of the fastest PostgreSQL adapters around
What is slow:
- The parsing and loading of database data into Python (parsing into Pydantic models)
- Manipulation of database data in Python
- Returning a type directly, meaning FastAPI has to do additional conversion; using JSONResponse directly is much faster
Most of the latter isn't a problem for simple responses that don't work on many rows. But if many rows are included, it might be worth it to write a parse function for a specific model and return a JSONResponse directly.
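As a sketch of the difference (the endpoints and model here are hypothetical, not actual backend code):

```python
# Hypothetical illustration of the two response styles; model and endpoints are made up.
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from pydantic import BaseModel

app = FastAPI()


class RankingEntry(BaseModel):
    name: str
    points: int


# Convenient: FastAPI validates and converts each model on every request,
# which costs extra time when many rows are returned.
@app.get("/example/ranking/slow/")
async def slow_ranking() -> list[RankingEntry]:
    rows = [{"name": "A", "points": 10}, {"name": "B", "points": 8}]
    return [RankingEntry(**row) for row in rows]


# Faster for large responses: serialize once and return a JSONResponse directly,
# skipping FastAPI's response model conversion.
@app.get("/example/ranking/fast/")
async def fast_ranking() -> JSONResponse:
    rows = [{"name": "A", "points": 10}, {"name": "B", "points": 8}]
    return JSONResponse(content=rows)
```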
Cryptography
It's easy to make mistakes when you manually use cryptographic primitives. This project primarily uses cryptography for the purpose of signing and encrypting its tokens. If this is done incorrectly, the project is entirely insecure, because with forged tokens almost all data can be easily queried. Therefore, this document aims to properly document how cryptography is used in this project. Cryptography is also used for storing passwords, but this is almost entirely handled by a separate library, mostly using default settings.
Refresh tokens are only readable by the authorization/authentication part of the server and can therefore use the more secure and faster symmetric encryption. Since refresh tokens are not very standardized, this part is the most 'custom' and is the only part that uses cryptographic primitives. All these operations can be found in the auth/hazmat package. It's named hazmat because it uses the hazmat module of the Python library pyca/cryptography and also to signify that this code should be checked thoroughly. The pyca/cryptography library relies on OpenSSL (the underlying crypto implementation) binaries that are packaged together with the Python library. It is important to frequently update pyca/cryptography, as OpenSSL receives frequent security fixes.
Refresh tokens
For our refresh tokens, we use AES encryption, specifically 256-bit encryption in GCM mode. If used correctly, GCM is one of the more secure modes. However, if you misuse nonces/IVs (random bytes used for every encryption to ensure the same plaintext looks different each time, so that no information leaks), some information could leak, such as the plaintext length. Care must therefore still be taken.
Our refresh tokens are simple, consisting only of a unique id, a family id and a random tag that makes each token unique among its family. A 'family' is a set of refresh tokens descended from a single authentication. Therefore, we encode them as simple Python dicts (using pydantic) and our AES encryption thus works only on these dictionaries. crypt_dict.py provides the encryption and decryption for this.
We encode the dicts as JSON (as plaintext utf-8) and generate a random 12-byte nonce (as recommended for AES-GCM) using the Python secrets.token_bytes function (which is recommended for such cryptographic use). We don't use any associated data (which would be unencrypted but could not be modified), as refresh tokens can exist in only a single context, so we simply encrypt using our initialized AESGCM object. This object must already contain the private symmetric key, which we assume is 256 bits (but it could technically also be 128 or 192 bits). We then concatenate the nonce and the encrypted data (which contains an authentication tag added by pyca/cryptography, ensuring the integrity of the data) and encode this in a string using base64url. Note that, as it is not necessary, we do not add any base64 padding in the encoding step.
Our decryption works exactly the same in reverse: we decode the string, take the nonce as the first twelve bytes and the data+tag as what comes after. It then decrypts using an initialized AESGCM object and decodes the JSON into a dict. A lot can go wrong, but since all refresh tokens should have been generated by this application, we only provide an opaque 'DecryptError'. It is important to note that the base64url decoding simply ignores characters outside the alphabet. Furthermore, since it works with both padding and no padding, there are multiple "encodings" of a single refresh token. The encoding holds no semantic value, so only do logic on the fully decrypted and decoded refresh token!
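As an illustration of this scheme, here is a minimal sketch using pyca/cryptography; the function names and dict fields are hypothetical and do not reflect the exact crypt_dict.py API:
import json
import secrets
from base64 import urlsafe_b64decode, urlsafe_b64encode

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_dict(aesgcm: AESGCM, data: dict) -> str:
    plaintext = json.dumps(data).encode("utf-8")
    nonce = secrets.token_bytes(12)  # 12-byte nonce, as recommended for AES-GCM
    ciphertext = aesgcm.encrypt(nonce, plaintext, None)  # no associated data
    # nonce || ciphertext+tag, base64url-encoded without padding
    return urlsafe_b64encode(nonce + ciphertext).rstrip(b"=").decode("utf-8")

def decrypt_dict(aesgcm: AESGCM, token: str) -> dict:
    # re-add the padding that was stripped during encoding
    raw = urlsafe_b64decode(token + "=" * (-len(token) % 4))
    nonce, ciphertext = raw[:12], raw[12:]
    return json.loads(aesgcm.decrypt(nonce, ciphertext, None))

aesgcm = AESGCM(AESGCM.generate_key(bit_length=256))
token = encrypt_dict(aesgcm, {"id": 1, "family_id": "abc", "tag": "xyz"})
assert decrypt_dict(aesgcm, token)["id"] == 1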
Access tokens (and id tokens)
Access tokens use asymmetric encryption and are JSON web tokens (JWTs). They come signed with a number of claims, meaning the resource server (which in this application is currently the same server as the auth server, but the code is mostly decoupled) simply has to check the validity of the token using a freely available public key. Anyone could check the validity of the claims inside the JWT.
We use EdDSA as our algorithm using the Ed448 curve. The latter technically offers better security than the slightly more standard Ed25519, but the difference is small. It was just a choice. EdDSA is used over other algorithms for its greater compactness. Note that this algorithm is not resistant to hypothetical advanced quantum computers, although we are very far from any quantum computer with enough power to break it. Note that AES is resistant to quantum computers.
To implement signing and verifying, we use the PyJWT library, which internally also relies on pyca/cryptography (and therefore on OpenSSL) for its cryptography. sign_dict.py takes care of signing the token passed as a dict. Since the authorization server would never have to verify an access token, we implement verification inside our application (apiserver/lib/hazmat/tokens.py), not inside the decoupled auth component.
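A minimal sketch of what signing and verifying looks like with PyJWT and an Ed448 key; the claims and key handling are illustrative and do not mirror the exact sign_dict.py or tokens.py code:
import jwt  # PyJWT
from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

# Hypothetical key and claims, just to show the mechanism
private_key = Ed448PrivateKey.generate()
public_key = private_key.public_key()
claims = {"sub": "some_user_id", "exp": 2_000_000_000}

# Signing, as the auth component does when issuing a token
token = jwt.encode(claims, private_key, algorithm="EdDSA")

# Verifying, as the resource server does using only the public key
decoded = jwt.decode(token, public_key, algorithms=["EdDSA"])
assert decoded["sub"] == "some_user_id"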
Passwords (OPAQUE)
We store passwords using the OPAQUE protocol (see README). This library uses some asymmetric encryption (and other smart stuff) so we can have our cake and eat it too. We don't handle any password on the server, and we are also protected against pre-computation attacks (where, using the provided salt, an attacker precomputes a password dictionary before taking control of the server). See the opaquepy library (maintained privately by Tip) for details.
Keys
We use an asymmetric key (a public verification key and a private signing key), as well as a symmetric (private) key. Our symmetric key is simply encoded using base64url, while our asymmetric keys (due to requirements of PyJWT) use a more complicated scheme, namely PEM format, with PKCS#8 for the private key and X509/PKCS#1 for the public key. These are standardized schemes. We wrap them in our own structs to make handling easier, and they all include a kid (key id) to make it possible to store multiple in a database.
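A sketch of how such keys could be generated and serialized with pyca/cryptography (assuming the X.509 SubjectPublicKeyInfo PEM encoding that the library provides for Ed448 public keys, which I take to be what is meant above; the kid wrapping is left out):
import secrets
from base64 import urlsafe_b64encode

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

# Symmetric key: 32 random bytes, stored base64url-encoded
symmetric_key = urlsafe_b64encode(secrets.token_bytes(32)).decode("utf-8")

# Asymmetric signing keypair in PEM format
private_key = Ed448PrivateKey.generate()
private_pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
)
public_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)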
Auth
Specification
We use the Authorization Code Flow according to RFC6749 Section 4.1, as recommended by Internet Draft OAuth 2.0 for Browser-Based Apps, since the frontend application can be recognized as a Javascript Application without a Backend (Section 6.3 of the latter document). This might be confusing as we do have a backend, but backend in that context means a frontend backend that dynamically serves HTML, while our frontend is hosted as static HTML on GitHub Pages.
We comply fully with OAuth 2.1 and implement an Authorization Code Flow with PKCE. We comply as much as possible with OpenID Connect, except on some points that are only for interoperability (like supporting certain algorithms), which we do not require. Our compliance with OpenID Connect is not as rigorously checked, as it is a more complex standard. We used it more as a guide.
Useful resources
- https://auth0.com/docs/security/tokens/refresh-tokens/refresh-token-rotation
- Other pages from https://auth0.com/docs
- https://www.ietf.org/archive/id/draft-ietf-oauth-v2-1-04.html
- https://openid.net/specs/openid-connect-core-1_0.html
- https://www.oauth.com/
Protocol
Resource Owner - end-user
Resource Server - dodekabackend=apiserver (Server)
- identifier: dodekabackend_client
Client - dodekaweb (Pages)
- identifier: dodekaweb_client
- redirect_uri: .../auth/callback
Authorization Server (AS) = OpenID Provider (OP) - dodekabackend=apiserver (Server)
The Relying Party (RP) in the context of OpenID Connect is the Client.
Create an Authorization Request (Section 4.1.1)
- (Client) AuthRedirect.tsx generates the AuthRequest:
{
"response_type": "code",
"client_id": "dodekaweb_client",
"redirect_uri": ".../auth/callback",
"state": "a) STATE",
"code_challenge": "b) CHALLENGE",
"code_challenge_method": "S256",
"nonce": "c) NONCE"
}
- a) STATE: The state is a randomly generated string that is used to later check the response
- b) CHALLENGE: First a code verifier is generated, which is cryptographically random. A SHA256 hash of it is then computed (as indicated by 'S256') and sent as the challenge, while the verifier is stored locally. The server later uses it as a check (see the sketch after this list).
- c) NONCE (OpenID): Randomly generated and used to later verify the OpenID ID token. The random value is stored, the hash is sent.
- (Client) The AuthRequest is encoded as an urlencoded param string, not JSON. The user is redirected to the AS with those params, specifically to ../oauth/authorize
- (AS) The Authorization Server (AS) validates the AuthRequest, in particular the response type (only "code"), the client_id, the redirect_uri and the format of the state, challenge and nonce. It generates a random identifier (the Flow ID). It uses this identifier to persist the process on the server by storing the entire AuthRequest as JSON. This is needed in order to later check everything.
- (AS) The user is redirected to a new page. In a perfect world, this would be a page served by the server, but in this case this is a page on the Client (../auth/credentials). The Flow ID is sent as a query parameter.
- (AS/Client) The next step does not fall under the OAuth protocol, as any AS is free to choose its own authentication implementation. In this case we use the OPAQUE protocol. A user supplies their username and password, which is then used with the ../auth/login/start and ../auth/login/finish endpoints to ensure the password is correct.
- (AS) In the final ../auth/login/finish step, the user also supplies the Flow ID. The authorization code (OPAQUE "session key") which is computed is used as a key for storing the combination of the Flow ID, username and authentication time. This is stored only for a short time.
- (AS/Client) Finally, the user is redirected to the ../oauth/callback endpoint, again with the Flow ID but now also with the generated authorization code. This is separated from ../auth/login/finish to neatly distinguish the steps belonging to the OAuth protocol from those of the selected authentication protocol. Here, the user is redirected to the redirect_uri supplied in the initial request, together with the state from the AuthRequest stored on the server and the authorization code.
- (Client) At the redirect_uri (../auth/callback), the client checks their stored state with the state supplied in the redirect. If it doesn't match, the login aborts.
- (Client) Now, a TokenRequest is made:
{
"client_id": "dodekaweb_client",
"grant_type": "authorization_code",
"redirect_uri": ".../auth/callback",
"code": "a) CODE",
"code_verifier": "b) VERIFIER"
}
- a) CODE: The code that the user generated in the final authentication step (OPAQUE session key).
- b) VERIFIER: The unhashed original cryptographically random string generated by the client for the original AuthRequest
- (Client) This time, it is encoded as JSON and sent as a post request to the ../oauth/token endpoint.
- (AS) The supplied authorization code is used to fetch the Flow ID
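The sketch below shows the S256 relation between the code verifier and the code challenge that the AS checks at the token endpoint; it is an illustration, not the actual client or server code:
import hashlib
import secrets
from base64 import urlsafe_b64encode

def s256_challenge(verifier: str) -> str:
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# Client side: generate and store the verifier, send only the challenge
code_verifier = secrets.token_urlsafe(32)
code_challenge = s256_challenge(code_verifier)  # goes into the AuthRequest

# Server side (token endpoint): recompute the challenge from the submitted
# verifier and compare it with the challenge stored under the Flow ID
assert s256_challenge(code_verifier) == code_challenge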
Security considerations of non-OAuth steps
OAuth does not exactly define authentication, nor how to store certain state. We make a few assumptions that determine the security of the login process:
- The Flow ID (used to identify the OAuth AuthRequest throughout the entire authentication flow), the Auth ID (used to identify an OPAQUE login process, i.e. to retrieve server state generated in the first step for the second step) and the session key (used as the OAuth 'code', generated by OPAQUE) should all be sufficiently random and have enough entropy so that an attacker cannot guess a random value to intercept a login attempt.
- Critically, in the time frame that these are valid (1000 seconds for Flow ID and 60 seconds for Auth ID and session key), there should not be so many requests that an attacker can randomly guess a correct value. All these values are at least 10 bytes of random information, meaning there are more than 10^24 different values. The session key is even 32 bytes, making it even more difficult to guess within 60 seconds.
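For reference, the raw numbers behind that claim:
# 10 random bytes: 256**10 possible values, which is more than 10**24
print(256 ** 10)  # 1208925819614629174706176
# the 32-byte session key: 256**32 possible values, roughly 1.16 * 10**77
print(256 ** 32)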
https://openid.net/specs/openid-connect-core-1_0.html#IDToken
Remembering session
Key rotation
Key management and rotation is not a trivial problem.
- https://datatracker.ietf.org/doc/html/rfc7517#appendix-C.3
- https://cryptography.io/en/latest/hazmat/primitives/asymmetric/serialization/#serialization-encodings
- https://www.rfc-editor.org/rfc/rfc8037.html#section-2
- https://www.rfc-editor.org/rfc/rfc7518.html
- https://www.rfc-editor.org/rfc/rfc7517.html JWK
We will store the keys as an encrypted JSON Web Key Set, encrypted with a runtime key (from the dodeka secrets). Keys will be regenerated automatically.
The opaque setup value will also be rotated automatically.
At startup, the encrypted JSON Web Key Set will be extracted from the database. It will then be decrypted and the value replaced.
Developing
Prerequisites
Development isn't scary, but it's probably new. Lots of jargon and all kinds of tools will be thrown around. Just let it come, you can only learn by doing.
Below I'll introduce some concepts that you need to understand to develop for the website. Some of it you'll already know about or have heard of.
File system
Servers
Linux
Command line
Version control / Git
Browsers / JavaScript
Package managers
NodeJS / npm
Cheatsheet
This contains the most important information to get you up and running and productive.
Git
Open the command line in the folder where you downloaded DSAV-Dodeka.github.io / open the terminal in VS Code (Ctrl+`)
General workflow
# go to the main branch
git checkout main
# update the repository
git pull
# go to a new branch (replace branchname with your desired name, no spaces or capital letters allowed)
git switch -c "branchname"
# add all edited files to future commit
git add -A
# commit the changes (change the description to something useful)
git commit -m "commit description"
# upload changes to github.com (replace 'branchname' with what you used earlier)
# in case you already pushed this branch before, you can just do git push
git push --set-upstream origin branchname
Status
See the current status (shows what branch you are on)
git status
Go to a branch
If you want to go to a particular branch, say 'branch-xyz', do:
git checkout branch-xyz
Update a branch
If you want to update your current branch with changes on github.com:
git pull
If you have changes locally, this might not work.
Delete all local changes (BE CAREFUL)
If you did some stuff you don't know how to revert, but also don't care to save it, do (be careful!):
git reset --hard
Frontend
Open the command line in the folder where you downloaded DSAV-Dodeka.github.io / open the terminal in VS Code (Ctrl+`)
Run the website locally
npm run dev
The website is now available in your browser at http://localhost:3000.
Update dependencies
npm install
Frontend development
For more in-depth details on why certain decisions were made, see the architecture section.
The frontend development can be divided roughly into three sections:
- Updating the content (don't modify pages, just the text and images). See here. For information on images and how to optimize all images, see here.
- Adding new static pages (design-focused page design). See here.
- Creating dynamic pages (and integrating them with the backend). See the section in the backend here.
React and React Router
We use a concept called "client-side routing", which means that code on the page itself sends you to the subpages. This is also known as a "single-page application". For more details, see the section on architecture.
Routes
Define a new route
To define a route, add an element inside src/App.tsx:
<Route path="/vereniging" element={<Vereniging />} />
Here <Vereniging /> is the React component that consists of the entire page. You will need to import it from the right file. Every single page, including every subpage, needs a separate route like this. path="/vereniging" indicates the path at which the page will be visible.
Add it to the menu bar
The src/components/Navigation Bar/NavigationBar.jsx file contains all the different menu items. Don't forget to add it to both the navItems and the navMobileContainer.
.tsx vs .jsx?
For new components, prefer .tsx, which ensures proper TypeScript checking happens. This can make development easier by providing hints about what properties are available and also prevents bugs.
Authentication
We use React Context to make the authState available everywhere. Inside the component, simply put:
const {authState, setAuthState} = useContext(authContext)
Then, by checking authState.isLoaded && authState.isAuthenticated you can check whether someone has been authenticated as a member and whether the route should be available. Note: any data that you store on the frontend is publicly available (through the source code, but also using 'inspect element' in the browser)! So any sensitive data should be stored in the backend and retrieved using requests. See the section on integrating the backend and frontend.
You can use authState.scope.includes("<role>") to check if someone has a role, but remember that someone can just edit this code in the browser. So any information available on pages stored in the frontend repository should not be sensitive. It's fine to simply display the page skeleton based on checking the authState, but don't show private data based on that.
Content
The content can be found primarily in ./src/content. There you can find many JSON files. JSON (JavaScript Object Notation) is a format that can easily be read by a machine. In the actual pages, we import these files and then read them, putting the text actually on the website.
The images can be found in ./src/images. Again, we import these images on pages using the getUrl function.
Images
Optimizing images
This script only works on Linux (or WSL).
Dependencies
- img-optimize - https://virtubox.github.io/img-optimize/ (optimize.sh)
- imagemagick - https://imagemagick.org/script/download.php (convert)
- jpegoptim
- optipng
- cwebp
The last 3 can be installed on Debian/Ubuntu using:
sudo apt install jpegoptim optipng webp
Once you've downloaded the first script, run the following script from the img-optimize main folder (be sure to replace <DSAV-Dodeka repository location> by the correct path):
#!/bin/bash
# Script by https://christitus.com/script-for-optimizing-images/ (Chris Titus)
# Modified by Tip ten Brink
FOLDER="<DSAV-Dodeka repository location>/src/images"
# resize png or jpg to either height or width, keeps proportions using imagemagick
find ${FOLDER} \( -iname '*.jpg' -o -iname '*.png' \) -exec convert \{} -verbose -resize 2400x\> \{} \;
find ${FOLDER} \( -iname '*.jpg' -o -iname '*.png' \) -exec convert \{} -verbose -resize x1300\> \{} \;
# Optimize.sh is the img-optimize script
./optimize.sh --std --path ${FOLDER}
We convert the images to a size of max 2400x1300, as higher resolutions don't make a big difference.
Backend development
Project structure
All the application code is inside the backend/src directory. This contains five separate packages, of which three act as libraries and two as applications.
- The datacontext library is fully standalone. It contains special logic for implementing dependency injection, which is useful for replacing database-reliant functions in tests while keeping good developer ergonomics. Ensure it doesn't import code from any other package!
- The store library is fully standalone and provides the primitives for communicating with the databases (both DB and KV). Ensure it doesn't import code from any other package!
- The auth library relies on both the datacontext and store libraries. It provides an application-agnostic implementation of all the authorization server logic. In an ideal world, the authorization server would be a separate application. To still stay as close to this as possible, we develop it as a separate library. However, the library does not know about HTTP or anything like that: the routes are implemented in our actual implementation, as are some things that rely on a specific schema.
- The schema package contains the definition of our database schema (in schema/model/model.py). It can be extracted during deployment and then used for applying migrations, hence it is also something of an application.
- The apiserver package is our actual FastAPI application. It relies on all four packages above. However, it also has some internal logic that is more "library"-like. Furthermore, to prevent circular imports among other things, there is a certain "dependency order" we want to keep (a simplified sketch follows after this list). It is as follows:
  - resources.py contains two variables that make it easier to get the specific path, specifically to import files in the resources folder.
  - define.py contains a number of constants that are unlikely to ever change and do not really depend on what environment the application is deployed in (whether it is development, staging, production, etc.). It also contains the logic for loading things that do depend on the 'general' environment, but not the 'local' environment. As a rule of thumb, something like a website URL will always be the same for an environment, but an IP address, a port or a password might differ.
  - env.py loads this local configuration, which includes things like passwords and where exactly to find the database.
  - Then we have the src/apiserver/lib module, which consists mostly of logic that does not load its own data. While it might cause side effects (like sending an email), it should always cause the same side effects for the same arguments (so it should not load data). In general, most functions and logic here should be pure. More importantly, they should not import anything from the src/apiserver/app module.
  - Next there is src/apiserver/data. This includes all the simple functions that perform a single action relating to external data (so the DB or KV). Mostly, these functions wrap store functions, but for a specific table or schema. The most important are the functions in data/api, i.e. the data "API", which is the way the rest of the application interacts with data. Inside data/context there are also context functions, which should call multiple data/api functions and other effectful code that you want to easily replace in tests (like generating something randomly). See data/context/__init__.py for more details.
  - Finally, we come to src/apiserver/app. This contains the most critical part, namely the routers, which define the actual API endpoints. Furthermore, there is the modules module, whose functions mostly wrap multiple context functions. See app/modules/__init__.py for more details.
  - Next, the app_... files define and instantiate the actual application, while dev.py is an entrypoint for running the program in development.
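Below is a deliberately simplified, hypothetical sketch of this layering; none of the function names exist in the codebase, it only shows the direction of the dependencies (a router calls into data and lib, never the other way around):
from fastapi import APIRouter

# apiserver/lib: pure logic, no data loading (hypothetical example)
def format_member_name(first_name: str, last_name: str) -> str:
    return f"{first_name} {last_name}".strip()

# apiserver/data/api: a thin wrapper around a store function for one table (hypothetical)
async def get_member_row(member_id: str) -> dict:
    # in the real code this would call a store function with the members table
    return {"first_name": "Jan", "last_name": "Jansen"}

# apiserver/app/routers: the actual endpoint, which only uses the layers above
router = APIRouter()

@router.get("/members/{member_id}/")
async def get_member(member_id: str) -> dict:
    row = await get_member_row(member_id)
    return {"name": format_member_name(row["first_name"], row["last_name"])}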
Other
Important to keep in mind
Always add a trailing "/" to endpoints.
Testing
We have a number of tests in the tests directory. To run them and check that you didn't break anything important, you can run poetry run pytest.
Static analysis and formatting
To improve code quality, readability and catch some simple bugs, we use a number of static analysis tools and a formatter. We use the following:
- mypy checks if our type hints check out. Run using poetry run mypy. This is the slowest of all the tools.
- ruff is a linter, so it checks for common mistakes, unused imports and other simple things. Run using poetry run ruff src tests actions. To automatically fix issues, add --fix.
- black is a formatter. It ensures we never have to discuss formatting mistakes, we just let the tool handle it for us. You can use poetry run black src tests actions to run it.
You can run all these tools at once using the Poe taskrunner, by running the following in the terminal:
poe check
Continuous Integration (CI)
Tests (including some additional tests that run against a live database) and all the above tools are all run in GitHub actions. If you open a Pull Request, these checks are run for every commit you push. If any fail, the "check" will fail, indicating that we should not merge.
VS Code settings
VS Code doesn't come included with all necessary/useful tools for developing a Python application. Therefore, be sure the following are installed:
- Python (which installs Pylance)
- Even Better TOML (for .toml file support)
You probably want to update .vscode/settings.json as follows:
{
"python.analysis.typeCheckingMode": "basic",
"files.associations": {
"*.toml.local": "toml"
},
"files.exclude": {
"**/__pycache__": true,
"**/.idea": true,
"**/.mypy_cache": true,
"**/.pytest_cache": true,
"**/.ruff_cache": true
}
}
This ensures that unnecessary files are not shown in the Explorer.
Routes
authpage
The authpage is the tiny React webpage that is used for logging in. It is served directly by the backend server and can be found in the backend repository.
Schema
Migrations
We can use Alembic for migrations, which allows you to programmatically apply large schema changes to your database.
First you need to have the Poetry environment running as described earlier and ensure the database is on as well.
- Navigate to the ./backend/src/schema directory.
Ensuring the database is in sync with the migrations
The migrations are stored in the schema/model/versions folder. First, make sure your database has the same schema as the latest revision.
- From there, run poetry run alembic revision --autogenerate -m "Some message".
- This will generate a Python file in the migrations/versions directory, which you can view to see if everything looks good. It basically looks at the database, looks at the schema described in db/model.py and generates code to migrate to the described schema.
- Then, you can run poetry run alembic upgrade head, which will apply the latest generated revision. If you now use your database viewer, the table will hopefully have appeared.
- If there is a mismatch with the current revision, use poetry run alembic stamp head before the above two commands.
Integrating the backend/frontend
The database is the only place you can securely store private information. Everything stored in the repositories can easily be accessed by anyone. In the future, we might want to make an easier way to store private content.
So, if you want to display some secret information from the backend on the frontend, you will need to load it using an HTTP request. To make this easier, we use two libraries, TanStack Query and ky.
A query
A "query" is basically an automatic function that, once the page loads, will load whatever function you ask it to and keep it up to date. It can be enabled based on whether or not someone is authenticated.
An example of of a query is (defined in src/functions/queries.ts
):
export const useAdminKlassementQuery = (
au: AuthUse,
rank_type: "points" | "training",
) =>
useQuery(
[`tr_klass_admin_${rank_type}`],
() => klassement_request(au, true, rank_type),
{
staleTime: longStaleTime,
cacheTime: longCacheTime,
enabled: au.authState.isAuthenticated,
},
);
Here, useQuery is a function from the TanStack Query library, while klassement_request is defined by us. It is important that the tr_klass_admin_${rank_type} key is unique, otherwise the caches will not work correctly.
Let's look at klassement_request (inside src/functions/api/klassementen.ts):
export const klassement_request = async (
auth: AuthUse,
is_admin: boolean,
rank_type: "points" | "training",
options?: Options,
): Promise<KlassementList> => {
  // ... omitted for brevity
let response = await back_request(
`${role}/classification/${rank_type}/`,
auth,
options,
);
const punt_klas: KlassementList = KlassementList.parse(response);
  // ... omitted for brevity
return punt_klas;
};
Here, back_request is a function we defined, which calls the backend using the ky library. Basically, this is just a simple GET request, which we then parse using zod (the .parse part). The backend will check the information in the auth part, returning the data if you have the right scope and are the right user.
How is the result of this query used? Let's see (in src/pages/Admin/components/Klassement.tsx):
const q = useAdminKlassementQuery({ authState, setAuthState }, typeName);
const pointsData = queryError(
q,
defaultTraining,
`Class ${typeName} Query Error`,
);
pointsData now simply contains the data you want. All the data loading happens in the background. queryError is also defined by us and ensures that any potential error is properly caught. If the data is still loading, it will display the default data (defaultTraining, in this case) instead. In the future we might want to display this in a nicer way in the UI, because right now it will first show the default data, before flickering and switching to the loaded data once it comes in.
Developing the deployment setup
Deployment
This section discusses deployment, so everything to do with getting the code actually live so that the application can actually be used.
Source
Building the scripts and containers
- Poetry
  - Once installed, run poetry install --sync inside the main directory. This will install the other requirements.
Deploy scripts
Building the deployment scripts is easy: simply run build_deploy.sh in the main directory.
Containers
The containers have dedicated GitHub Actions workflows to build them, so in general you should never have to build them locally. Take a look at the workflows to see how they are built.
Server
Our server is hosted by Hetzner, a German cloud provider. Our server is an unmanaged Linux Ubuntu virtual machine (VM). VM means that we do not have a full system to ourselves, but share it with other Hetzner customers. We have access to a limited number of cores, memory and storage.
It is unmanaged because we have full control over the operating system. We need to keep it up to date ourselves. The choice for Ubuntu was also made by us. It has no GUI, only a command line, so getting familiar with the Unix command line is very helpful. By default, it uses the bash shell.
Webportal
The webportal for the server can be accessed from https://console.hetzner.cloud. The account we use is studentenatletiek@av40.nl. You need 2FA to log in.
The most important things you can do from the portal are:
- See graphs of CPU, disk and network load, as well as memory usage
- Manage backups
- Access the root console
Connecting: SSH
To connect to the server, we use SSH. To be able to connect, you need to have an "SSH key" configured. To add one, you must first generate a private-public SSH keypair.
Then, the public part must be added to the /home/backend/.ssh/authorized_keys file. Note that this file requires some specific permissions, so if something is not working, check whether these are correct.
Currently, we have the following important SSH settings:
PermitRootLogin no
PasswordAuthentication no
This ensures you can only log in with an SSH key, not with a simple user password.
We only allow logins through the backend user (see the next section), so keep /root/.ssh/authorized_keys empty.
It is recommended to not add SSH keys through the web console, as these are not easily visible inside the authorized_keys file.
If you no longer have access to any keys, use the web console to log in as root, then switch to the backend user (su backend) and edit the authorized_keys file.
Connecting
Connecting is simple: just do ssh backend@<ip address>. If your default identity is not a key that has access, you might need to use the -i flag to select the right key on your client.
Once you have done this, you have access to the server as if it's your PC's own command line.
root vs backend user
To keep things safe, try to avoid using the root user as much as possible. Instead, use backend. You can use sudo to run privileged commands and, if necessary, log in as root using su root.
Keeping it up to date
To keep the server up to date, occasionally run:
sudo apt update
sudo apt upgrade
Occasionally, Ubuntu itself might also get an update. It is best to only update once there is a new LTS version.
Required packages
For running deployment scripts, three main tools must be installed: Poetry, Docker (including Docker Compose) and the GitHub CLI. Make sure these are occasionally updated.
Furthermore, the server also requires nginx as a reverse proxy and certbot for SSL certificates. We use Ubuntu's packaged nginx and we use a snap package for certbot.
File locations
Currently, mailcow is in /opt/mailcow-dockerized and the dodeka repository is in /home/backend/dodeka.
Environments
The dodeka repository builds Docker containers (the build/container directory) and builds the scripts for deploying them (build/deploy). Both are differentiated based on the "environment" or "mode" of deployment. We distinguish the following:
- 'production' mode
- 'staging' mode
- 'test' mode (not yet in use)
- 'localdev' mode
There is also 'envless', for when you run tests without actually spinning up the entire application.
The DB (database, PostgreSQL) and KV (key-value store, Redis) are designed to vary very little depending on their mode, accepting simple configuration options and allowing themselves to be wrapped by simple scripts to handle different modes.
The Server and Pages have more significant differences between modes.
In general, modes are pre-selected for deploy builds, but for container builds they are selected at build time (in CI). Furthermore, deploy builds can generally be run locally, while container builds are run in CI to ensure reproducibility.
Staging and production
For a complete setup including backend, first ensure the containers are built using GitHub Actions for the environment you want to deploy. Then, SSH into the cloud server you want to deploy it to.
The following tools are necessary, in addition to those assumed to be installed on a standard Linux server:
- Python 3.10 (matching the version requirements of the dodeka repository)
- Poetry (to install the dependencies in the dodeka repository)
- gh (GitHub CLI, logged into an account with access to the dodeka repository; optional if authenticated by other means)
- nu (a shell scripting language, see also how to install)
- tidploy (a tool built specifically for deploying this project)
- bws (Bitwarden Secrets Manager CLI)
The last three tools are all Rust projects, so they can be built from source using cargo install <tool name>. However, this can be very slow on the VM, so installing them as binaries using cargo binstall <tool name> is recommended (install binstall first using cargo install cargo-binstall).
To deploy, simply clone this repository and enter the main directory. Make sure you have updated the repository recently so you have the newest deploy script versions. Then, run tidploy auth deploy and enter the Bitwarden Secrets Manager access token.
Then, you can deploy using tidploy deploy production (or tidploy deploy staging for staging). You can also use the Nu shortcut:
nu deploy.nu create production
If you want to specify the specific Git tag (i.e. a release or commit hash) to deploy, use:
nu deploy.nu create staging v2.1.0-rc.3
That's it!
Shutdown
You can observe the active compose project using docker compose ls. Then you can shut it down by running (from the dodeka repository):
nu deploy.nu down production
If the suffix of the Docker Compose project name is different from latest, replace it with that. It will in general be equal to the tag you deployed with (which is 'latest' by default), but with periods replaced by underscores. For example, nu deploy.nu create staging v2.1.0-rc.3 can be shut down with nu deploy.nu down staging v2_1_0-rc_3.
Setting up the server from scratch
These are all the commands I used when I set up the server from a clean image on January 10, after we migrated the email to the new provider.
Preparing SSH
First we rebuild the image from Ubuntu 22.04.
We reset the root password using Rescue -> Reset root password. I recommend then changing it to a new password once inside again (through passwd root).
Login using the console GUI.
Go to /etc/ssh and change the SSH server settings in sshd_config:
PermitRootLogin no
PasswordAuthentication no
Then we create a new user (-m creates home directory, then we add them to sudo group):
useradd -m backend
adduser backend sudo
Then do passwd backend and set up a password.
Switch to the user using su backend.
The default shell might not be bash (for example if your prompt starts with only '$'), in that case run:
chsh --shell /bin/bash
Create a new directory: mkdir /home/backend/.ssh. Enter this directory (using cd) and then do nano authorized_keys to open/create a new file there.
Paste in your SSH public key (!) (something like ssh-ed25519 .... tiptenbrink@tipc) and save the file (Ctrl-X).
Then, ensure the file has the correct permissions:
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
SSH niceties
Install "xauth" (while logged in as root)
apt install xauth
Then, log in to backend (su backend) and log out again. If you're using a nice terminal emulator, like kitty, you need to add the xterm file to the server. To do this, prepend kitty +kitten to your ssh command one time, like:
kitty +kitten ssh backend@<ip here>
After that you can just login normally. Other terminal emulators might need other instructions.
From this point on we never need to be logged in as root anymore!
Update packages
sudo apt update
sudo apt upgrade
You might have to reboot after this: reboot.
Install basic C compiler
sudo apt install build-essential
Install Rust
Go to https://rustup.rs/
Run the listed script. Choose "Customize", then profile "minimal".
Install cargo binary install tools
cargo install cargo-quickinstall
cargo quickinstall cargo-binstall
Install nu, tidploy
cargo binstall nu
cargo binstall tidploy
Install GH CLI
Follow these instructions.
Login to DodekaComCom on GitHub
Using its password.
Setup gh
Login using gh auth login, then use a correctly scoped auth token that you got from the DodekaComCom account.
Add backend user to Docker group
sudo usermod -aG docker backend
Then logout and back in again.
Login to ghcr.io
docker login ghcr.io
Use another access token, this one only has to read the org and have access to packages.
Clone dodeka
gh repo clone dsav-dodeka/dodeka
Set tidploy auth key
Now, ensure all necessary secrets are accessible by the access token you're going to set. Then enter the dodeka repository and do:
tidploy auth bws
Then enter the access token.
Deploy
Next, run:
tidploy deploy -d deploy/use/production
This will start the backend and database.
Optional: restore database
In case you have all the files from the volume that contained the database, you want to restore these. First, get them to the server, for example using scp. We assume we have a folder called backup in our current directory that contains all the Postgres files. Ensure the database is down again (using docker compose -p dodeka-production-latest down).
Then you can do:
docker run --rm -it -v ./backup:/from -v dodeka-db-volume-production-latest:/to alpine ash
This will put you into a container with the recently created, empty database at /to and the backup at /from. First, clean out the new folder using rm -rf * while inside the /to folder (don't do this in the container root directory!).
Then, you can run:
cd /from ; cp -av . /to
Now, restart the database. Everything should work now.
Setup nginx
Install it:
sudo apt install nginx
Start it:
sudo systemctl start nginx
For some extra details also see the full section on nginx and certificates.
Setup non-HTTPS config
Go to /etc/nginx. Every file here is root-protected, so use sudo before each of the following commands.
Go to the sites-available subdirectory.
Do nano api.dsavdodeka.nl and paste the following basic config:
server {
root /var/www/api.dsavdodeka.nl/html;
index index.html index.htm index.nginx-debian.html;
server_name api.dsavdodeka.nl;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to displaying a 404.
proxy_pass http://localhost:4241;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
server {
listen 80;
}
Create a symlink from the available to enabled:
sudo ln -s /etc/nginx/sites-available/api.dsavdodeka.nl /etc/nginx/sites-enabled/api.dsavdodeka.nl
If necessary, restart nginx:
sudo systemctl restart nginx
If you go to http://api.dsavdodeka.nl (not https) you should get "Hallo: Atleten"!
Certbot/Let's Encrypt
Install snap
sudo apt install snapd
Install certbot
sudo snap install --classic certbot
Run certbot
sudo certbot --nginx
First, enter your e-mail. Then it will give you a list of domains you want to install the certificate for; choose the number indicating api.dsavdodeka.nl (probably 1).
Cleanup nginx config
You probably want to clean up your nginx config.
There might be a server block saying only:
server {
listen 80;
}
Delete this, the Certbot block should handle this now.
If necessary, restart nginx:
sudo systemctl restart nginx
Optional: install Python and Poetry
You can create database backups and migrate the database using Python. First, we need to install a Python version that has the same major version as the backend server requires.
To make it easier to install new versions in the future, let's use pyenv. I recommend not installing it using Homebrew, as that might interfere with some other core packages. Instead, use their install script and follow the instructions to put it into the path. These were, when last checked:
curl https://pyenv.run | bash
To add it to the path and load it automatically, add the following to .bashrc (in the server home folder):
export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
Then, install the correct Python version using:
pyenv install <exact version>
Note that it will install Python from source, so this could take a while. If there is an error, take a look at all required packages that must be installed.
Then, go into the project directory and run pyenv local <exact version>. Now, the Python version should be the correct one if you run python.
Next, we will install Poetry to manage our dependencies. I recommend using pipx (which you can just install using sudo apt install pipx), so: pipx install poetry.
Then, we want to make our Poetry environment use the correct version. Most likely, the Python version was installed into ~/.pyenv/versions/<version>/bin/python, so you can use (once you are in the backend/src directory):
poetry env use ~/.pyenv/versions/<version>/bin/python
Now, we can run commands in our environment using poetry run <command>.
Fin
That was it: with less than 300 lines of instructions you can completely set up a Linux server from scratch, in a simple and secure way.
nginx and SSL certificates
When we start the application, everything will simply be accessible from localhost. However, localhost is not accessible outside the server. Instead, we use a so-called "reverse proxy" to allow access to the application. This reverse proxy also ensures we can use TLS (which is what gives you that important https link, meaning all communication is encrypted).
nginx configuration
The configuration for nginx is stored in /etc/nginx/sites-available:
server {
root /var/www/api.dsavdodeka.nl/html;
index index.html index.htm index.nginx-debian.html;
server_name api.dsavdodeka.nl;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to displaying a 404.
proxy_pass http://localhost:4241;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
listen [::]:443 ssl ipv6only=on; # managed by Certbot
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/api.dsavdodeka.nl/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/api.dsavdodeka.nl/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
if ($host = api.dsavdodeka.nl) {
return 301 https://$host$request_uri;
} # managed by Certbot
listen 80;
listen [::]:80;
server_name api.dsavdodeka.nl;
return 404; # managed by Certbot
}
Everything that says "# managed by Certbot
" we can ignore. The important part is the location
block. It basically says to forward http://localhost:4241, which is the URL of our application, to the outside world at port 80 (and 443 for TLS). The server_name
is also important, because this tells nginx to only forward it if these requests come from api.dsavdodeka.nl
. The rest is all unchanged from the defaults.
The file in sites-available is not the one actually read; instead, nginx reads from sites-enabled. The latter is a symlink to the former, so it automatically stays up to date. This can be achieved using a command like:
sudo ln -s /etc/nginx/sites-available/api.dsavdodeka.nl /etc/nginx/sites-enabled/api.dsavdodeka.nl
Certbot (Let's Encrypt)
TLS certificates are necessary to have an encrypted https URL. These are given out by special providers. We use the Certbot tool to get one from Let's Encrypt. First, the site must be available at a normal http link on port 80. Then, after running Certbot (see the instructions above), it makes sure you can use https and modifies the nginx configuration to make that happen.
Secrets
Passwords
- Hetzner account studentenatletiek@av40.nl
- root user password of the server
- backend user password of the server
- Everything in dodekasecrets (Postgres password, Redis password, server secret) and the passphrase used to encrypt these secrets
- Symmetric encryption key for refresh tokens and asymmetric signing key for access and ID tokens (stored in the database)
Database
Upgrade Postgres major version
First, prepare the new image with the new version of Postgres. Then, get both containers to run on the same machine.
For this, deploy using use/repl, ensuring that all the settings are what you want the new database to have. The password can be set at every start, so that doesn't matter. Later, we will change the volume name before running it the normal way.
Next, log into the old database, e.g. using docker exec -it -w /dodeka-db d-dodeka-db-1 /bin/bash. Here 'd-dodeka-db-1' is the container name of the old database and '-w /dodeka-db' means we enter the main DB directory.
Then, use pg_dumpall, where '3141' is the local port it's running on and 'dodeka' is the main db user:
pg_dumpall -p 3141 -U dodeka > ./upgrade_dump.sql
Ensure that after this dump is made, no more changes are made to the db, as these will be lost. Best is to shut off any external access (e.g. by shutting down the webserver for a short moment).
Now, you must get the file upgrade_dump.sql to the new database, which you should already have initialized.
If you leave the container, you can use docker cp, for example like this:
docker cp d-dodeka-db-1:/dodeka-db/upgrade_dump.sql ./upgrade_dump.sql
Here we again assume the old container is named 'd-dodeka-db-1', the dump file path is '/dodeka-db/upgrade_dump.sql' inside the container and you want to copy it to your local folder. Now, we copy it to the new container (immediately deleting the file locally):
docker cp ./upgrade_dump.sql d-dodeka_repl-db-1:/dodeka-db/upgrade_dump.sql && rm ./upgrade_dump.sql
Notice the other container's name is 'd-dodeka_repl-db-1'. We must now again enter the new database and restore the data.
We do this using the following command (once we have entered with docker exec -it -w /dodeka-db d-dodeka_repl-db-1 /bin/bash):
psql -p 3141 -U dodeka -d postgres -f ./upgrade_dump.sql
Note that this will also restore the passwords as previously set, so login to check if everything is there with the previous password.
Finally, we need to replace the old container's Docker volume by the new one. There are no great solutions for this. First, delete the old volume and recreate it with Docker Compose, like:
docker volume rm d-dodeka-db-volume-production
You want to recreate the volume with Compose, because it will give a warning if done manually. A way to do this is to simply run deploy.sh on the database, which will automatically create a volume if none existed.
# Copies from your new db's volume to the old one using a temporary container
docker run --rm -it -v d-dodeka_repl-db-volume-production:/from -v d-dodeka-db-volume-production:/to alpine ash -c "cd /from ; cp -av . /to"
# Since our old volume name now contains the new one's data, we can delete the new one and reuse the old one
docker volume rm d-dodeka_repl-db-volume-production
Now, ensure your old deployment uses the new image and restart it. Everything should work then.
Recap
# Update dodeka repo on server
# Deploy repl database
cd use/repl
./repldeploy.sh
# Turn off database access (shut down apiserver)
# Enter database
docker exec -it -w /dodeka-db d-dodeka-db-1 /bin/bash
# Dump all
pg_dumpall -p 3141 -U dodeka > upgrade_dump.sql
# Copy from old database to local
docker cp d-dodeka-db-1:/dodeka-db/upgrade_dump.sql ./upgrade_dump.sql
# Copy from local to repl database
docker cp ./upgrade_dump.sql d-dodeka_repl-db-1:/dodeka-db/upgrade_dump.sql && rm ./upgrade_dump.sql
# Enter repl database
docker exec -it -w /dodeka-db d-dodeka_repl-db-1 /bin/bash
# Restore database
psql -p 3141 -U dodeka -d postgres -f ./upgrade_dump.sql
# Delete dump file
rm ./upgrade_dump.sql
Migrate database schema
First, log in to the production server and enter the dodeka repository.
The schema is then in the backend/src/schema directory.
To load env var, use:
read -s -p $'Enter POSTGRES_PASSWORD:\n' POSTGRES_PASSWORD
Then, be sure you have a GitHub token ready from an account with access to the backend repository, with at least read:org and repo scope (preferably the DodekaComCom account).
Then, from the main repo directory, run:
./use/data_sync/migrate_env.sh
This will prompt you for the token. Paste it in and press enter. A new migrate directory will have appeared, which has been copied from the backend repository (the src/schema package, to be precise).
Now, run alembic:
poetry run alembic revision --autogenerate -m "<Some migration message>"
For troubleshooting, refer to the backend setup docs.