Introduction
This is the central place for all information about the website and the .ComCom.
Do you want to know something about using the website? Click here.
Do you want to make a suggestion for the website or ask the .ComCom something (for example, to have us change a piece of content)? Click here.
Do you want to know how the .ComCom works, or what you can learn at the .ComCom? Click here.
The rest of this document is dedicated to explaining technical details, the choices we made, how to set up a development environment, and how to deploy to production (and what that even means). We use English for the sake of future compatibility and so that it is easy to share anything written here online (and because many terms don't have Dutch equivalents).
Changelog
Here we keep track of all changes. This specifically concerns code, not content. The frontend is updated more often than shown here (and has no specific versions), but the important changes are summarized below.
- Frontend (0232c32) - 2024-04-05
- 2.2.1 - 2024-01-10
- 2.2.0 - 2024-01-10
- 2.1.1 - 2023-12-05
- 2.1.0 - 2023-12-05
- 2.0.1 - 2023-10-25
- 2.0.0 - 2023-10-17
- 1.1.0 - 2023-03-18
- 1.0.0 - 2023-01-11
- Pre-1.0.0
Frontend (0232c32) - 2024-04-05
Changed (frontend)
- (February/March) Jesper has updated a lot of content, including most of the commission photos
- (March) Theme has been updated from winter to spring
- (March) Records have been updated
Added (frontend)
- A new "Pas aan" feature has been added for admins to manage the klassementen. This makes use of the 2.2 backend update.
- It allows you to start new classifications
- You can select classifications and modify their start, end and freeze date
- You can make the database recompute the points based on whether or not you want to show the points added after freezing
- Members can see when an event was last added and when it was frozen
- (February) An Oud Leden Dodeka (OLD) page has been added
2.2.1 - 2024-01-10
Tip contributed code to this release.
Released into production on 2024-01-11.
Deploy
- Fix production image version
2.2.0 - 2024-01-10
Note: this version was not released into production.
This is the first full tidploy release on the backend, with the mailserver moved to a dedicated provider. This should make all required features for the classifications fully functional.
Senne and Tip contributed code to this release. We are now at 591 commits for the backend (including the 150+ commits that came from the deployment repository).
Added (backend)
- <server>/admin/class/get_meta/{recent_number}/ (this returns the most recent classifications of each type, recent_number/2 for each type)
- <server>/admin/class/new/ (creates a new points and training classification)
- <server>/admin/class/modify/ (modifies a classification)
- <server>/admin/class/remove/{class_id} (removes the classification with id class_id)
- <server>/member/get_with_info/{rank_type}/ (will replace <server>/members/class/get/{rank_type}/, but is added under a different name to avoid a breaking change; this also returns last_updated and whether the classification is frozen)
- The last_updated column is now updated when new events are added.
- (Senne) <server>/admin/update/training has been added (not yet in use by the frontend) as the start of the training registration update.
- New update_by_unique store method for updating an entire row based on a simple where condition.
Internal (backend)
- The backend and deployment repository have been merged together, making deployment and versioning a lot easier. We now use a set of Nu scripts instead of shell scripts, and deployment now uses tidploy. This was done in a series of commits from 4602abd to around 414bc23.
- Logging has been much improved throughout the backend, so errors can now be properly seen. Error handling has also been improved (especially for the store module). Some common startup errors now have better messages that indicate what's wrong. Documentation has been improved and all database queries now properly catch some specific errors. See #31.
2.1.1 - 2023-12-05
Tip contributed code to this release.
Released into production on 2023-12-05.
Fixed (backend)
- Remove debug eduinstitution value during registration
2.1.0 - 2023-12-05
Note: this version was not released into production.
Matthijs and Tip contributed code to this release.
Stats
We are now at 359 commits on the backend and 922 commits on the frontend.
Added (frontend)
- An explanation on the classification has been added to the classification page.
- There is now an NSK Meerkamp role.
Fixed (frontend)
- Last names are no longer all caps on the classification page and are shown in full.
Added (authpage)
- Add Dodeka logo to all pages, using the new Title component.
Changed (authpage)
- Use a flex layout to improve alignment when pressing "forgot password", so the layout no longer jumps slightly
- Update node version to v20
- Update dependencies
Fixed (authpage)
- Educational institution is now recorded if not changed from default (TU Delft)
Added (backend)
- Admin: Synchronization of the total points per user and events, as well as a more consistent naming scheme for the endpoints. All old endpoints are retained for backwards compatibility. Furthermore, admins can now request additional information about events on a user, event or class basis (see the PR).
- <server>/admin/class/sync/ (Force synchronization of the total points per user, without having to add a new event)
- <server>/admin/class/update/ (Same as the previous <server>/admin/ranking/update, which still exists for backwards compatibility)
- <server>/admin/class/get/{rank_type}/ (Same as the previous <server>/admin/classification/{rank_type}/, which still exists for backwards compatibility)
- <server>/admin/class/events/user/{user_id}/ (Get all events for a specific user_id, with the class_id and rank_type as query parameters)
- <server>/admin/class/events/all/ (Get all events, with the specific class_id and rank_type as query parameters)
- <server>/admin/class/users/event/{event_id}/ (Get all users for a specific event_id)
- Member: Only renames, as described above.
- <server>/members/class/get/{rank_type}/ (Same as the previous <server>/members/classification/{rank_type}/, which still exists for backwards compatibility)
- <server>/members/profile/ (Same as the previous <server>/res/profile, which still exists for backwards compatibility)
Changed (backend)
- Types: The entire backend now passes mypy's type checker (see the PR)!
- Better context/dependency injection: The previous system was not perfect and it was still not easy to write tests. Lots of improvements have been made, utilizing FastAPI Depends and making it possible to easily wrap a single function call to make the caller testable. See #64, #65, #70 and #71.
- Better logging: Logging had been lackluster while waiting for a better solution. This has now arrived with the adoption of loguru. Logging is now much more nicely formatted and it will be easily possible in the future to collect and show the logs in a central place, although that is not yet implemented. Some of the startup code has also been refactored as part of the logging effort.
- Check for role on router basis: For certain routers, we now check whether they are requested by admins or members for all routes inside the router, making it harder to forget to add a check. The header checking logic has also been refactored and some tests have been added. This is much better than the manual if check we did before. This also includes some minor refactoring and fixes for access token verification.
- There are now different router tags, which makes it easier to find all the different API endpoints in the OpenAPI docs view.
Fixed (backend)
- An error is no longer thrown on the backend when a password reset is requested for a user that does not exist.
Internal (backend)
- Live query tests: in the GitHub Actions CI we now actually run some tests against a live database using Actions service containers. This means we can be much more sure that we did not completely break database functionality after passing the tests (see the PR).
- Add request_id to logger using loguru's contextualize
- Added logging to all major user flows (signup, onboard, change email/password), also allowing the display of reset URLs etc., so email doesn't have to be turned on during local development
2.0.1 - 2023-10-25
Tip contributed code to this release.
Released into production on 2023-10-25.
Fixed (backend)
- Fix update email: If you requested an email change twice, but only confirmed after both had been sent, it is no longer possible to change it twice. After changing it using either one, the other is invalidated.
- Changed package structure so it is possible to extract the schema package and load it on the production server to run database schema migrations.
2.0.0 - 2023-10-17
Note: this version was not released into production.
Leander, Matthijs and Tip contributed code to this release.
Added (backend)
- Admin: Roles, using OAuth scope mechanism, as well as classifications stored in the database, computed based on each event.
- <server>/admin/scopes/all/ (Get scopes for all users)
- <server>/admin/scopes/add/ (Add a scope for a user)
- <server>/admin/scopes/remove/ (Remove a scope for a user)
- <server>/admin/users/ids/ (Get all user ids)
- <server>/admin/users/names/ (Get all user names, to match for rankings)
- <server>/admin/ranking/update (Add an event for classifications)
- <server>/admin/classification/{rank_type}/ (See current points for a specific classification)
- Member:
- <server>/members/classification/{rank_type}/ (See current points for a specific classification, changes hidden after a certain point)
Added (frontend)
- Member -> Classification page
- Admin -> Classification page and Add new event
- Roles can be changed in user overview
Changed (backend)
- Major refactor of backend code, which separates auth code from app-specific code
- Updated some major dependencies, including Pydantic to v2
- Database schema update:
- classifications table added to store a classification, which lasts for half a year and can be either "points" or "training".
- class_events table added, which stores all events that have been held (borrel, NSK, training, ...). Possibly related to a specific classification.
- class_event_points table added, which stores how many points a specific user has received for a specific event. In general, users will have the same amount of points per event, but this flexibility allows us to change that later.
- class_points table added, which stores the aggregated total points of a user for a specific classification. When an event is added, this table should be updated using the correct update query.
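As a rough illustration of how these four tables relate, here is a SQLAlchemy Core sketch. It is hypothetical: the actual definitions live in the backend's schema package, and the real column names, types and constraints differ.

```python
# Illustrative sketch of the classification schema; not the real table definitions.
from sqlalchemy import MetaData, Table, Column, Integer, String, Date, ForeignKey

metadata = MetaData()

# A classification lasts about half a year and is either "points" or "training".
classifications = Table(
    "classifications", metadata,
    Column("id", Integer, primary_key=True),
    Column("type", String),  # "points" or "training"
    Column("start_date", Date),
    Column("end_date", Date),
)

# Every event that has been held (borrel, NSK, training, ...), possibly tied to a classification.
class_events = Table(
    "class_events", metadata,
    Column("id", Integer, primary_key=True),
    Column("classification_id", ForeignKey("classifications.id"), nullable=True),
    Column("description", String),
    Column("event_date", Date),
)

# Points a specific user received for a specific event.
class_event_points = Table(
    "class_event_points", metadata,
    Column("event_id", ForeignKey("class_events.id"), primary_key=True),
    Column("user_id", Integer, primary_key=True),
    Column("points", Integer),
)

# Aggregated total per user per classification, updated whenever an event is added.
class_points = Table(
    "class_points", metadata,
    Column("classification_id", ForeignKey("classifications.id"), primary_key=True),
    Column("user_id", Integer, primary_key=True),
    Column("total_points", Integer),
)
```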
Changed (postgres)
- Updated from Postgres 14 to 16
Changed (redis)
- Updated from Redis 6.2 to 7.2
1.1.0 - 2023-03-18
Released into production.
Changed (backend)
- Update dependencies, including updating Python to 3.11 and SQLAlchemy 2
Fixed (server)
- Docker container no longer accumulates core dumps, crashing the server after 1-2 weeks
1.0.0 - 2023-01-11
Initial release of the FastAPI backend server, PostgreSQL database and Redis key-value store. Released into production on 2023-01-11.
Added (backend)
- Login: Mostly OAuth 2.1-compliant authentication/authorization system, using the authorization code flow. Authentication is done using OPAQUE:
- <server>/oauth/authorize/ (Authorization Endpoint, initialize)
- <server>/oauth/callback/ (Authorization Endpoint, after authentication)
- <server>/oauth/token/ (Token Endpoint)
- <server>/login/start/ (Start OPAQUE password authentication)
- <server>/login/finish/ (Finish OPAQUE authentication)
- Registration: Registration/onboarding flow, which requires confirmation of AV'40 signup.
- <server>/onboard/signup/ (Initiate signup, board will confirm)
- <server>/onboard/email/ (Confirm email)
- <server>/onboard/confirm/ (Board confirms signup)
- <server>/onboard/register/ (Start OPAQUE registration)
- <server>/onboard/finish/ (Finish OPAQUE registration and send registration info)
- Update: Some information needs to be updated or changed.
- <server>/update/password/reset/ (Initiate password reset)
- <server>/update/password/start/ (Start OPAQUE set new password)
- <server>/update/password/finish/ (Finish OPAQUE set new password)
- <server>/update/email/send/ (Initiate email change)
- <server>/update/email/check/ (Finish email change after authentication)
- <server>/update/delete/url/ (Delete account start, create url)
- <server>/update/delete/check/ (Confirm deletion after authentication)
- Admin: Get information only with admin account.
- <server>/admin/users/ (Get all user data)
- Members: Get information for members.
- <server>/members/birthdays/ (Get member birthdays)
Added (authpage)
- Login page
- Registration page
- Other confirmation pages necessary for backend functionality
Added (frontend)
- Profile page
- Leden page -> Verjaardagen
- Admin page -> Ledenoverzicht
- Use React Query for getting data from backend
- Use Context from React for authentication state
- Redirect pages necessary for OAuth
- Confirmation pages for info update
Added (server)
- Docker container that contains the FastAPI backend server, as well as static files for serving the authpage.
Added (postgres)
- Docker container with PostgreSQL server
Added (redis)
- Docker container with Redis server
Pre-1.0.0
The frontend went live in June 2021 and, before the release of the backend, was regularly updated on a rolling release schedule. The frontend is not versioned.
How do you use the website?
Are you a board member and do you want to know something about the admin functionality? Click here.
Are you a member of D.S.A.V. Dodeka and do you want to know more about what you can do on the website? Then have a look at the following pages:
Are you a member of a commission and do you want to know how you can use the website? Then you can check the following topics. If what you're looking for isn't listed, contact us. A lot is possible!
Creating an account
Creating an account starts at the Word lid! page on the website.
Press the "Schrijf je in!" button. This opens a screen where you can pass some basic information on to us. You can only sign up if you agree to the privacy policy, because we have to store information when you sign up with Dodeka.
After filling in your details, press "Schrijf je in via AV'40". AV'40 is our parent association. When you become a member of Dodeka, you also become a member of AV'40. The official member administration also runs through AV'40, which is why you are redirected to their registration page.
It is important that you tick the option "Ik wil lid worden van DSAV Dodeka (de studentenatletiekvereniging van AV'40 Delft)" at the bottom.
After signing up, you have probably already received an e-mail from comcom@dsavdodeka.nl asking you to confirm your e-mail address.
Click that button to let our system know that your e-mail address is correct. After this you will have to wait a bit, because the board has to link your application on our website to the member administration of AV'40. Once they have done this, you will receive a message asking you to officially register on our website. It will be sent to the e-mail address you provided on our website (so not the one at AV'40).
Registering
In the e-mail you will find a link to officially register on our website. This is required if you want to train.
You can give permission to show your birthday and age to other members. Who knows, you might receive a whole lot of congratulations!
After pressing "Registreer" you will be redirected to the website, where you can log in by clicking the icon in the top right.
Nice pages
Klassementen
Records
Commission e-mail
We recently switched to a new provider, https://junda.nl.
Change your password
Via webmail.dsavdodeka.nl you can change the e-mail password. We strongly recommend changing the password given to you by the .ComCom! You can do this under "Edit Your Settings -> Password & Security". Once you have changed it, you have to go through the instructions below again (so you have to set it up again both under 'Send mail as' and 'Check mail from other accounts').
Step-by-step plan
Does your commission already use an @dsavdodeka.nl address?
Do you use it via Gmail?
Request the new password by sending the .ComCom a message. Set it up again, see the instructions here.
Do you use it via mail.dsavdodeka.nl?
Request the new password by sending the .ComCom a message. Then go to webmail.dsavdodeka.nl, which is the new place where you can always reach your mail! We recommend using it via Gmail, which generally works a bit more smoothly. To do so, create a Gmail account (e.g. dies.dodeka@gmail.com) and follow the instructions below.
Are you currently using a Gmail account?
Then it is time to switch to an @dsavdodeka.nl address! You can keep using the same Gmail inbox.
Open Gmail on your PC or laptop.
- Click the gear icon (Settings) in the top right, next to your profile picture.
- Click "See all settings".
- Choose the "Accounts and Import" tab. This may have changed; in any case you want to find two things: "Send mail as:" and "Check mail from other accounts" (or something similar).
First we change "Send mail as":
Send mail as:
- Click "Add another email address"
- Choose a name. Fill in the correct @dsavdodeka.nl address, for example dies@dsavdodeka.nl. This must be the address you received from the .ComCom. "Treat as an alias" can stay checked.
- For the SMTP server, fill in mail.dsavdodeka.nl with port 465.
- Username: again your e-mail address, so e.g. dies@dsavdodeka.nl.
- Password: fill in the password you received from the .ComCom (this may be different from your Gmail password)
- Save changes. (Don't change anything else)
- After this there may be two options under "Send mail as". Click "make default" next to your new @dsavdodeka.nl address (e.g. dies@dsavdodeka.nl).
- Change (if it wasn't already set that way) "When replying to a message" to "Always reply from my default address (currently dies@dsavdodeka.nl)"
Now we make sure you also receive everything in your Gmail.
Check mail from other accounts:
- Click "Add a mail account"
- Then "Import emails from my other account (POP3)"
- Username: your full e-mail address (e.g. dies@dsavdodeka.nl)
- Password: fill in the password you received from the .ComCom (this may be different from your Gmail password)
- For the POP server, fill in mail.dsavdodeka.nl with port 995.
- Tick "Leave a copy of retrieved messages on the server"
- Tick "Always use a secure connection (SSL) when retrieving mail" (this is important!)
- -> Add Account
Test whether it works by sending an e-mail from your @dsavdodeka.nl address to another address, and by sending one to your @dsavdodeka.nl account. If it doesn't work, contact the .ComCom.
Board
A number of administrative tasks can be handled via the website. The most important ones are:
The records cannot be changed via the admin tool yet. We are still working on that. Contact us to have them changed manually.
Accepting a new member
Updating the classification
Roles
Questions and suggestions
.ComCom
Commission
The current commission is:
- Matthijs Arnoldus
- Tip ten Brink
- Liam Timmerman
- Jesper van der Marel
- Tijmen Hoedjes (QQ B6)
History
Former members:
- Laura Geurtsen
- Donne Gerlich (QQ B2)
- Nathan Douenburg
- Aniek Sips (QQ B3)
- Pien Abbink
- Jefry el Bhwash (o.a. QQ B4)
- Senne Drent
- Leander Bindt
- Sanne van Beek (QQ B5)
The commission was founded in board year 2 (2020/2021) by Matthijs, Jefry, Laura and Donne. With Nathan as designer, they put together a very nice website in a short time, which went live in 2021. That year the first 24-hour meeting was also held. The website was mainly a source of information and a showcase for the association, and of course the home of the Spike. The .ComCom was also responsible for e-mail.
The website was static, meaning there was no server that could store data, so you could not log in either. Tip became a member of Dodeka, and immediately of the .ComCom, in 2021/2022 and got to work on building a "backend" that would make logging in possible. Pien also joined and helped build even more nice pages.
This project went live in January 2023. That year we also got a new member, Leander, who built new features for the backend. Meanwhile, Matthijs and Pien were working hard to maintain the content, which became an ever bigger task.
2023/2024 saw many members come and go, while the content was maintained better than ever by, among others, Jesper. Behind the scenes, work was done on training registration, while, for example, a system for the classifications also went live. Jefry, a member from the very beginning, unfortunately said goodbye. With the knowledge he had gained, he managed to become chair of the Lustrumcommissie.
What can you do and learn?
There is an enormous amount to do at the .ComCom, and therefore of course also an enormous amount to learn. The tasks can be split into the following four main areas:
- Design: the website obviously has to look good. But that is not all, because the website also has to be user-friendly. The user interface (UI), i.e. how the user interacts with the website, and the user experience (UX), i.e. what the user experiences, also have to be excellent. UI/UX design is therefore perhaps even more important, and there is an enormous amount to learn about it.
- Content: besides news, the website contains important information about, among other things, how to become a member and what the training times are, as well as information about current and upcoming competitions and activities. This has to be updated constantly. The website is a showcase for the association and plays an important role in attracting new members. That makes the "content" (the text and images) very important. You will become an expert in digging through the FOCUS archives and writing short pieces of text.
- Programming: a large part of the work done at the .ComCom is indeed simply programming. It is quite different from what you may be used to in your university courses, though. It is not Python plots or algorithms in Java; it is about building and maintaining a large application (by now 3+ years old) with multiple components. We go into more detail below.
- System administration: not the sexiest job, but very important. We now have a login system, so we store e-mail addresses, birth dates and other private information. The systems also must not simply go down, and we have to make the right choices between server providers.
Programming
Programming, coding, developing: there are many words for what we do. You could even call it software engineering or architecture. We build all of the following:
- A website, the "frontend" (JavaScript, HTML, CSS), built with the React framework. Besides static pages, there are more and more dynamic pages that fetch up-to-date information from a server, which then has to be manipulated. We also have to keep track of whether you are logged in and which parts of the website you are allowed to see. The board has to be able to see a table with information about all members and edit it. When data changes on the server, you should see that immediately.
- A server that is accessible via an API, the "backend" (written in Python with FastAPI), which responds to requests from the website. It stores all passwords in a form that we cannot read and can prove that you really are who you say you are. This requires cryptography. The backend also has to be able to fetch information from a database, for which we use SQL. How we log in and check whether someone has access (authentication and authorization) follows a standard called OAuth 2.
System administration
- Scripts and other tools to run everything, such as Dockerfiles and Docker Compose files that describe how we start the databases (PostgreSQL and Redis).
- Managing the e-mail provider (we currently run this ourselves on our server).
- Managing the server itself (Ubuntu Linux), performing updates, keeping access secure.
Overview
The technical information is divided into four parts:
- Setup: which only teaches how to get everything running on your local machine, so you can dive into the code and start developing right away.
- Architecture: a deeper look at the architecture of the entire application, detailing why we made certain decisions and explaining things at a higher level than you would find by looking at the comments and documentation in the source code.
- Developing: details what kind of things you need to modify if you want to make changes. Contains tips on how to easily do actual 'development', what files matter most, among other things.
- Deployment: once you have developed something, it needs to actually go live. This section details all the steps you have to go through to deploy the code into the real world, how to administer the servers and related tasks.
Definitely look at Setup, and also at Deployment if you actually do an update. Otherwise, Developing should be what you look at next. Architecture is only necessary when you want to make big changes or understand things better.
Setup
Here you can find information on how to set everything up (the frontend, the backend and the databases), so you can start developing right away.
No matter what you do, you'll need to install Git, so check out the instructions for that.
If you are only doing things on the frontend, all you need to know is how to set up the frontend.
If you are developing the backend, you will probably also want to test things on the frontend, but you will definitely need to first set up the databases locally using a tool called Docker, after which you can set up the backend application itself.
Git
To share code with each other, we use a program called Git. We use Git to download the "online" version of the source code (which is hosted on GitHub) so that we can work on it locally on our own machines. We also use Git to upload it back to the server.
For an in-depth guide to Git, check out the Pro Git book. For an overview of the most important commands, check out GitLab's cheat sheet and this other one. Want to perform some specific action? Check out Git Flight Rules. Also, ChatGPT is pretty good at Git nowadays.
You can also use a GUI client instead of the command line (although I nowadays recommend against that), such as the one integrated into Visual Studio Code (with an extension such as GitLens), GitKraken or GitHub Desktop.
But first, you'll need to install Git itself. On Windows, use the link at the top here (the 64-bit standalone installer). On Linux, use your system package manager (instructions here). On macOS, use Homebrew.
For the Windows installer, I recommend against adding the links to your context menu (so disable Windows Explorer integration). Git Bash Profile for Windows Terminal can be useful (if you've installed it). As default editor I recommend something like Notepad++ or Notepad. For the rest the default options should be okay. Be sure to use the recommended option for "Adjusting your PATH environment".
GUI
If you're using a GUI program (so GitKraken, VS Code or GitHub Desktop), use their documentation to login to GitHub (make sure you have an account). Go to the frontend or backend for further instructions.
Command line (recommended)
Why do I recommend using the command line? Because a lot of developer tools work with it exclusively. Because command-line tools are simple to develop, they usually have the most features and offer you the most control. At the same time, they also usually allow you to make more mistakes and can have a steeper learning curve, although ChatGPT has made things a lot easier nowadays.
To be able to download and upload repositories, you'll still need to log in to GitHub. For that, I recommend using the gh CLI. For Windows, I recommend just using the installer. The current version (as of December 2024) can be found here. To find newer versions, go to the releases page and download what's most similar to "GitHub CLI 2.63.0 windows amd64 installer" (amd64 means x86-64; if you're using an ARM chip you should install using WinGet, see here). For general installation instructions, see here.
If you haven't already, I recommend installing "Windows Terminal", which is much, much better than the standard Command Prompt. You can find it on the Microsoft Store (if the link doesn't work, just search for Windows Terminal). You might also want to install PowerShell 7.
Once you have gh installed (you might need to restart your terminal), run gh auth login and follow the steps to log in to your GitHub account (make sure you have one!). Finally, go to the frontend or backend for further instructions.
Frontend setup
For more information on developing the frontend, see the section on developing the frontend.
Setting up the frontend is the easiest. The frontend is entirely developed and deployed from the DSAV-Dodeka/DSAV-Dodeka.github.io repository. Look at the instructions for Git if you haven't already.
Steps
- Open the command line (PowerShell/Windows Terminal/Terminal) in a folder of your choice, where you want the code to be installed
- Run git clone https://github.com/DSAV-Dodeka/DSAV-Dodeka.github.io.git dodekafrontend --filter=blob:none, this creates a folder called "dodekafrontend" with all the code
- Install a code editor (also called an IDE, Integrated Development Environment), VS Code is recommended
- Install npm from NodeJS, this is used to install and run the website locally.
Windows-specific
- Make sure WinGet is installed (run winget in the terminal and see if it shows an error). See the instructions below if it is not installed.
- Run the commands from the NodeJS website to install npm/node
- TODO finish
Other
Steps, explained in detail
Download the code
The first step is to "clone" (download) the repository to your computer. Because we store all our images inside the repository, the full history contains a lot of large images. In the past, we didn't properly optimize them so sometimes there were multiple versions of very large images. Thankfully, you don't have to download the full history. Instead, when cloning, run the following command (run it in some folder where you want to store it, the result will be a folder called 'dodekafrontend'):
git clone https://github.com/DSAV-Dodeka/DSAV-Dodeka.github.io.git dodekafrontend --filter=blob:none
The --filter=blob:none option executes a "partial clone", in which all blobs (so the actual file contents) of old commits are not downloaded. They are only downloaded once you actually switch to a commit. This might cause some occasional issues with GUI clients, so check out a branch first with the command line (git checkout <branch name>).
Node/npm
The next step is to install npm, which is the standard package manager for JavaScript. We use it to download all our dependencies. npm is included when you install NodeJS, which is a JavaScript runtime (a program that runs JavaScript code) based on the same internal engine as Google Chrome. We use that runtime to develop and build our project.
Download and install it from the NodeJS website. I recommend picking an LTS version (so v22 right now). v20 is also fine if you still have that.
Note, if you don't care about the installation taking a while or are not comfortable with the command line, you can use the installer instead. Otherwise, I highly recommend (if you're on Windows), to use the 'fnm' package manager option. To use the instructions, you should use a PowerShell terminal (best to use from Windows Terminal, download it from the Microsoft Store). You also need to install WinGet, which you can also get from the Microsoft Store through the App Installer (it's probably already installed, be sure to NOT download "APK Installer" or other non-Microsoft apps! See if it exists in the Microsoft Store, otherwise you can ignore the step. See here for details about the App Installer and here for WinGet).
If you're not on Windows, I recommend using the nvm option from the NodeJS website. Otherwise, you could also use pnpm instead.
Next, open the command line in the root folder of the project. This can easily be done by opening it in an IDE (integrated development environment, I recommend using VS Code) and then opening a terminal there. Then, to install all dependencies, run:
npm install
Running the website
Once this is done, we can actually run the website using the command:
npm run dev
Under the hood, this will use Vite to bundle and build our project into actual HTML, CSS and JavaScript that our browser can run. The command will start a dev server, which will allow you to access the website in your browser using something like localhost:3000 (the port, 3000 in this case, might be different).
Docker/container setup
We use the deploy folder in the dodeka repository for the setup of the relational SQL database and the key-value store (a special database that is not relational, it's basically a big dictionary/map). For this, we use a technology called containers. Specifically, we use Docker.
In order to run the scripts, there are a few requirements. If you're on macOS or Linux, you only need to install Docker Engine as both are Unix-like systems. But you can also install Docker Desktop.
First of all, you need to have a Unix-like command line with a bash-compatible shell: i.e. Linux or macOS. See the next section for instructions on Windows Subsystem for Linux (WSL), which allows you to install Linux inside Windows.
Then, you need a number of tools installed, which you can install from the links below if you're not on Windows:
If you're on Windows, installing Docker Desktop after you've installed WSL will make these available inside WSL if Docker Desktop is running.
WSL
If you're on Windows, the OS simply has too many differences to be able to run Linux containers directly. It therefore needs an additional virtualization layer. Thankfully, there now exists a technology called WSL (Windows Subsystem for Linux).
You can find installation instructions here. I recommend installing either Ubuntu or Debian (Ubuntu is based on Debian, and all our containers run on Debian), which are two 'flavors/distributions' of Linux.
For a better experience with the command line, I recommend getting the Terminal application (not the built-in Command Prompt), which you can install from the Microsoft Store.
Local development
To be able to run everything, you need to have configured access to the containers. To do that, run:
docker login ghcr.io
Enter your GitHub username. For the password, don't use your GitHub password, but a Personal Access Token (Settings -> Developer settings) with at least repo, workflow, read:packages, write:packages, read:org and write:org permissions. Be sure to save the token somewhere safe, you'll probably have to reuse it and you can't view it in GitHub after creation!
Now, you will need to be able to access the scripts in this repository. If you're using Windows, do not copy the files from Windows to Linux; this leads to weird formatting problems in the scripts that cause them to fail. Instead, clone this repository directly from WSL by running:
git clone https://github.com/DSAV-Dodeka/dodeka.git
You will again need to enter your GitHub username and the Personal Access Token.
You will now have a dodeka folder containing all the necessary folders.
Now, we will use Docker Compose to start everything we want for development:
NOTE: Instead of the commands below, take a look at the shortcuts section if you're okay with installing an extra program (Nushell). Also, if you're on WSL, please look at the last paragraph of the current section (use dev_port.env instead of dev.env).
First, we pull:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data pull
We start:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data up -d
We shutdown:
docker compose -f use/dev/docker-compose.yml --env-file use/dev/dev.env --profile data down
(To understand this command: we are basically indicating three options. -f is which compose file we are running, --env-file is which environment variables we are setting and --profile which services. We choose data because we only want to run the databases.)
Note that the databases will only be accessible on localhost from the environment they run in. So if you are using WSL, they are only available from inside WSL. If the backend project is stored in Windows, you need to change the --env-file option to use/dev/dev_port.env. This will open the databases on the 0.0.0.0 host, which makes them accessible even from outside WSL.
Shortcuts
Those commands are quite long and hard to remember. To make things easier, we use Nu shell scripts to provide shorter commands that are easier to remember. First, follow the instructions on installing nu below.
On Windows, you need to prepend nu to every command. For convenience, the commands below all contain nu. However, on Linux/macOS this is not necessary; you can just run the script directly once you've installed Nushell (because they recognize shebangs).
The scripts are all in the root directory, but you can call them using ../ if you are in a subdirectory and they'll still work. Make sure you've installed Nu.
Running the development databases
Start (this will also pull the images, so make sure you're logged in with docker login ghcr.io):
Note: this opens the ports on host 0.0.0.0, so only do this when Docker doesn't run on your main OS, e.g. when it runs inside WSL
↓ the extra 'p' is on purpose
nu dev.nu upp
Stop:
nu dev.nu down
If running Docker on your main OS:
nu dev.nu up
Testing and checking the backend
(You can only run this after having followed the steps in the backend setup).
nu test.nu backend
Other commands can be found in the various .nu files in the root directory.
Installing Nushell
Install nu (you don't have to set it as your default shell, just make sure you have nu on your path).
If you have Rust installed, you can install it using cargo install nu. This is especially recommended if you're on Linux. To install it from a binary instead (which is much faster!), first run cargo install cargo-binstall and then cargo binstall nu. On macOS, install it using Homebrew (brew install nushell).
Why Nushell? In short, because it is a modern shell scripting language that lets you very easily call external programs. Its syntax is also readable by people who haven't used it before, but still quite powerful. It also can replace all kinds of different tools we might need otherwise.
Test database
Syncing the test database
A number of test databases are stored inside the DSAV-Dodeka/backend repository. Running the commands above creates an empty database. To populate it with the latest test values, run:
poetry run python -c "from use.data_sync.cli import run; run()"
You will probably need to set GHMOTEQLYNC_DODEKA_GH_TOKEN as an environment variable for access. The safest way to set this is to add it to a file like sync.env:
export GHMOTEQLYNC_DODEKA_GH_TOKEN="GitHub Personal Access Token"
Here you should replace "GitHub Personal Access Token" with the value of your token, which will need the repo scope. You can then run . sync.env before running the script to sync the database.
Simple backup
Ensure you are in the main dodeka directory, not in a subfolder.
To create a backup, run:
poetry run psqlsync --config data/test.toml --action backup
If the database requires a password, use the --prompt-pass option. You can then type or copy-paste the database password.
poetry run psqlsync --config data/test.toml --action backup --prompt-pass
From your own computer, you can then use scp (secure copy over SSH) to transfer the file.
scp backend@<ip address>:/home/backend/dodeka/data/backups/backup-20230430-154532-dodeka.dump.gz <destination>
Backend setup
NOTE: the backend is now developed from the dodeka repository, in the backend subdirectory!
Before you can run the backend locally, you must have a Postgres and Redis database setup. Take a look at the database setup page for that.
Run all commands from the dodeka/backend folder!
- Install uv. uv is like npm, but for Python. I recommend using the standalone installer.
- Then, set up a Python environment. Use uv python install inside the ./backend directory, which should then install the right version of Python, or use one you already have installed.
- Next, sync the project using uv sync --group dev (to also install dev dependencies). This will also set up a virtual environment at ./.venv.
- Then I recommend connecting your IDE to the environment. In the previous step uv will have created a virtual environment in a .venv directory. Point your IDE to that executable (the file named python or python.exe in .venv/bin) to make it work.
- Currently, the apiserver package is in a /src folder, which is nice for test isolation, but it might confuse your IDE (if you use PyCharm). In that case, find something like the 'project structure' configuration and set the /src folder as a 'sources folder' (or similar, it might be different in your IDE).
- You might want some different configuration options. Maybe you want to test sending emails or have the database connected somewhere else. In that case, you probably want to edit your devenv.toml, which contains important config options. However, this means that when you push your changes to Git, everyone else will get your version. If there are secret values included, those will be publicly available on Git as well! Instead, create a copy of devenv.toml called devenv.toml.local and make changes there. Git knows to ignore this file.
- Now you can run the server either by just running dev.py in src/apiserver (easiest if you use PyCharm) or by running uv run backend (easiest if you use VS Code). The server will automatically reload if you change any files. It will tell you at which address you can access it.
Running for the first time
If you are running the server for the first time and/or the database is empty, be sure to set RECREATE="yes" in the env config file (i.e. devenv.toml.local). Be sure to set it back to "no" after doing this once, otherwise it will recreate the database every time.
Architecture
This section describes all the different components that make up the website and how they interact. It describes why we made certain choices, which tools and frameworks we use, and more.
Frontend vs Backend
What is a "frontend", what is a "backend", why do we have this separation?
First, what is the difference? There is no perfect definition, but in general the "frontend" is the part that is exposed to the end user, so what they actually see and interact with. So the "frontend" is about the user, it's what actually runs in the browser.
The "backend" is the part that cannot run in the browser, for example because it needs to have dynamic access to the database. It runs on a server, away from the user. It's job is to store things that need to be secure not visible to everyone, like passwords or personal information. For that, it needs a database.
To use the backend, it exposes a so-called "API", which is basically a list of functions that can be called over the internet. In particular, it mostly adheres to the principles of a RESTful API (there are many resources on the internet about this).
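To make that concrete, here is a minimal, hypothetical FastAPI route (not the actual backend code; the path happens to match the birthdays endpoint listed in the changelog, but the model and logic are made up). The frontend simply performs an HTTP GET request and receives the JSON-encoded list back.

```python
# Minimal, hypothetical example of a REST-style endpoint; not the actual backend code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Birthday(BaseModel):
    name: str
    date: str  # e.g. "2000-04-05"


@app.get("/members/birthdays/")
async def get_birthdays() -> list[Birthday]:
    # The real backend would query the database here; we just return static data.
    return [Birthday(name="Example Member", date="2000-04-05")]
```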
In general, when the frontend wants data, it sends a JSON request (a specific format used for structuring data) ... #TODO
There are two main reasons. The first, more technical one, is that by separating the frontend from the backend, you can develop them separately. So someone who wants to update how something looks doesn't have to worry about any logic that should happen on the server. This allows teams to work more independently. It's also a "separation of concerns", which ensures that there isn't too much tangling of functionality. It also allows fully replacing one of the two components without worrying about the other. This might be useful in the future.
However, the more important ... #TODO
Frontend
AuthContext
Context is a way to share values throughout a React application without having to explicitly pass a prop throughout the whole tree. According to the official guide on context, 'the current authenticated user' is a good use-case.
The 'context' is an object that behaves much like a global useState.
The createContext function from React creates the AuthContext object using a default value. Since we do not yet want to set its value, we define AuthUse (a type) containing authState and setAuthState and use an empty version of it as the default.
... see the code
The authState attribute of the AuthContext is an AuthState object, which we defined ourselves. This is a simple object, containing some basic information on the user and authentication status. We try to keep the objects immutable as much as possible.
We export the AuthContext.Provider, which is the component that will wrap our entire app, allowing each subcomponent to access the context.
We only want to set the default value once, when the application starts. Furthermore, the initialization requires asynchronous calls that can only be made at runtime. This can be tricky, so to prevent any problems we use the useEffect hook to initialize everything on the first render. The AuthProvider is initially initialized with an empty AuthState, which is then populated asynchronously.
The initialization uses our custom useAuth function, which contains most of the logic. We should write tests for this.
There are 3 tokens, ID token (from OpenID Connect, not actually used for authorization), access token (used for all authorized requests) and refresh token (used to refresh access and ID token). The ID token is transparent to the front end, meaning it is guaranteed we can read its data. It is used to see the username and other useful profile information to personalize the website. The "expiry" returned by a token request relates to the expiry time of the ID token (10 hours, 3600 s).
The useAuth function will check if it has stored tokens and parse the ID token value to populate the user attribute of the context's authState. If the token is expired, it will automatically request a new one using the refresh token. If it is missing, it simply assumes the user is not logged in.
Database
Everything is deployed from the dodeka repository. It also contains the "source" for the database (DB, PostgreSQL) and key-value store (KV, Redis).
The most important file is config.toml, which contains all practical configuration. In the build folder you can find the source for all deploy scripts (build/deploy) and container build files (build/container). Using the confspawn tool, the actual scripts are built from these templates. The results can be found in the various folders in the use directory.
Note on confspawn
The total configuration was spread over a lot of different files in different places (PostgreSQL config, various Docker Compose files...). Some configuration is also very project-dependent (names like 'dodeka'). To keep things more reusable and have a single source of truth for the configuration, Tip developed a Python tool called confspawn, which can take a configuration "template" (where certain values are not filled in yet) and fill in those values using a secondary configuration source. It uses Jinja2 templates for this.
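To illustrate the underlying idea, here is a small sketch that uses plain Jinja2 directly. It is only an illustration: confspawn has its own CLI and template layout, and the template content below is made up.

```python
# Sketch of the templating idea behind confspawn: one config source fills in many templates.
# This uses plain Jinja2 directly; confspawn's actual API and templates differ.
from jinja2 import Template

compose_template = Template(
    "services:\n"
    "  {{ project }}-db:\n"
    "    image: postgres:{{ postgres_version }}\n"
)

config = {"project": "dodeka", "postgres_version": "16"}
print(compose_template.render(**config))
```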
Backend Server and authpage
Backend framework (Server): Python FastAPI server running on uvicorn (managed by gunicorn in production), which uses uvloop as its async event loop.
Frontend framework (authpage): React, built using Vite as a multi-page app (MPA) and served statically by FastAPI.
Persistent database (DB): PostgreSQL relational database.
In-memory key-value store (KV): Redis.
We use the async engine of SQLAlchemy (only Core, no ORM, we write SQL manually) as a frontend for asyncpg for all DB operations. Alembic is used as a migration tool.
The async component of the redis-py library is used as our KV client.
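As a small sketch of what the async Core approach with hand-written SQL looks like (the connection URL, table and query here are hypothetical, not the project's actual store code):

```python
# Sketch of async SQLAlchemy Core on top of asyncpg; URL, table and query are made up.
import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine("postgresql+asyncpg://user:password@localhost/dodeka")


async def get_user_points(user_id: int) -> list[dict]:
    # We write the SQL by hand and only use Core for execution and parameter binding.
    query = text(
        "SELECT classification_id, total_points FROM class_points WHERE user_id = :user_id"
    )
    async with engine.connect() as conn:
        result = await conn.execute(query, {"user_id": user_id})
        return [dict(row) for row in result.mappings()]


if __name__ == "__main__":
    print(asyncio.run(get_user_points(1)))
```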
This is an authorization server, authentication server and web app backend API in one. This model is not recommended for large-scale setups but works well for our purposes. It has been designed in a way that makes the components easy to separate.
Client authentication uses the OPAQUE protocol (a password-authenticated key exchange), which protects against pre-computation attacks upon server compromise, and does not rely on PKI (public key infrastructure) for protecting the password when it is sent over the network. This makes passwords extra safe, in the sense that they never leave the client.
Authorization is performed using OAuth 2.1, with much of OpenID Connect Core 1.0 also implemented to make it appropriate for authenticating users. While not technically required, OAuth tokens are generally in the form of JSON Web Tokens (JWTs) and OpenID Connect does require it, so we use them here. Good 3rd-party resources can be found for OAuth and JWTs.
In addition to this, we rely heavily on the following libraries:
- PyJWT for signing and parsing JSON web tokens.
- cryptography for many cryptographic primitives, primarily for encrypting refresh tokens and handling the keys used for signing the JWTs.
- pydantic for modeling and parsing all data throughout the application.
The backend relies on some basic cryptography (see the cryptography section). It is nice to know something about secret key cryptography (AES), public key cryptography and hashing.
Why did we choose <x>?
FastAPI
FastAPI was selected because of its modern features reliant on Python typing, which greatly simplify development. FastAPI is built on Starlette, a lightweight async web server framework. We wanted a lightweight framework that is not too opinionated, as we wanted full control over as many components as possible. Flask would have been another option, but the heavy integration of typing in FastAPI made us choose it instead. Of course, there are also many other options outside the Python ecosystem. We chose to use Python simply because it is very well-known among university students.
Redis and PostgreSQL
PostgreSQL and Redis were selected simply by their popularity and open-source status. They have the most libraries built for them, have a large feature set and are widely supported. We chose a relational database because we do not need massive scaling and having relational constraints simplifies keeping all data in sync. For Redis, we use the RedisJSON extension module to greatly simplify temporarily storing dictionary-like datastructures for storing state across requests. Since there are a great many specific data types that need to be persisted, and they do not have any interdependency, this is much easier to do in an unstructured key-value store like Redis. It is also much faster than having to do this all in a structured, relational database. Note that all DB and KV accesses are heavily abstracted, the underlying queries could easily be re-implemented in other database systems if necessary.
We went all-in on async, expecting database and IO calls to make up the majority of response times. Using async, other waiting requests can be handled in the meantime.
OAuth
Implementing good authentication/authorization for a website is hard. There are many mistakes to be made. However, many available libraries are very opinionated and hard to hook into. Furthermore, the options become quite limited when there is approximately no budget. There are some self-hosted solutions, but getting the configuration right can be very tricky and none were found that served our needs. As a result, we went for our own solution, but built it using well-regarded web standards to ensure there are no security holes. OAuth is used by every major website nowadays, so the choice was easy.
OPAQUE
OPAQUE is an in-development protocol that seeks to provide a permanent solution to the question of how to best store passwords and authenticate users using them. A simple hash-based solution would have been good enough, but there are many (good and bad) ways to implement this, while OPAQUE makes it much more straightforward to implement it the right way. It also provides tangible security benefits. It has also been used by big companies (for example by WhatsApp for their end-to-end encrypted backups), so it is mature enough for production use.
Our implementation relies on opaque-ke, a library written in Rust. As there is no Python library, a simple wrapper for the Rust library, opaquepy, was written for this project. It exposes the necessary functions for using OPAQUE and consists of very little code, making it easy to maintain. The wrapper is part of a library that also includes a WebAssembly wrapper, which allows it to be called from JavaScript in the browser.
Maybe implement in future
- https://datatracker.ietf.org/doc/html/rfc8959 secret-token
- https://datatracker.ietf.org/doc/html/rfc7009 token revocation request
- https://auth0.com/docs/secure/tokens/json-web-tokens/json-web-key-sets JSON web key sets
- https://datatracker.ietf.org/doc/html/rfc8414 OAuth 2 discovery
- https://www.rfc-editor.org/rfc/rfc9068 Access token standard (also proper OpenID scope)
- https://datatracker.ietf.org/doc/html/rfc7662 token metadata (introspection)
Performance
What is fast:
- The actual HTTP server, which runs on uvloop. This won't ever be a likely bottleneck
- The direct interface with the database: asyncpg is one of the fastest PostgreSQL adapters around
What is slow:
- The parsing and loading of database data into Python (parsing into Pydantic models)
- Manipulation of database data in Python
- Returning a type directly, meaning FastAPI has to do additional conversion; using JSONResponse directly is much faster
Most of the latter isn't a problem for simple responses that don't work on many rows. But if many rows are included, it might be worth it to write a parse function for a specific model and return a JSONResponse directly.
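As a sketch of the difference (the endpoints and model here are hypothetical, not actual backend code):

```python
# Hypothetical illustration of the two response styles; model and endpoints are made up.
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from pydantic import BaseModel

app = FastAPI()


class RankingEntry(BaseModel):
    name: str
    points: int


# Convenient: FastAPI validates and converts each model on every request,
# which costs extra time when many rows are returned.
@app.get("/example/ranking/slow/")
async def slow_ranking() -> list[RankingEntry]:
    rows = [{"name": "A", "points": 10}, {"name": "B", "points": 8}]
    return [RankingEntry(**row) for row in rows]


# Faster for large responses: serialize once and return a JSONResponse directly,
# skipping FastAPI's response model conversion.
@app.get("/example/ranking/fast/")
async def fast_ranking() -> JSONResponse:
    rows = [{"name": "A", "points": 10}, {"name": "B", "points": 8}]
    return JSONResponse(content=rows)
```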
Cryptography
It's easy to make mistakes when you manually use cryptographic primitives. This project primarily uses cryptography for the purpose of signing and encrypting its tokens. If this is done incorrectly, the project is entirely insecure, because with forged tokens almost all data can be easily queried. Therefore, this document aims to properly document how cryptography is used in this project. Cryptography is also used for storing passwords, but this is almost entirely handled by a separate library, mostly using default settings.
Refresh tokens are only readable by the authorization/authentication part of the server and can therefore use the more secure and faster symmetric encryption. Since refresh tokens are not very standardized, this part is the most 'custom' and is the only part that uses cryptographic primitives. All these operations can be found in the auth/hazmat package. It's named hazmat because it uses the hazmat module of the Python library pyca/cryptography and also to signify that this code should be checked thoroughly. The pyca/cryptography library relies on OpenSSL (the underlying crypto implementation) binaries that are packaged together with the Python library. It is important to frequently update pyca/cryptography, as OpenSSL receives frequent security fixes.
Refresh tokens
For our refresh tokens, we use AES encryption, specifically 256-bit encryption in GCM mode. If used correctly, GCM is one of the more secure modes. However, if you misuse nonces/IVs (random bytes used for every encryption to ensure the same plaintext looks different each time, so that no information leaks), some information could leak, such as the plaintext length. Care must therefore still be taken.
Our refresh tokens are simple, consisting only of a unique id, a family id and a random tag that makes each token unique among its family. A 'family' is a set of refresh tokens descended from a single authentication. Therefore, we encode them as simple Python dicts (using pydantic) and our AES encryption thus works only on these dictionaries. crypt_dict.py provides the encryption and decryption for this.
We encode the dicts as JSON (as plaintext utf-8) and generate a random 12-byte nonce (as recommended for AES-GCM) using the Python secrets.token_bytes function (which is recommended for such cryptographic use). We don't use any associated data (which would be unencrypted but could not be modified), as refresh tokens can exist in only a single context, so we simply encrypt using our initialized AESGCM object. This object must already contain the private symmetric key, which we assume is 256 bits (but it could technically also be 128 or 192 bits). We then concatenate the nonce and the encrypted data (which contains an authentication tag added by pyca/cryptography, ensuring the integrity of the data) and encode this in a string using base64url. Note that, as it is not necessary, we do not add any base64 padding in the encoding step.
Our decryption works exactly the same in reverse: we decode the string, take the nonce as the first twelve bytes and the data+tag as what comes after. It then decrypts using an initialized AESGCM object and decodes the JSON into a dict. A lot can go wrong, but since all refresh tokens should have been generated by this application, we only provide an opaque 'DecryptError'. It is important to note that the base64url decoding simply ignores characters outside the alphabet. Furthermore, since it works with both padding and no padding, there are multiple "encodings" of a single refresh token. The encoding holds no semantic value, so only do logic on the fully decrypted and decoded refresh token!
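As an illustration of this scheme, here is a minimal sketch using pyca/cryptography; the function names and dict fields are hypothetical and do not reflect the exact crypt_dict.py API:
import json
import secrets
from base64 import urlsafe_b64decode, urlsafe_b64encode

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_dict(aesgcm: AESGCM, data: dict) -> str:
    plaintext = json.dumps(data).encode("utf-8")
    nonce = secrets.token_bytes(12)  # 12-byte nonce, as recommended for AES-GCM
    ciphertext = aesgcm.encrypt(nonce, plaintext, None)  # no associated data
    # nonce || ciphertext+tag, base64url-encoded without padding
    return urlsafe_b64encode(nonce + ciphertext).rstrip(b"=").decode("utf-8")

def decrypt_dict(aesgcm: AESGCM, token: str) -> dict:
    # re-add the padding that was stripped during encoding
    raw = urlsafe_b64decode(token + "=" * (-len(token) % 4))
    nonce, ciphertext = raw[:12], raw[12:]
    return json.loads(aesgcm.decrypt(nonce, ciphertext, None))

aesgcm = AESGCM(AESGCM.generate_key(bit_length=256))
token = encrypt_dict(aesgcm, {"id": 1, "family_id": "abc", "tag": "xyz"})
assert decrypt_dict(aesgcm, token)["id"] == 1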
Access tokens (and id tokens)
Access tokens use asymmetric encryption and are JSON web tokens (JWTs). They come signed with a number of claims, meaning the resource server (which in this application is currently the same server as the auth server, but the code is mostly decoupled) simply has to check the validity of the token using a freely available public key. Anyone could check the validity of the claims inside the JWT.
We use EdDSA as our algorithm using the Ed448 curve. The latter technically offers better security than the slightly more standard Ed25519, but the difference is small. It was just a choice. EdDSA is used over other algorithms for its greater compactness. Note that this algorithm is not resistant to hypothetical advanced quantum computers, although we are very far from any quantum computer with enough power to break it. Note that AES is resistant to quantum computers.
To implement signing and verifying, we use the PyJWT library, which internally also relies on pyca/cryptography (and therefore on OpenSSL) for its cryptography. sign_dict.py takes care of signing the token passed as a dict. Since the authorization server would never have to verify an access token, we implement verification inside our application (apiserver/lib/hazmat/tokens.py), not inside the decoupled auth component.
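A minimal sketch of what signing and verifying looks like with PyJWT and an Ed448 key; the claims and key handling are illustrative and do not mirror the exact sign_dict.py or tokens.py code:
import jwt  # PyJWT
from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

# Hypothetical key and claims, just to show the mechanism
private_key = Ed448PrivateKey.generate()
public_key = private_key.public_key()
claims = {"sub": "some_user_id", "exp": 2_000_000_000}

# Signing, as the auth component does when issuing a token
token = jwt.encode(claims, private_key, algorithm="EdDSA")

# Verifying, as the resource server does using only the public key
decoded = jwt.decode(token, public_key, algorithms=["EdDSA"])
assert decoded["sub"] == "some_user_id"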
Passwords (OPAQUE)
We store passwords using the OPAQUE protocol (see README). This library uses some asymmetric encryption (and other smart stuff) so we can have our cake and eat it too. We don't handle any password on the server, and we are also protected against pre-computation attacks (where, using the provided salt, an attacker precomputes a password dictionary before taking control of the server). See the opaquepy library (maintained privately by Tip) for details.
Keys
We use an asymmetric key (a public verification key and a private signing key), as well as a symmetric (private) key. Our symmetric key is simply encoded using base64url, while our asymmetric keys (due to requirements of PyJWT) use a more complicated scheme, namely PEM format, with PKCS#8 for the private key and X509/PKCS#1 for the public key. These are standardized schemes. We wrap them in our own structs to make handling easier, and they all include a kid (key id) to make it possible to store multiple in a database.
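A sketch of how such keys could be generated and serialized with pyca/cryptography (assuming the X.509 SubjectPublicKeyInfo PEM encoding that the library provides for Ed448 public keys, which I take to be what is meant above; the kid wrapping is left out):
import secrets
from base64 import urlsafe_b64encode

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed448 import Ed448PrivateKey

# Symmetric key: 32 random bytes, stored base64url-encoded
symmetric_key = urlsafe_b64encode(secrets.token_bytes(32)).decode("utf-8")

# Asymmetric signing keypair in PEM format
private_key = Ed448PrivateKey.generate()
private_pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
)
public_pem = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)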
Auth
Specification
We use the Authorization Code Flow according to RFC6749 Section 4.1, as recommended by Internet Draft OAuth 2.0 for Browser-Based Apps, since the frontend application can be recognized as a Javascript Application without a Backend (Section 6.3 of the latter document). This might be confusing as we do have a backend, but backend in that context means a frontend backend that dynamically serves HTML, while our frontend is hosted as static HTML on GitHub Pages.
We comply fully with OAuth 2.1 and implement an Authorization Code Flow with PKCE. We comply as much as possible with OpenID Connect, except on some points that are only for interoperability (like supporting certain algorithms), which we do not require. Our compliance with OpenID Connect is not as rigorously checked, as it is a more complex standard. We used it more as a guide.
Useful resources
- https://auth0.com/docs/security/tokens/refresh-tokens/refresh-token-rotation
- Other pages from https://auth0.com/docs
- https://www.ietf.org/archive/id/draft-ietf-oauth-v2-1-04.html
- https://openid.net/specs/openid-connect-core-1_0.html
- https://www.oauth.com/
Protocol
Resource Owner - end-user
Resource Server - dodekabackend=apiserver (Server)
- identifier: dodekabackend_client
Client - dodekaweb (Pages)
- identifier: dodekaweb_client
- redirect_uri: .../auth/callback
Authorization Server (AS) = OpenID Provider (OP) - dodekabackend=apiserver (Server)
The Relying Party (RP) in the context of OpenID Connect is the Client.
Create an Authorization Request (Section 4.1.1)
- (Client) AuthRedirect.tsx generates the AuthRequest:
{
"response_type": "code",
"client_id": "dodekaweb_client",
"redirect_uri": ".../auth/callback",
"state": "a) STATE",
"code_challenge": "b) CHALLENGE",
"code_challenge_method": "S256",
"nonce": "c) NONCE"
}
- a) STATE: The state is a randomly generated string that is used to later check the response
- b) CHALLENGE: First a code verifier is generated, which is cryptographically random. A SHA256 hash of it is then computed (as indicated by 'S256') and sent as the challenge, while the verifier is stored locally. The server later uses it as a check (see the sketch after this list).
- c) NONCE (OpenID): Randomly generated and used to later verify the OpenID ID token. The random value is stored, the hash is sent.
- (Client) The AuthRequest is encoded as an urlencoded param string, not JSON. The user is redirected to the AS with those params, specifically to ../oauth/authorize
- (AS) The Authorization Server (AS) validates the AuthRequest, in particular the response type (only "code"), the client_id, the redirect_uri and the format of the state, challenge and nonce. It generates a random identifier (the Flow ID). It uses this identifier to persist the process on the server by storing the entire AuthRequest as JSON. This is needed in order to later check everything.
- (AS) The user is redirected to a new page. In a perfect world, this would be a page served by the server, but in this case this is a page on the Client (../auth/credentials). The Flow ID is sent as a query parameter.
- (AS/Client) The next step does not fall under the OAuth protocol, as any AS is free to choose its own authentication implementation. In this case we use the OPAQUE protocol. A user supplies their username and password, which is then used with the ../auth/login/start and ../auth/login/finish endpoints to ensure the password is correct.
- (AS) In the final ../auth/login/finish step, the user also supplies the Flow ID. The authorization code (OPAQUE "session key") which is computed is used as a key for storing the combination of the Flow ID, username and authentication time. This is stored only for a short time.
- (AS/Client) Finally, the user is redirected to the ../oauth/callback endpoint, again with the Flow ID but now also with the generated authorization code. This is separated from ../auth/login/finish to neatly distinguish the steps belonging to the OAuth protocol from those of the selected authentication protocol. Here, the user is redirected to the redirect_uri supplied in the initial request, together with the state from the AuthRequest stored on the server and the authorization code.
- (Client) At the redirect_uri (../auth/callback), the client checks their stored state with the state supplied in the redirect. If it doesn't match, the login aborts.
- (Client) Now, a TokenRequest is made:
{
"client_id": "dodekaweb_client",
"grant_type": "authorization_code",
"redirect_uri": ".../auth/callback",
"code": "a) CODE",
"code_verifier": "b) VERIFIER"
}
- a) CODE: The code that the user generated in the final authentication step (OPAQUE session key).
- b) VERIFIER: The unhashed original cryptographically random string generated by the client for the original AuthRequest
- (Client) This time, it is encoded as JSON and sent as a post request to the ../oauth/token endpoint.
- (AS) The supplied authorization code is used to fetch the Flow ID
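The sketch below shows the S256 relation between the code verifier and the code challenge that the AS checks at the token endpoint; it is an illustration, not the actual client or server code:
import hashlib
import secrets
from base64 import urlsafe_b64encode

def s256_challenge(verifier: str) -> str:
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# Client side: generate and store the verifier, send only the challenge
code_verifier = secrets.token_urlsafe(32)
code_challenge = s256_challenge(code_verifier)  # goes into the AuthRequest

# Server side (token endpoint): recompute the challenge from the submitted
# verifier and compare it with the challenge stored under the Flow ID
assert s256_challenge(code_verifier) == code_challenge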
Security considerations of non-OAuth steps
OAuth does not exactly define authentication, nor how to store certain state. We make a few assumptions that determine the security of the login process:
- The Flow ID (used to identify the OAuth AuthRequest throughout the entire authentication flow), the Auth ID (used to identify an OPAQUE login process, i.e. to retrieve server state generated in the first step for the second step) and the session key (used as the OAuth 'code', generated by OPAQUE) should all be sufficiently random and have enough entropy so that an attacker cannot guess a random value to intercept a login attempt.
- Critically, in the time frame that these are valid (1000 seconds for Flow ID and 60 seconds for Auth ID and session key), there should not be so many requests that an attacker can randomly guess a correct value. All these values are at least 10 bytes of random information, meaning there are more than 10^24 different values. The session key is even 32 bytes, making it even more difficult to guess within 60 seconds.
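For reference, the raw numbers behind that claim:
# 10 random bytes: 256**10 possible values, which is more than 10**24
print(256 ** 10)  # 1208925819614629174706176
# the 32-byte session key: 256**32 possible values, roughly 1.16 * 10**77
print(256 ** 32)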
https://openid.net/specs/openid-connect-core-1_0.html#IDToken
Remembering session
Key rotation
Key management and rotation is not a trivial problem.
- https://datatracker.ietf.org/doc/html/rfc7517#appendix-C.3
- https://cryptography.io/en/latest/hazmat/primitives/asymmetric/serialization/#serialization-encodings
- https://www.rfc-editor.org/rfc/rfc8037.html#section-2
- https://www.rfc-editor.org/rfc/rfc7518.html
- https://www.rfc-editor.org/rfc/rfc7517.html JWK
We will store the keys as an encrypted JSON Web Key Set, encrypted with a runtime key (from the dodeka secrets). Keys will be regenerated automatically.
The opaque setup value will also be rotated automatically.
At startup, the encrypted JSON Web Key Set will be extracted from the database. It will then be decrypted and the value replaced.
Developing
Prerequisites
Development isn't scary, but it's probably new. Lots of jargon and all kinds of tools will be thrown around. Just let it come, you can only learn by doing.
Below I'll introduce some concepts that you need to understand to develop for the website. Some of it you'll already know about or have heard of.
File system
Servers
Linux
Command line
Version control / Git
Browsers / JavaScript
Package managers
NodeJS / npm
Cheatsheet
This contains the most important information to get you up and running and productive.
Git
Open the command line in the folder where you downloaded DSAV-Dodeka.github.io / open the terminal in VS Code (Ctrl+`)
General workflow
# go to the main branch
git checkout main
# update the repository
git pull
# go to a new branch (replace branchname with your desired name, no spaces or capital letters allowed)
git switch -c "branchname"
# add all edited files to future commit
git add -A
# commit the changes (change the description to something useful)
git commit -m "commit description"
# upload changes to github.com (replace 'branchname' with what you used earlier)
# in case you already pushed this branch before, you can just do git push
git push --set-upstream origin branchname
Status
See the current status (shows what branch you are on)
git status
Go to a branch
If you want to go to a particular branch, say 'branch-xyz', do:
git checkout branch-xyz
Update a branch
If you want to update your current branch with changes on github.com:
git pull
If you have changes locally, this might not work.
Delete all local changes (BE CAREFUL)
If you did some stuff you don't know how to revert, but also don't care to save it, do (be careful!):
git reset --hard
Frontend
Open the command line in the folder where you downloaded DSAV-Dodeka.github.io / open the terminal in VS Code (Ctrl+`)
Run the website locally
npm run dev
The website is now available in your browser at http://localhost:3000.
Update dependencies
npm install
Frontend development
For more in-depth details on why certain decisions were made, see the architecture section.
The frontend development can be divided roughly into three sections:
- Updating the content (don't modify pages, just the text and images). See here. For information on images and how to optimize all images, see here.
- Adding new static pages (design-focused page design). See here.
- Creating dynamic pages (and integrating them with the backend). See the section in the backend here.
React and React Router
We use a concept called "client-side routing", which means that code on the page itself sends you to the subpages. This is also known as a "single-page application". For more details, see the section on architecture.
Routes
Define a new route
To define a route, add an element inside src/App.tsx:
<Route path="/vereniging" element={<Vereniging />} />
Here <Vereniging /> is the React component that consists of the entire page. You will need to import it from the right file. Every single page, including every subpage, needs a separate route like this. path="/vereniging" indicates the path at which the page will be visible.
Add it to the menu bar
The src/components/Navigation Bar/NavigationBar.jsx file contains all the different menu items. Don't forget to add it to both the navItems and the navMobileContainer.
.tsx vs .jsx?
For new components, prefer .tsx, which ensures proper TypeScript checking happens. This can make development easier by providing hints about what properties are available and also prevents bugs.
Authentication
We use React Context to make the authState available everywhere. Inside the component, simply put:
const {authState, setAuthState} = useContext(authContext)
Then, by checking authState.isLoaded && authState.isAuthenticated you can check whether someone has been authenticated as a member and whether the route should be available. Note: any data that you store on the frontend is publicly available (through the source code, but also using 'inspect element' in the browser)! So any sensitive data should be stored in the backend and retrieved using requests. See the section on integrating the backend and frontend.
You can use authState.scope.includes("<role>") to check if someone has a role, but remember that someone can just edit this code in the browser. So any information available on pages stored in the frontend repository should not be sensitive. It's fine to simply display the page skeleton based on checking the authState, but don't show private data based on that.
Content
The content can be found primarily in ./src/content. There you can find many JSON files. JSON (JavaScript Object Notation) is a format that can easily be read by a machine. In the actual pages, we import these files and then read them, putting the text actually on the website.
The images can be found in ./src/images. Again, we import these images on pages using the getUrl function.
Images
Optimizing images
This script only works on Linux (or WSL).
Dependencies
- img-optimize - https://virtubox.github.io/img-optimize/ (optimize.sh)
- imagemagick - https://imagemagick.org/script/download.php (convert)
- jpegoptim
- optipng
- cwebp
The last 3 can be installed on Debian/Ubuntu using:
sudo apt install jpegoptim optipng webp
Once you've downloaded the first script, run the following script from the img-optimize main folder (be sure to replace <DSAV-Dodeka repository location> by the correct path):
#!/bin/bash
# Script by https://christitus.com/script-for-optimizing-images/ (Chris Titus)
# Modified by Tip ten Brink
FOLDER="<DSAV-Dodeka repository location>/src/images"
# resize png or jpg to either height or width, keeps proportions using imagemagick
find ${FOLDER} \( -iname '*.jpg' -o -iname '*.png' \) -exec convert \{} -verbose -resize 2400x\> \{} \;
find ${FOLDER} \( -iname '*.jpg' -o -iname '*.png' \) -exec convert \{} -verbose -resize x1300\> \{} \;
# Optimize.sh is the img-optimize script
./optimize.sh --std --path ${FOLDER}
We convert the images to a size of max 2400x1300, as higher resolutions don't make a big difference.
Backend development
Project structure
All the application code is inside the backend/src directory. This contains five separate packages, of which three act as libraries and two as applications.
- The datacontext library is fully standalone. It contains special logic for implementing dependency injection, which is useful for replacing database-reliant functions in tests while keeping good developer ergonomics. Ensure it doesn't import code from any other package!
- The store library is fully standalone and provides the primitives for communicating with the databases (both DB and KV). Ensure it doesn't import code from any other package!
- The auth library relies on both the datacontext and store libraries. It provides an application-agnostic implementation of all the authorization server logic. In an ideal world, the authorization server would be a separate application. To still stay as close to this as possible, we develop it as a separate library. However, the library does not know about HTTP or anything like that: the routes are implemented in our actual implementation, as are some things that rely on a specific schema.
- The schema package contains the definition of our database schema (in schema/model/model.py). It can be extracted during deployment and then used for applying migrations, hence it is also something of an application.
- The apiserver package is our actual FastAPI application. It relies on all four packages above. However, it also has some internal logic that is more "library"-like. Furthermore, to prevent circular imports among other things, there is a certain "dependency order" we want to keep (a simplified sketch follows after this list). It is as follows:
  - resources.py contains two variables that make it easier to get the specific path, specifically to import files in the resources folder.
  - define.py contains a number of constants that are unlikely to ever change and do not really depend on what environment the application is deployed in (whether it is development, staging, production, etc.). It also contains the logic for loading things that do depend on the 'general' environment, but not the 'local' environment. As a rule of thumb, something like a website URL will always be the same for an environment, but an IP address, a port or a password might differ.
  - env.py loads this local configuration, which includes things like passwords and where exactly to find the database.
  - Then we have the src/apiserver/lib module, which consists mostly of logic that does not load its own data. While it might cause side effects (like sending an email), it should always cause the same side effects for the same arguments (so it should not load data). In general, most functions and logic here should be pure. More importantly, they should not import anything from the src/apiserver/app module.
  - Next there is src/apiserver/data. This includes all the simple functions that perform a single action relating to external data (so the DB or KV). Mostly, these functions wrap store functions, but for a specific table or schema. The most important are the functions in data/api, i.e. the data "API", which is the way the rest of the application interacts with data. Inside data/context there are also context functions, which should call multiple data/api functions and other effectful code that you want to easily replace in tests (like generating something randomly). See data/context/__init__.py for more details.
  - Finally, we come to src/apiserver/app. This contains the most critical part, namely the routers, which define the actual API endpoints. Furthermore, there is the modules module, whose functions mostly wrap multiple context functions. See app/modules/__init__.py for more details.
  - Next, the app_... files define and instantiate the actual application, while dev.py is an entrypoint for running the program in development.
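Below is a deliberately simplified, hypothetical sketch of this layering; none of the function names exist in the codebase, it only shows the direction of the dependencies (a router calls into data and lib, never the other way around):
from fastapi import APIRouter

# apiserver/lib: pure logic, no data loading (hypothetical example)
def format_member_name(first_name: str, last_name: str) -> str:
    return f"{first_name} {last_name}".strip()

# apiserver/data/api: a thin wrapper around a store function for one table (hypothetical)
async def get_member_row(member_id: str) -> dict:
    # in the real code this would call a store function with the members table
    return {"first_name": "Jan", "last_name": "Jansen"}

# apiserver/app/routers: the actual endpoint, which only uses the layers above
router = APIRouter()

@router.get("/members/{member_id}/")
async def get_member(member_id: str) -> dict:
    row = await get_member_row(member_id)
    return {"name": format_member_name(row["first_name"], row["last_name"])}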
Other
Important to keep in mind
Always add a trailing "/" to endpoints.
Testing
We have a number of tests in the tests directory. To run them and check that you didn't break anything important, you can run poetry run pytest.
Static analysis and formatting
To improve code quality, readability and catch some simple bugs, we use a number of static analysis tools and a formatter. We use the following:
- mypy checks if our type hints check out. Run using poetry run mypy. This is the slowest of all the tools.
- ruff is a linter, so it checks for common mistakes, unused imports and other simple things. Run using poetry run ruff src tests actions. To automatically fix issues, add --fix.
- black is a formatter. It ensures we never have to discuss formatting mistakes, we just let the tool handle it for us. You can use poetry run black src tests actions to run it.
You can run all these tools at once using the Poe taskrunner, by running the following in the terminal:
poe check
Continuous Integration (CI)
Tests (including some additional tests that run against a live database) and all the above tools are all run in GitHub actions. If you open a Pull Request, these checks are run for every commit you push. If any fail, the "check" will fail, indicating that we should not merge.
VS Code settings
VS Code doesn't come included with all necessary/useful tools for developing a Python application. Therefore, be sure the following are installed:
- Python (which installs Pylance)
- Even Better TOML (for .toml file support)
You probably want to update .vscode/settings.json as follows:
{
"python.analysis.typeCheckingMode": "basic",
"files.associations": {
"*.toml.local": "toml"
},
"files.exclude": {
"**/__pycache__": true,
"**/.idea": true,
"**/.mypy_cache": true,
"**/.pytest_cache": true,
"**/.ruff_cache": true
}
}
This ensures that unnecessary files are not shown in the Explorer.
Routes
authpage
The authpage is the tiny React webpage that is used for logging in. It is served directly by the backend server and can be found in the backend repository.
Schema
Migrations
We can use Alembic for migrations, which allows you to programmatically apply large schema changes to your database.
First you need to have the Poetry environment running as described earlier and ensure the database is on as well.
- Navigate to the ./backend/src/schema directory.
Ensuring the database is in sync with the migrations
The migrations are stored in the schema/model/versions folder. First, make sure your database has the same schema as the latest revision.
- From there, run poetry run alembic revision --autogenerate -m "Some message".
- This will generate a Python file in the migrations/versions directory, which you can view to see if everything looks good. It basically looks at the database, looks at the schema described in db/model.py and generates code to migrate to the described schema.
- Then, you can run poetry run alembic upgrade head, which will apply the latest generated revision. If you now use your database viewer, the table will hopefully have appeared.
- If there is a mismatch with the current revision, use poetry run alembic stamp head before the above two commands.
Integrating the backend/frontend
The database is the only place you can securely store private information. Everything stored in the repositories can easily be accessed by anyone. In the future, we might want to make an easier way to store private content.
So, if you want to display some secret information from the backend on the frontend, you will need to load it using an HTTP request. To make this easier, we use two libraries, TanStack Query and ky.
A query
A "query" is basically an automatic function that, once the page loads, will load whatever function you ask it to and keep it up to date. It can be enabled based on whether or not someone is authenticated.
An example of of a query is (defined in src/functions/queries.ts
):
export const useAdminKlassementQuery = (
au: AuthUse,
rank_type: "points" | "training",
) =>
useQuery(
[`tr_klass_admin_${rank_type}`],
() => klassement_request(au, true, rank_type),
{
staleTime: longStaleTime,
cacheTime: longCacheTime,
enabled: au.authState.isAuthenticated,
},
);
Here, useQuery is a function from the TanStack Query library, while klassement_request is defined by us. It is important that the tr_klass_admin_${rank_type} key is unique, otherwise the caches will not work correctly.
Let's look at klassement_request (inside src/functions/api/klassementen.ts):
export const klassement_request = async (
auth: AuthUse,
is_admin: boolean,
rank_type: "points" | "training",
options?: Options,
): Promise<KlassementList> => {
  // ... omitted for brevity
let response = await back_request(
`${role}/classification/${rank_type}/`,
auth,
options,
);
const punt_klas: KlassementList = KlassementList.parse(response);
  // ... omitted for brevity
return punt_klas;
};
Here, back_request is a function we defined, which calls the backend using the ky library. Basically, this is just a simple GET request, which we then parse using zod (the .parse part). The backend will check the information in the auth part, returning the data if you have the right scope and are the right user.
How is the result of this query used? Let's see (in src/pages/Admin/components/Klassement.tsx):
const q = useAdminKlassementQuery({ authState, setAuthState }, typeName);
const pointsData = queryError(
q,
defaultTraining,
`Class ${typeName} Query Error`,
);
pointsData now simply contains the data you want. All the data loading happens in the background. queryError is also defined by us and ensures that any potential error is properly caught. If the data is still loading, it will display the default data (defaultTraining, in this case) instead. In the future we might want to display this in a nicer way in the UI, because right now it will first show the default data, before flickering and switching to the loaded data once it comes in.
Developing the deployment setup
Deployment
This section discusses deployment, so everything to do with getting the code actually live so that the application can actually be used.
Source
Building the scripts and containers
- Poetry
  - Once installed, run poetry install --sync inside the main directory. This will install the other requirements.
Deploy scripts
Building the deployment scripts is easy: simply run build_deploy.sh in the main directory.
Containers
The containers have dedicated GitHub Actions workflows to build them, so in general you should never have to build them locally. Take a look at the workflows to see how they are built.
Server
Our server is hosted by Hetzner, a German cloud provider. Our server is an unmanaged Linux Ubuntu virtual machine (VM). VM means that we do not have a full system to ourselves, but share it with other Hetzner customers. We have access to a limited number of cores, memory and storage.
It is unmanaged because we have full control over the operating system. We need to keep it up to date ourselves. The choice for Ubuntu was also made by us. It has no GUI, only a command line, so getting familiar with the Unix command line is very helpful. By default, it uses the bash shell.
Webportal
The webportal for the server can be accessed from https://console.hetzner.cloud. The account we use is studentenatletiek@av40.nl. You need 2FA to log in.
The most important things you can do from the portal are:
- See graphs of CPU, disk and network load, as well as memory usage
- Manage backups
- Access the root console
Connecting: SSH
To connect to the server, we use SSH. To be able to connect, you need to have an "SSH key" configured. To add one, you must first generate a private-public SSH keypair.
Then, the public part must be added to the /home/backend/.ssh/authorized_keys file. Note that this file requires some specific permissions, so if something is not working, check whether these are correct.
Currently, we have the following important SSH settings:
PermitRootLogin no
PasswordAuthentication no
This ensures you can only log in with an SSH key, not with a simple user password.
We only allow logins through the backend user (see the next section), so keep /root/.ssh/authorized_keys empty.
It is recommended to not add SSH keys through the web console, as these are not easily visible inside the authorized_keys file.
If you no longer have access to any keys, use the web console to log in as root, then switch to the backend user (su backend) and edit the authorized_keys file.
Connecting
Connecting is simple: just do ssh backend@<ip address>. If your default identity is not a key that has access, you might need to use the -i flag to select the right key on your client.
Once you have done this, you have access to the server as if it's your PC's own command line.
root vs backend user
To keep things safe, try to avoid using the root user as much as possible. Instead, use backend. You can use sudo to run privileged commands and, if necessary, log in as root using su root.
Keeping it up to date
To keep the server up to date, occasionally run:
sudo apt update
sudo apt upgrade
Occasionally, Ubuntu itself might also get an update. It is best to only update once there is a new LTS version.
Required packages
For running deployment scripts, three main tools must be installed: Poetry, Docker (including Docker Compose) and the GitHub CLI. Make sure these are occasionally updated.
Furthermore, the server also requires nginx as a reverse proxy and certbot for SSL certificates. We use Ubuntu's packaged nginx and we use a snap package for certbot.
File locations
Currently, mailcow is in /opt/mailcow-dockerized and the dodeka repository is in /home/backend/dodeka.
Environments
The dodeka repository builds Docker containers (the build/container directory) and builds the scripts for deploying them (build/deploy). Both are differentiated based on the "environment" or "mode" of deployment. We distinguish the following:
- 'production' mode
- 'staging' mode
- 'test' mode (not yet in use)
- 'localdev' mode
There is also 'envless', for when you run tests without actually spinning up the entire application.
The DB (database, PostgreSQL) and KV (key-value store, Redis) are designed to vary very little depending on their mode, accepting simple configuration options and allowing themselves to be wrapped by simple scripts to handle different modes.
The Server and Pages have more significant differences between modes.
In general, modes are pre-selected for deploy builds, but for container builds they are selected at build time (in CI). Furthermore, deploy builds can generally be run locally, while container builds are run in CI to ensure reproducibility.
Staging and production
For a complete setup including backend, first ensure the containers are built using GitHub Actions for the environment you want to deploy. Then, SSH into the cloud server you want to deploy it to.
The following tools are necessary, in addition to those assumed to be installed on a standard Linux server:
- Python 3.10 (matching the version requirements of the dodeka repository)
- Poetry (to install the dependencies in the dodeka repository)
- gh (GitHub CLI, logged into an account with access to the dodeka repository; optional if authenticated by other means)
- nu (a shell scripting language, see also how to install)
- tidploy (a tool built specifically for deploying this project)
- bws (Bitwarden Secrets Manager CLI)
The last three tools are all Rust projects, so they can be built from source using cargo install <tool name>. However, this can be very slow on the VM, so installing them as binaries using cargo binstall <tool name> is recommended (install binstall first using cargo install cargo-binstall).
To deploy, simply clone this repository and enter the main directory. Make sure you have updated the repository recently so you have the newest deploy script versions. Then, run tidploy auth deploy and enter the Bitwarden Secrets Manager access token.
Then, you can deploy using tidploy deploy production (or tidploy deploy staging for staging). You can also use the Nu shortcut:
nu deploy.nu create production
If you want to specify the specific Git tag (i.e. a release or commit hash) to deploy, use:
nu deploy.nu create staging v2.1.0-rc.3
That's it!
Shutdown
You can observe the active compose project using docker compose ls. Then you can shut it down by running (from the dodeka repository):
nu deploy.nu down production
If the suffix of the Docker Compose project name is different from latest, replace it with that. It will in general be equal to the tag you deployed with (which is 'latest' by default), but with periods replaced by underscores. For example, nu deploy.nu create staging v2.1.0-rc.3 can be shut down with nu deploy.nu down staging v2_1_0-rc_3.
Setting up the server from scratch
These are all the commands I used when I set up the server from a clean image on January 10, after we migrated the email to the new provider.
Preparing SSH
First we rebuild the image from Ubuntu 22.04.
We reset the root password using Rescue -> Reset root password. I recommend then changing it to a new password once inside again (through passwd root).
Login using the console GUI.
Go to /etc/ssh and change the SSH server settings in sshd_config:
PermitRootLogin no
PasswordAuthentication no
Then we create a new user (-m creates home directory, then we add them to sudo group):
useradd -m backend
adduser backend sudo
Then do passwd backend and set up a password.
Switch to the user using su backend.
The default shell might not be bash (for example if your prompt starts with only '$'), in that case run:
chsh --shell /bin/bash
Create a new directory: mkdir /home/backend/.ssh. Enter this directory (using cd) and then do nano authorized_keys to open/create a new file there.
Paste in your SSH public key (!) (something like ssh-ed25519 .... tiptenbrink@tipc) and save the file (Ctrl-X).
Then, ensure the file has the correct permissions:
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
SSH niceties
Install "xauth" (while logged in as root)
apt install xauth
Then, log in to backend (su backend) and log out again. If you're using a nice terminal emulator, like kitty, you need to add the xterm file to the server. To do this, prepend kitty +kitten to your ssh command one time, like:
kitty +kitten ssh backend@<ip here>
After that you can just login normally. Other terminal emulators might need other instructions.
From this point on we never need to be logged in as root anymore!
Update packages
sudo apt update
sudo apt upgrade
You might have to reboot after this: reboot.
Install basic C compiler
sudo apt install build-essential
Install Rust
Go to https://rustup.rs/
Run the listed script. Choose "Customize", then profile "minimal".
Install cargo binary install tools
cargo install cargo-quickinstall
cargo quickinstall cargo-binstall
Install nu, tidploy
cargo binstall nu
cargo binstall tidploy
Install GH CLI
Follow these instructions.
Login to DodekaComCom on GitHub
Using its password.
Setup gh
Login using gh auth login, then use a correctly scoped auth token that you got from the DodekaComCom account.
Add backend user to Docker group
sudo usermod -aG docker backend
Then logout and back in again.
Login to ghcr.io
docker login ghcr.io
Use another access token, this one only has to read the org and have access to packages.
Clone dodeka
gh repo clone dsav-dodeka/dodeka
Set tidploy auth key
Now, ensure all necessary secrets are accessible by the access token you're going to set. Then enter the dodeka repository and do:
tidploy auth bws
Then enter the access token.
Deploy
Next, run:
tidploy deploy -d deploy/use/production
This will start the backend and database.
Optional: restore database
In case you have all the files from the volume that contained the database, you want to restore these. First, get them to the server, for example using scp. We assume we have a folder called backup in our current directory that contains all the Postgres files. Ensure the database is down again (using docker compose -p dodeka-production-latest down).
Then you can do:
docker run --rm -it -v ./backup:/from -v dodeka-db-volume-production-latest:/to alpine ash
This will put you into a container with the recently created, empty database at /to and the backup at /from. First, clean out the new folder using rm -rf * while inside the /to folder (don't do this in the container root directory!).
Then, you can run:
cd /from ; cp -av . /to
Now, restart the database. Everything should work now.
Setup nginx
Install it:
sudo apt install nginx
Start it:
sudo systemctl start nginx
For some extra details also see the full section on nginx and certificates.
Setup non-HTTPS config
Go to /etc/nginx. Every file here is root-protected, so use sudo before each of the following commands.
Go to the sites-available subdirectory.
Do nano api.dsavdodeka.nl and paste the following basic config:
server {
root /var/www/api.dsavdodeka.nl/html;
index index.html index.htm index.nginx-debian.html;
server_name api.dsavdodeka.nl;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to displaying a 404.
proxy_pass http://localhost:4241;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
server {
listen 80;
}
Create a symlink from the available to enabled:
sudo ln -s /etc/nginx/sites-available/api.dsavdodeka.nl /etc/nginx/sites-enabled/api.dsavdodeka.nl
If necessary, restart nginx:
sudo systemctl restart nginx
If you go to http://api.dsavdodeka.nl (not https) you should get "Hallo: Atleten"!
Certbot/Let's Encrypt
Install snap
sudo apt install snapd
Install certbot
sudo snap install --classic certbot
Run certbot
sudo certbot --nginx
First, enter your e-mail. Then it will give you a list of domains you want to install the certificate for; choose the number indicating api.dsavdodeka.nl (probably 1).
Cleanup nginx config
You probably want to clean up your nginx config.
There might be a server block saying only:
server {
listen 80;
}
Delete this, the Certbot block should handle this now.
If necessary, restart nginx:
sudo systemctl restart nginx
Optional: install Python and Poetry
You can create database backups and migrate the database using Python. First, we need to install a Python version that has the same major version as the backend server requires.
To make it easier to install new versions in the future, let's use pyenv. I recommend not installing it using Homebrew, as that might interfere with some other core packages. Instead, use their install script and follow the instructions to put it into the path. These were, when last checked:
curl https://pyenv.run | bash
To add it to the path and load it automatically, add the following to .bashrc (in the server home folder):
export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
Then, install the correct Python version using:
pyenv install <exact version>
Note that it will install Python from source, so this could take a while. If there is an error, take a look at all required packages that must be installed.
Then, go into the project directory and run pyenv local <exact version>. Now, the Python version should be the correct one if you run python.
Next, we will install Poetry to manage our dependencies. I recommend using pipx (which you can just install using sudo apt install pipx), so: pipx install poetry.
Then, we want to make our Poetry environment use the correct version. Most likely, the Python version was installed into ~/.pyenv/versions/<version>/bin/python, so you can use (once you are in the backend/src directory):
poetry env use ~/.pyenv/versions/<version>/bin/python
Now, we can run commands in our environment using poetry run <command>.
Fin
That was it: with less than 300 lines of instructions you can completely set up a Linux server from scratch, in a simple and secure way.
nginx and SSL certificates
When we start the application, everything will simply be accessible from localhost. However, localhost is not accessible outside the server. Instead, we use a so-called "reverse proxy" to allow access to the application. This reverse proxy also ensures we can use TLS (which is what gives you that important https link, meaning all communication is encrypted).
nginx configuration
The configuration for nginx is stored in /etc/nginx/sites-available:
server {
root /var/www/api.dsavdodeka.nl/html;
index index.html index.htm index.nginx-debian.html;
server_name api.dsavdodeka.nl;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to displaying a 404.
proxy_pass http://localhost:4241;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
listen [::]:443 ssl ipv6only=on; # managed by Certbot
listen 443 ssl; # managed by Certbot
ssl_certificate /etc/letsencrypt/live/api.dsavdodeka.nl/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/api.dsavdodeka.nl/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
server {
if ($host = api.dsavdodeka.nl) {
return 301 https://$host$request_uri;
} # managed by Certbot
listen 80;
listen [::]:80;
server_name api.dsavdodeka.nl;
return 404; # managed by Certbot
}
Everything that says "# managed by Certbot
" we can ignore. The important part is the location
block. It basically says to forward http://localhost:4241, which is the URL of our application, to the outside world at port 80 (and 443 for TLS). The server_name
is also important, because this tells nginx to only forward it if these requests come from api.dsavdodeka.nl
. The rest is all unchanged from the defaults.
The file in sites-available is not the one actually read; instead, nginx reads from sites-enabled. The latter is a symlink to the former, so it automatically stays up to date. This can be achieved using a command like:
sudo ln -s /etc/nginx/sites-available/api.dsavdodeka.nl /etc/nginx/sites-enabled/api.dsavdodeka.nl
Certbot (Let's Encrypt)
TLS certificates are necessary to have an encrypted https URL. These are given out by special providers. We use the Certbot tool to get one from Let's Encrypt. First, the site must be available at a normal http link on port 80. Then, after running Certbot (see the instructions above), it makes sure you can use https and modifies the nginx configuration to make that happen.
Secrets
Passwords
- Hetzner account studentenatletiek@av40.nl
- root user password of the server
- backend user password of the server
- Everything in dodekasecrets (Postgres password, Redis password, server secret) and the passphrase used to encrypt these secrets
- Symmetric encryption key for refresh tokens and asymmetric signing key for access and ID tokens (stored in the database)
Database
Upgrade Postgres major version
First, prepare the new image with the new version of Postgres. Then, get both containers to run on the same machine.
For this, deploy using use/repl, ensuring that all the settings are what you want the new database to have. The password can be set at every start, so that doesn't matter. Later, we will change the volume name before running it the normal way.
Next, log into the old database, e.g. using docker exec -it -w /dodeka-db d-dodeka-db-1 /bin/bash. Here 'd-dodeka-db-1' is the container name of the old database and '-w /dodeka-db' means we enter the main DB directory.
Then, use pg_dumpall, where '3141' is the local port it's running on and 'dodeka' is the main db user:
pg_dumpall -p 3141 -U dodeka > ./upgrade_dump.sql
Ensure that after this dump is made, no more changes are made to the db, as these will be lost. Best is to shut off any external access (e.g. by shutting down the webserver for a short moment).
Now, you must get the file upgrade_dump.sql to the new database, which you should already have initialized.
If you leave the container, you can use docker cp, for example like this:
docker cp d-dodeka-db-1:/dodeka-db/upgrade_dump.sql ./upgrade_dump.sql
Here we again assume the old container is named 'd-dodeka-db-1', the dump file path is '/dodeka-db/upgrade_dump.sql' inside the container and you want to copy it to your local folder. Now, we copy it to the new container (immediately deleting the file locally):
docker cp ./upgrade_dump.sql d-dodeka_repl-db-1:/dodeka-db/upgrade_dump.sql && rm ./upgrade_dump.sql
Notice the other container's name is 'd-dodeka_repl-db-1'. We must now again enter the new database and restore the data.
We do this using the following command (once we have entered with docker exec -it -w /dodeka-db d-dodeka_repl-db-1 /bin/bash):
psql -p 3141 -U dodeka -d postgres -f ./upgrade_dump.sql
Note that this will also restore the passwords as previously set, so login to check if everything is there with the previous password.
Finally, we need to replace the old container's Docker volume by the new one. There are no great solutions for this. First, delete the old volume and recreate it with Docker Compose, like:
docker volume rm d-dodeka-db-volume-production
You want to recreate the volume with Compose, because it will give a warning if done manually. A way to do this is to simply run deploy.sh on the database, which will automatically create a volume if none existed.
# Copies from your new db's volume to the old one using a temporary container
docker run --rm -it -v d-dodeka_repl-db-volume-production:/from -v d-dodeka-db-volume-production:/to alpine ash -c "cd /from ; cp -av . /to"
# Since our old volume name now contains the new one's data, we can delete the new one and reuse the old one
docker volume rm d-dodeka_repl-db-volume-production
Now, ensure your old deployment uses the new image and restart it. Everything should work then.
Recap
# Update dodeka repo on server
# Deploy repl database
cd use/repl
./repldeploy.sh
# Turn off database access (shut down apiserver)
# Enter database
docker exec -it -w /dodeka-db d-dodeka-db-1 /bin/bash
# Dump all
pg_dumpall -p 3141 -U dodeka > upgrade_dump.sql
# Copy from old database to local
docker cp d-dodeka-db-1:/dodeka-db/upgrade_dump.sql ./upgrade_dump.sql
# Copy from local to repl database
docker cp ./upgrade_dump.sql d-dodeka_repl-db-1:/dodeka-db/upgrade_dump.sql && rm ./upgrade_dump.sql
# Enter repl database
docker exec -it -w /dodeka-db d-dodeka_repl-db-1 /bin/bash
# Restore database
psql -p 3141 -U dodeka -d postgres -f ./upgrade_dump.sql
# Delete dump file
rm ./upgrade_dump.sql
Migrate database schema
First, log in to the production server and enter the dodeka repository.
The schema is then in the backend/src/schema directory.
To load env var, use:
read -s -p $'Enter POSTGRES_PASSWORD:\n' POSTGRES_PASSWORD
Then, be sure you have a GitHub token ready from an account with access to the backend repository, with at least read:org and repo scope (preferably the DodekaComCom account).
Then, from the main repo directory, run:
./use/data_sync/migrate_env.sh
This will prompt you for the token. Paste it in and press enter. A new migrate directory will have appeared, which has been copied from the backend repository (the src/schema package, to be precise).
Now, run alembic:
poetry run alembic revision --autogenerate -m "<Some migration message>"
For troubleshooting, refer to the backend setup docs.