Commit d0d2993

Homelab: Full architecture redesign (#486)
Complete redesign of my Homelab to switch to a much simpler, more lightweight, secure and resource-efficient architecture:

- Replace Proxmox on physical hosts with a lightweight Arch Linux install that only runs basic services like `sshd`, `firewalld`, `fail2ban`, etc., as well as rootless LXC & Podman (see next points) [1].
- Replace VMs with rootless / unprivileged LXC system containers (acting "VM-like", with a dedicated hostname, IP, resource quotas, direct SSH access, etc.).
- Replace Docker with rootless / unprivileged Podman application containers.
- Application containers (Podman) run directly on the physical hosts (rather than in dedicated VMs as before).
- Tasks initially managed via Proxmox (snapshots, backups, etc.) are handled via Ansible (in addition to the "classic" administration tasks, system & container updates, etc.). The Ansible architecture is also reworked / simplified.
- All rootless / unprivileged containers are run by a dedicated, unprivileged user with locked-down login mechanisms (no SSH access, no password authentication, no `su - user`, etc.; only `sudo -iu user` / `sudo machinectl shell user@` from another *privileged* user).
- Day-to-day administration tasks, log viewing, etc. are done from a separate / classic user.
- HA for critical services is managed on the software side by running clusters whose nodes (LXC containers) are spread across the different physical hosts, with a floating VIP managed at the service level by `keepalived`.

This should result in a much simpler, more pragmatic, secure & resource-efficient architecture overall, though at the cost of handy / "QOL" features such as the centralized Proxmox WebUI, one-click snapshots / backups, built-in HA fencing, live migration, etc. However, some of those handy features can (and will) be replaced by custom scripting / Ansible playbooks (e.g. for snapshots / backups).

VMs are a bit more flexible and have better isolation by design, but they are also more complex and far less resource-efficient than containers. As for other things like HA fencing, live migration and so on: while they are handy, they also come with some technical overhead. For instance, I need to run a third quorum node on a Raspberry Pi to get proper HA from Proxmox, Ceph storage is resource-intensive, cluster management may complicate things in specific situations (e.g. when upgrading / reinstalling nodes to a new major version), etc. And it's fair to say that I don't really *need* those things for my Homelab.

All in all, this is trading (some) convenience for simplicity (architecture-wise), with everything it implies: lighter on resources, a smaller attack surface, lighter maintenance, at the cost of losing certain extra features or having to rely on custom solutions for those.

It's also a fun experiment that I'm looking forward to trying. I *might* write a blog post with my feedback about this change after some time. 🙂

[1] I initially considered Alpine Linux, which felt like the perfect candidate for such tiny / "appliance-like" servers. Unfortunately, it turns out that I wouldn't be able to properly run everything I want / need on it, mostly due to the lack of systemd ecosystem support. For instance, `podman auto-update` strictly requires the containers to run from systemd services. I also haven't been able to start / run Arch Linux (or any distribution running systemd >= v258, which therefore strictly requires cgroups v2) in unprivileged LXC containers there, presumably because `OpenRC` doesn't handle cgroups delegation, which is a deal breaker for me.
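
Footnote [1] points out that `podman auto-update` needs containers to run from systemd services. The commit does not show which mechanism is used for that, so the following is only a sketch of one possible approach, assuming a Podman version that ships Quadlet. The `quadlet_unit` helper, the container name and the image reference are all hypothetical; the function only prints a unit file that opts a container in to auto-updates.

```shell
#!/bin/sh
# quadlet_unit NAME IMAGE
# Print a Quadlet ".container" unit that runs IMAGE as a systemd service and
# opts it in to `podman auto-update`. NAME and IMAGE are illustrative values,
# not taken from the repository.
quadlet_unit() {
    name=$1
    image=$2
    cat <<EOF
[Unit]
Description=${name} container

[Container]
ContainerName=${name}
Image=${image}
AutoUpdate=registry

[Install]
WantedBy=default.target
EOF
}

# Possible rootless usage (hypothetical names and paths):
#   quadlet_unit demo registry.example.com/demo/app:latest \
#       > ~/.config/containers/systemd/demo.container
#   systemctl --user daemon-reload
#   systemctl --user start demo.service
#   podman auto-update --dry-run
```

The image is given as a fully qualified reference because registry-based auto-updates compare the local image against that exact remote reference.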
1 parent 8613aec commit d0d2993

243 files changed

Lines changed: 1792 additions & 556 deletions

.github/workflows/CI.yml

Lines changed: 5 additions & 3 deletions
@@ -15,13 +15,15 @@ jobs:
         run: actionlint -ignore 'label "CI-CD" is unknown' .github/workflows/CI.yml

       - name: Run codespell
-        run: codespell --enable-colors
+        run: codespell --enable-colors -L chage

       - name: Run mdl
         run: mdl --style .github/workflows/mdl_style.rb .

       - name: Run ansible-lint
-        run: ansible-lint -q --force-color --config-file .github/workflows/ansible-lint.conf --project-dir Ansible/playbooks/
+        run: ansible-lint -q --force-color
+        working-directory: Ansible

       - name: Run ansible-inventory
-        run: error=$( { ansible-inventory --list --yaml -i Ansible/inventories/ > /dev/null; } 2>&1 ); [ -n "${error}" ] && echo "${error}" && exit 1 || exit 0
+        run: error=$( { ansible-inventory --list --yaml > /dev/null; } 2>&1 ); [ -n "${error}" ] && echo "${error}" && exit 1 || exit 0
+        working-directory: Ansible
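
The `Run ansible-inventory` step relies on a compact shell idiom: stdout is discarded, stderr is captured into a variable, and the step fails if anything was captured (which catches warnings that do not change the exit code). The sketch below spells that idiom out as a function; `fail_on_stderr` is a hypothetical name used for illustration and is not part of the workflow.

```shell
#!/bin/sh
# fail_on_stderr CMD [ARGS...]
# Run CMD, discard its stdout, and fail if it wrote anything to stderr.
fail_on_stderr() {
    # Inside the command substitution, stdout is the capture pipe.
    # "2>&1" on the brace group points stderr at that pipe, then the inner
    # "> /dev/null" sends the command's stdout away, so only stderr is captured.
    error=$( { "$@" > /dev/null; } 2>&1 )
    if [ -n "${error}" ]; then
        echo "${error}"
        return 1
    fi
    return 0
}

# Example, mirroring the CI step (requires Ansible to be installed):
#   fail_on_stderr ansible-inventory --list --yaml
```

Using an explicit `if` rather than the `a && b && c || d` chain avoids the case where a failing `echo` would fall through to the `||` branch.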

Ansible/Jenkins_Pipeline_Update_Infra.groovy

Lines changed: 29 additions & 125 deletions
@@ -2,228 +2,132 @@ pipeline {
     agent any

     stages {
-        stage('Update Docker Containers - VMs - Dev') {
+        stage('Update Podman Containers') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Docker_Containers',
+                        job: 'update_podman_containers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VMs'),
-                            string(name: 'ENV', value: 'dev'),
-                            string(name: 'SERVERS', value: 'all'),
+                            string(name: 'SERVERS', value: 'podman'),
                             string(name: 'DANGLING', value: 'true')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Docker Containers - VMs - Dev failed. Aborting pipeline.")
+                        error("Update Podman Containers failed. Aborting pipeline.")
                     }
                 }
             }
         }

-        stage('Update Docker Containers - VMs - Prod') {
+        stage('Update Servers - LXC Core01') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Docker_Containers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VMs'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'all'),
-                            string(name: 'DANGLING', value: 'true')
-                        ],
-                        propagate: true,
-                        wait: true
-                    )
-                    if (result == 'FAILURE') {
-                        error("Update Docker Containers - VMs - Prod failed. Aborting pipeline.")
-                    }
-                }
-            }
-        }
-
-        stage('Update Docker Containers - VPS - Prod') {
-            steps {
-                script {
-                    def result = build(
-                        job: 'Update_Docker_Containers',
-                        parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VPS'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'all'),
-                            string(name: 'DANGLING', value: 'true')
-                        ],
-                        propagate: true,
-                        wait: true
-                    )
-                    if (result == 'FAILURE') {
-                        error("Update Docker Containers - VPS - Prod failed. Aborting pipeline.")
-                    }
-                }
-            }
-        }
-
-        stage('Update Docker Containers - Proxmox - Prod') {
-            steps {
-                script {
-                    def result = build(
-                        job: 'Update_Docker_Containers',
-                        parameters: [
-                            string(name: 'SERVER_TYPE', value: 'Proxmox'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'all'),
-                            string(name: 'DANGLING', value: 'true')
+                            string(name: 'SERVERS', value: 'lxc_core01')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Docker Containers - Proxmox - Prod failed. Aborting pipeline.")
+                        error("Update Servers - Core01 failed. Aborting pipeline.")
                     }
                 }
             }
         }

-        stage('Update Servers - VMs - Dev - All') {
+        stage('Update Servers - LXC Core02') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Servers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VMs'),
-                            string(name: 'ENV', value: 'dev'),
-                            string(name: 'SERVERS', value: 'all')
+                            string(name: 'SERVERS', value: 'lxc_core02')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Servers - VMs - Dev - All failed. Aborting pipeline.")
+                        error("Update Servers - Core02 failed. Aborting pipeline.")
                     }
                 }
             }
         }

-        stage('Update Servers - VMs - Prod - Pmx01') {
+        stage('Update Servers - VPS') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Servers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VMs'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'pmx01')
+                            string(name: 'SERVERS', value: 'vps')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Servers - VMs - Prod - Pmx01 failed. Aborting pipeline.")
+                        error("Update Servers - VPS failed. Aborting pipeline.")
                     }
                 }
             }
         }

-        stage('Update Servers - VMs - Prod - Pmx02') {
+        stage('Update Servers - Rasp') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Servers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VMs'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'pmx02')
+                            string(name: 'SERVERS', value: 'rasp')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Servers - VMs - Prod - Pmx02 failed. Aborting pipeline.")
+                        error("Update Servers - Rasp failed. Aborting pipeline.")
                     }
                 }
             }
         }

-        stage('Update Servers - VPS - Prod - All') {
+        stage('Update Servers - Core02') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Servers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'VPS'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'all')
+                            string(name: 'SERVERS', value: 'core02.rc')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Servers - VPS - Prod - All failed. Aborting pipeline.")
+                        error("Update Servers - Core02 failed. Aborting pipeline.")
                     }
+                    input("Proceed with Update Servers - Core01?")
                 }
             }
         }

-        stage('Update Servers - Rasp - Prod - All') {
+        stage('Update Servers - Core01') {
             steps {
                 script {
                     def result = build(
-                        job: 'Update_Servers',
+                        job: 'update_servers',
                         parameters: [
-                            string(name: 'SERVER_TYPE', value: 'Rasp'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'all')
+                            string(name: 'SERVERS', value: 'core01.rc')
                         ],
                         propagate: true,
                         wait: true
                     )
                     if (result == 'FAILURE') {
-                        error("Update Servers - Rasp - Prod - All failed. Aborting pipeline.")
+                        error("Update Servers - Core01 failed. Aborting pipeline.")
                     }
                 }
             }
         }
-
-        stage('Update Servers - Proxmox - Prod - Pmx02') {
-            steps {
-                script {
-                    def result = build(
-                        job: 'Update_Servers',
-                        parameters: [
-                            string(name: 'SERVER_TYPE', value: 'Proxmox'),
-                            string(name: 'ENV', value: 'prod'),
-                            string(name: 'SERVERS', value: 'pmx02')
-                        ],
-                        propagate: true,
-                        wait: true
-                    )
-                    if (result == 'FAILURE') {
-                        error("Update Servers - Proxmox - Prod - Pmx02 failed. Aborting pipeline.")
-                    }
-                    input("Proceed with Update Servers - Proxmox - Prod - Pmx01?")
-                }
-            }
-        }
-
-        stage('Update Servers - Proxmox - Prod - Pmx01') {
-            steps {
-                script {
-                    catchError(buildResult: 'FAILURE', stageResult: 'SUCCESS') {
-                        build(
-                            job: 'Update_Servers',
-                            parameters: [
-                                string(name: 'SERVER_TYPE', value: 'Proxmox'),
-                                string(name: 'ENV', value: 'prod'),
-                                string(name: 'SERVERS', value: 'pmx01')
-                            ],
-                            propagate: true,
-                            wait: true
-                        )
-                    }
-                }
-            }
-        }
     }
 }
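
Stripped of Jenkins syntax, the reworked pipeline is an ordered rollout: one update job per inventory target, run in sequence, with the whole run aborting at the first failure. The sketch below models only that control flow in shell. The `rollout` function and the update command passed to it are hypothetical stand-ins, and the manual `input` gate between `core02.rc` and `core01.rc` is not modelled.

```shell
#!/bin/sh
# rollout UPDATE_CMD TARGET...
# Call UPDATE_CMD once per TARGET, in order, and stop at the first failure.
# UPDATE_CMD is a hypothetical placeholder for whatever performs the update
# of a single target (in the pipeline, that role is played by a Jenkins job).
rollout() {
    update_cmd=$1
    shift
    for target in "$@"; do
        echo "Updating ${target}"
        if ! "${update_cmd}" "${target}"; then
            echo "Update of ${target} failed. Aborting." >&2
            return 1
        fi
    done
    return 0
}

# Example with the SERVERS values that appear in the pipeline
# (my_update is a hypothetical command taking one target name):
#   rollout my_update lxc_core01 lxc_core02 vps rasp core02.rc core01.rc
```

Keeping the target order explicit in a single list makes it easy to see that the two physical hosts are updated last and one at a time.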

Ansible/ansible.cfg

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+[defaults]
+force_color = True
+host_key_checking = False
+interpreter_python = "/usr/bin/python3"
+inventory = inventory
+roles_path = roles
+retry_files_enabled = False

Ansible/inventories/Antiz.fr/dev

Lines changed: 0 additions & 1 deletion
This file was deleted.

Ansible/inventories/Antiz.fr/prod

Lines changed: 0 additions & 1 deletion
This file was deleted.

Ansible/inventories/Proxmox/prod

Lines changed: 0 additions & 9 deletions
This file was deleted.

Ansible/inventories/Rasp/prod

Lines changed: 0 additions & 2 deletions
This file was deleted.

Ansible/inventories/Template/inventory

Lines changed: 0 additions & 1 deletion
This file was deleted.

Ansible/inventories/VMs/dev

Lines changed: 0 additions & 13 deletions
This file was deleted.
