Provide a libvirt hypervisor for the fleet

The idea was submitted during the internal Easter-eggs weekly meeting. It was approved in principle and Easter-eggs will provide the following hypervisor:

        - CPU : Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
        - RAM : 64 GB
        - Motherboard : X9DRD-iF
        - HDD : 1*ST3000DM001-1CH1 (3TB) 7200 rpm
        - SSD : 2*INTEL SSDSC2BA10 (100GB)

Easter-eggs will also provide a range of public IPv4 addresses (up to 16 are possible), along with the following services:

  • Monitoring of the hardware
  • Disaster recovery in case the machine goes down, including replacing it entirely
  • Backup (except for the Qemu images).

@rlaguerre could be the link between the Hostea project and the larger Easter-eggs sysadmin team, which is not directly involved in the Hostea project. The next actions could be:

  • @dachary Hostea side: keep working on implementing the task list to allow for the integration of the hypervisor
  • @rlaguerre Easter-eggs side: create and follow up on tasks in the internal issue tracker to mirror the relevant tasks in the Hostea issue tracker

This will be an interesting showcase of how to implement transparency when it involves non publicly available resources (the internal Easter-eggs issue tracker). I think the best course of action would be to just copy/paste the content of the internal Easter-eggs issue tracker to https://gitea.hostea.org/. It means the internal issue tracker must not contain any confidential information, which should not be too difficult.

The associated expenses (hardware, time spent by Raphael and others) are also published as explained in the revenue sharing system. Diligently accounting for time spent is an essential part of the daily activity within the Easter-eggs sysadmin team, so this will also be a matter of copy/pasting.

This copy/pasting will be a little tedious and may be automated in the future. For this first time, though, it should not be too much work.

To be continued!

An issue was created to account for the time spent by Easter-eggs when working on providing a libvirt hypervisor for the fleet.

@rlaguerre regarding the configuration:

  • 16 IPv4 will be good enough for the foreseeable future
  • disk space: as much as possible, the minimum would be 500GB
  • No need for IPv6

I’m glad you found a redundant power supply for the hardware :+1:

@rlaguerre regarding the firewall (incoming), the following are necessary on the hypervisor:

  • http, https, ssh, 2222

It is not possible to impose restrictions on outgoing ports because a Gitea / GitLab instance may be set up to wget from an unconventional port. This is common practice when self-hosting because people do not have public IPv4 addresses to spare.

Port 2222 is used to log in to the VM running the Gitea/GitLab instance. Port 22 is reserved for git.
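
The incoming policy above can be written down as a small allow-list. This is only a sketch: it assumes the conventional port numbers for http (80) and https (443), which are not spelled out here, and the names `ALLOWED_INCOMING` and `is_incoming_allowed` are made up for the illustration.

```python
# Incoming TCP ports the hypervisor firewall must let through.
# 80 and 443 are the conventional http/https ports (an assumption:
# the post only names the services, not the numbers).
ALLOWED_INCOMING = {
    "http": 80,
    "https": 443,
    "git-ssh": 22,     # port 22 is reserved for git
    "vm-login": 2222,  # port 2222 is used to log in to the VM
}


def is_incoming_allowed(port: int) -> bool:
    """Return True when the port is in the incoming allow-list."""
    return port in ALLOWED_INCOMING.values()
```

Nothing equivalent exists for outgoing traffic, since outgoing ports cannot be restricted.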

@rlaguerre it’s great that the hypervisor is mostly ready :+1: Here is what comes to mind to prepare it for integration into Gna!

  • Create an LVM volume and mount it to /var/lib/libvirt
  • Install libvirt
  • Create a user dedicated to Enough
  • Add the public ssh key of the Gna! controller so that virsh --connect qemu+ssh://enough@gna-hyp-01.pec.cst.easter-eggs.com/system?keyfile=/home/loic/.enough/l.gna.org/infrastructure_key list works
  • Create the l.gna.org instance to act as the Enough instance managing the hypervisor
  • Store ~/.enough/l.gna.org in https://gitea.gna.org/Hostea/gna-hyp-01-enough

And Enough can take care of the rest.
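
To make the shape of the connection string in the list above explicit, here is a tiny sketch that assembles it from its three parts. The helper name `libvirt_ssh_uri` is hypothetical and this is not how Enough builds the URI internally; the values are the ones quoted in the list.

```python
def libvirt_ssh_uri(user: str, host: str, keyfile: str) -> str:
    """Build a qemu+ssh URI for the system libvirt daemon on a remote host."""
    return f"qemu+ssh://{user}@{host}/system?keyfile={keyfile}"


uri = libvirt_ssh_uri(
    "enough",
    "gna-hyp-01.pec.cst.easter-eggs.com",
    "/home/loic/.enough/l.gna.org/infrastructure_key",
)
print(uri)
```

When passing the result to `virsh --connect` from a shell, it is safer to quote it because it contains a `?`.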

References:

Blocked by this issue:

But the network seems to be properly configured.

@rlaguerre there is one thing we forgot about: solving the problem that the hypervisor IP is in the range allocated to the VMs. Libvirt will route the IP range to the bridge created for the VMs and this will interfere with the default route of the hypervisor.

Ideally the hypervisor would have an IP address that is completely unrelated to the IP range allocated to the VMs and there will be no issue. Would that be possible?
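
A quick way to see the overlap is to test whether the hypervisor address falls inside the VM range. The sketch below assumes the 16 addresses form the block 37.9.143.0/28; the thread only mentions 16 IPv4 addresses and 37.9.143.x, so the exact prefix is a guess. 37.9.143.2 is the address shown for the hypervisor in the ssh warnings further down.

```python
import ipaddress

# Assumption: the 16 public addresses are the /28 below. The exact
# prefix is not stated in the thread.
VM_RANGE = ipaddress.ip_network("37.9.143.0/28")


def overlaps_vm_range(address: str) -> bool:
    """Return True when the address belongs to the range routed to the VMs."""
    return ipaddress.ip_address(address) in VM_RANGE


# Hypervisor address as it appears in the ssh warnings in this thread.
print(overlaps_vm_range("37.9.143.2"))
```

An address for which this check returns False would keep the hypervisor default route independent of the bridge.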

Implementing backups is going to be non-trivial using the libvirt API alone. I think it would be easier and lower maintenance to set up the Enough backup logic on the new hypervisor.

@rlaguerre permissions were granted to the enough user to write in the /var/lib/libvirt/images directory:

root@gna-hyp-01:/var/lib/libvirt/images# ls -lRa .
.:
total 12
drwxrwx--x 3 root libvirt 4096 Oct 15 17:10 .
drwxr-xr-x 8 root root    4096 Oct 10 15:00 ..
drwxrwxr-x 2 root libvirt 4096 Oct 15 17:10 enough

./enough:
total 8
drwxrwxr-x 2 root libvirt 4096 Oct 15 17:10 .
drwxrwx--x 3 root libvirt 4096 Oct 15 17:10 ..
root@gna-hyp-01:/var/lib/libvirt/images# id enough
uid=112(enough) gid=65534(nogroup) groups=65534(nogroup),121(libvirt)

The enough user was made a member of the kvm group.

root@gna-hyp-01:/var/lib/libvirt/images# id enough
uid=112(enough) gid=65534(nogroup) groups=65534(nogroup),106(kvm),121(libvirt)
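
For scripting this kind of check, the group names can be pulled out of the `id` output shown above. This is a rough sketch with a hypothetical helper name, and it assumes group names contain no spaces.

```python
import re


def groups_from_id_output(line: str) -> set[str]:
    """Extract the group names from the groups= field of `id` output."""
    match = re.search(r"groups=(\S+)", line)
    if match is None:
        return set()
    # Each entry looks like 106(kvm): keep what is between the parentheses.
    return set(re.findall(r"\((.*?)\)", match.group(1)))


line = "uid=112(enough) gid=65534(nogroup) groups=65534(nogroup),106(kvm),121(libvirt)"
missing = {"kvm", "libvirt"} - groups_from_id_output(line)
print(missing or "enough is in both groups")
```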

@rlaguerre now that Enough is capable of running commands on the hypervisor, the firewall (maybe?) problem shows up in the same way it did when we tried manually.

loic@tulipe:~$ enough --debug --domain l.gna.org host create try-host
libvirt: QEMU Driver error : Domain not found: no domain with matching name 'try-host'
try-host: building image
Warning: Permanently added 'gna-hyp-01.pec.cst.easter-eggs.com,37.9.143.2' (ECDSA) to the list of known hosts.
Warning: Permanently added 'gna-hyp-01.pec.cst.easter-eggs.com,37.9.143.2' (ECDSA) to the list of known hosts.
Warning: Permanently added 'gna-hyp-01.pec.cst.easter-eggs.com,37.9.143.2' (ECDSA) to the list of known hosts.
[  40.7] Downloading: http://builder.libguestfs.org/debian-11.xz
######################################################################## 100.0%
[  68.9] Planning how to build this image
[  68.9] Uncompressing
[  72.0] Converting raw to qcow2
[  72.9] Opening the new disk
libguestfs: warning: current user is not a member of the KVM group (group ID 106). This user cannot access /dev/kvm, so libguestfs may run very slowly. It is recommended that you 'chmod 0666 /dev/kvm' or add the current user to the KVM group (you might need to log out and log in again).
[ 103.2] Setting a random seed
virt-builder: warning: random seed could not be set for this type of guest
[ 103.4] Running: apt-get --allow-releaseinfo-change update
[ 153.5] Installing packages: sudo
Err:1 http://security.debian.org/debian-security bullseye-security InRelease
Temporary failure resolving 'security.debian.org'
Err:2 http://deb.debian.org/debian bullseye InRelease
Temporary failure resolving 'deb.debian.org'
Err:3 http://deb.debian.org/debian bullseye-updates InRelease
Temporary failure resolving 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/bullseye-security/InRelease  Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye-updates/InRelease  Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Err:1 http://deb.debian.org/debian bullseye InRelease
Temporary failure resolving 'deb.debian.org'
Err:2 http://security.debian.org/debian-security bullseye-security InRelease
Temporary failure resolving 'security.debian.org'
Err:3 http://deb.debian.org/debian bullseye-updates InRelease
Temporary failure resolving 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/bullseye-security/InRelease  Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye-updates/InRelease  Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
sudo
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 1059 kB of archives.
After this operation, 4699 kB of additional disk space will be used.
Err:1 http://deb.debian.org/debian bullseye/main amd64 sudo amd64 1.9.5p2-3
Temporary failure resolving 'deb.debian.org'
E: Failed to fetch http://deb.debian.org/debian/pool/main/s/sudo/sudo_1.9.5p2-3_amd64.deb  Temporary failure resolving 'deb.debian.org'
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
virt-builder: error:
export DEBIAN_FRONTEND=noninteractive
apt_opts='-q -y -o Dpkg::Options::=--force-confnew'
apt-get $apt_opts update
apt-get $apt_opts install 'sudo'
: command exited with an error

If reporting bugs, run virt-builder with debugging enabled and include the
complete output:

virt-builder -v -x [...]


  RAN: /usr/bin/ssh -i /root/.enough/l.gna.org/infrastructure_key enough@gna-hyp-01.pec.cst.easter-eggs.com 'virt-builder' 'debian-11' '--no-cache' '--output' '/var/lib/libvirt/images/enough/l.gna.org/debian-11.qcow2' '--format' 'qcow2' '--size' '6G' '--run-command' 'apt-get --allow-releaseinfo-change update' '--install' 'sudo' '--root-password' 'disabled' '--run-command' 'dpkg-reconfigure --frontend=noninteractive openssh-server' '--run-command' 'useradd -s /bin/bash -m debian || true ; echo "debian ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-debian'

  STDOUT:
[  40.7] Downloading: http://builder.libguestfs.org/debian-11.xz
[  68.9] Planning how to build this image
[  68.9] Uncompressing
[  72.0] Converting raw to qcow2
[  72.9] Opening the new disk
[ 103.2] Setting a random seed
virt-builder: warning: random seed could not be set for this type of guest
[ 103.4] Running: apt-get --allow-releaseinfo-change update
[ 153.5] Installing packages: sudo


  STDERR:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.9/site-packages/cliff/app.py", line 410, in run_subcommand
    result = cmd.run(parsed_args)
  File "/opt/venv/lib/python3.9/site-packages/cliff/display.py", line 115, in run
    column_names, data = self.take_action(parsed_args)
  File "/opt/venv/lib/python3.9/site-packages/enough/cli/host.py", line 21, in take_action
    r = e.host.create_or_update()
  File "/opt/venv/lib/python3.9/site-packages/enough/common/host.py", line 70, in create_or_update
    return lv.create_or_update([name])[name]
  File "/opt/venv/lib/python3.9/site-packages/enough/common/libvirt.py", line 206, in create_or_update
    r[name] = self._create_or_update(
  File "/opt/venv/lib/python3.9/site-packages/enough/common/libvirt.py", line 158, in _create_or_update
    self.image_builder()
  File "/opt/venv/lib/python3.9/site-packages/enough/common/libvirt.py", line 276, in image_builder
    return self._image_builder(self.image_name())
  File "/opt/venv/lib/python3.9/site-packages/enough/common/libvirt.py", line 251, in _image_builder
    self.shell(
  File "/opt/venv/lib/python3.9/site-packages/enough/common/libvirt.py", line 89, in shell
    return sh.Command(self.ssh[0])(*ssh_args, **self.sh_args)
  File "/opt/venv/lib/python3.9/site-packages/sh.py", line 1427, in __call__
    return RunningCommand(cmd, call_args, stdin, stdout, stderr)
  File "/opt/venv/lib/python3.9/site-packages/sh.py", line 774, in __init__
    self.wait()
  File "/opt/venv/lib/python3.9/site-packages/sh.py", line 792, in wait
    self.handle_command_exit_code(exit_code)
  File "/opt/venv/lib/python3.9/site-packages/sh.py", line 815, in handle_command_exit_code
    raise exc
sh.ErrorReturnCode_1: 

  RAN: /usr/bin/ssh -i /root/.enough/l.gna.org/infrastructure_key enough@gna-hyp-01.pec.cst.easter-eggs.com 'virt-builder' 'debian-11' '--no-cache' '--output' '/var/lib/libvirt/images/enough/l.gna.org/debian-11.qcow2' '--format' 'qcow2' '--size' '6G' '--run-command' 'apt-get --allow-releaseinfo-change update' '--install' 'sudo' '--root-password' 'disabled' '--run-command' 'dpkg-reconfigure --frontend=noninteractive openssh-server' '--run-command' 'useradd -s /bin/bash -m debian || true ; echo "debian ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-debian'

  STDOUT:
[  40.7] Downloading: http://builder.libguestfs.org/debian-11.xz
[  68.9] Planning how to build this image
[  68.9] Uncompressing
[  72.0] Converting raw to qcow2
[  72.9] Opening the new disk
[ 103.2] Setting a random seed
virt-builder: warning: random seed could not be set for this type of guest
[ 103.4] Running: apt-get --allow-releaseinfo-change update
[ 153.5] Installing packages: sudo
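
The repeated "Temporary failure resolving" lines suggest checking name resolution first. Below is a minimal sketch of such a check; the function name `can_resolve` and the injectable `resolver` parameter are illustrative choices, not part of Enough or virt-builder. Note that the failure above happens inside the image being built, so a successful check on the hypervisor itself is only a first hint.

```python
import socket


def can_resolve(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True when the hostname resolves, False on a resolver error."""
    try:
        resolver(hostname, None)
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # The two hosts apt failed to resolve in the log above.
    for name in ("deb.debian.org", "security.debian.org"):
        print(name, can_resolve(name))
```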

@rlaguerre now that you have changed the DNS and that virt-sysprep runs on the hypervisor, a virtual machine was created. One step forward :tada:

There is still an issue though: the VM cannot obtain an IP from the libvirt network.

root@gna-hyp-01:/home/eeadmin# virsh console try-host
Connected to domain 'try-host'
Escape character is ^] (Ctrl + ])
[   77.519293] firstboot.sh[355]: DHCPDISCOVER on enp2s0 to 255.255.255.255 port 67 interval 19
try-host login: 

Could this be an issue with the firewall preventing communication between the VM and dnsmasq?

@dachary, you were right: I added a rule to allow DHCP requests. The VM can now obtain an IP.

And now it obtains an IP from the expected range :tada: one step forward.

[   4.9] Installing firstboot command: env PORT=22 ROUTED=enp1s0 NOT_ROUTED=enp2s0 UNCONFIGURED=noname bash -x /root/network.sh
try-host: creating host

Starting install...
Domain creation completed.
try-host: waiting for ipv4 to be allocated
Libvirt.get_ipv4: interfaceAddresses returned {}, Retrying in 1 seconds...
Libvirt.get_ipv4: interfaceAddresses returned {}, Retrying in 2 seconds...
Libvirt.get_ipv4: interfaceAddresses returned {}, Retrying in 4 seconds...
Libvirt.get_ipv4: interfaceAddresses returned {}, Retrying in 8 seconds...
try-host: waiting for 37.9.143.6:22 to come up
Check if SSH is available on 37.9.143.6:22
SSH.wait_for_ssh: [Errno 113] No route to host, Retrying in 1 seconds...

The firewall is probably blocking something more and preventing communication with port 22. Before going into this, it would simplify the setup if the hypervisor itself did not use an IP from the 37.9.143.x range, so that the firewall rules specific to 37.9.143.x that allow access to the VMs are independent of the firewall rules that relate to the hypervisor.

What do you think?
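
As a side note, the delays in the log above (1, 2, 4, 8 seconds) follow an exponential backoff. The sketch below shows the general pattern only; it is not the actual Enough implementation, and the names `retry_with_backoff`, `attempts`, `first_delay` and `sleep` are made up for the illustration.

```python
import time


def retry_with_backoff(operation, attempts=5, first_delay=1, sleep=time.sleep):
    """Call operation until it returns a truthy value, doubling the delay.

    Raises TimeoutError when every attempt returned a falsy value.
    """
    delay = first_delay
    for attempt in range(attempts):
        result = operation()
        if result:
            return result
        # Do not sleep after the last attempt, there is nothing left to wait for.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    raise TimeoutError(f"no result after {attempts} attempts")
```

Passing `sleep` as a parameter makes it possible to exercise the function without actually waiting, for example by collecting the delays in a list.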

@rlaguerre I see just now that you propose to not change the IP of the hypervisor and fix the problem by adding routes, which is also fine, of course. I just don’t know how to do that but if you do, I’ll learn something in the process :slight_smile:

Nice guide discovered by @rlaguerre on libvirt and networking :+1:

https://wiki.libvirt.org/page/VirtualNetworking