Operations grimoire/NetBox
Latest revision as of 16:10, 4 August 2024
NetBox is the current source of truth for the network. It's available at NetBox.
Integration with Salt
Integration projects are in progress to serve NetBox data as part of the Salt pillar, for configuration as code.
It should serve:
- /etc/hosts data for Drake network machines
- the same information available at pillar/nodes/nodes.sls, so we won't need to maintain that anymore
Integration with IRC
Export to a database for Odderon is discussed at T1868.
That should allow the following workflow:
17:13:59 < Dereckson> 172.27.27.6
17:14:00 < Odderon> (Dereckson): 172.27.27.6/28 docker-001.nasqueron.org / Decommissioned address for docker-001.nasqueron.org / Drake [deprecated]
The source code for this translation is available at https://devcentral.nasqueron.org/source/netbox-darkbot-db/
Backup
To back up the PostgreSQL database:
sudo -u postgres pg_dump netbox | gzip > netbox-$(date +"%Y-%m-%d").gz
The database is currently hosted on WindRiver, not on DB-A, even though a user and an outdated copy exist there.
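A dump is only useful if it can be read back. As a minimal sketch, the helpers below rebuild the dated file name used by the command above and check that the resulting file is a non-empty, valid gzip stream. The function names `netbox_backup_name` and `netbox_backup_ok` are hypothetical, not part of any existing tooling.

```shell
# Hypothetical helpers: names are illustrative, not existing tooling.

# Build the dated file name used by the backup command.
netbox_backup_name() {
    echo "netbox-$(date +"%Y-%m-%d").gz"
}

# Check that a backup file exists, is not empty, and is a valid gzip stream.
# Returns 0 when the file looks usable, non-zero otherwise.
netbox_backup_ok() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

For example, `netbox_backup_ok "$(netbox_backup_name)" || echo "backup looks broken"` can be run right after the dump, from the same directory. This only checks the compression layer, not the SQL content.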
Upgrade
To upgrade NetBox:
- Back up the database on WindRiver
- Install the new version
- Copy configuration
- Update Python dependencies
- Shut down the service:
service netbox stop
or, if the pid is wrong:
killall netbox
- Symlink the directory in /srv/netbox
- Run the database migrations and regenerate static resources, etc. as documented in upgrade.sh
- Start the service
An example for an upgrade to 3.7.0:
sudo -u postgres pg_dump netbox | gzip > netbox-$(date +"%Y-%m-%d").gz
cd /srv/netbox
wget https://github.com/netbox-community/netbox/archive/refs/tags/v3.7.0.tar.gz
tar xzf v3.7.0.tar.gz
cp /srv/netbox/netbox/netbox/netbox/configuration.py netbox-3.7.0/netbox/netbox/
source venv/bin/activate
pip install -r netbox-3.7.0/requirements.txt
killall -u netbox
rm netbox
ln -s netbox-3.7.0 netbox
cd /srv/netbox/netbox
mkdocs build
cd /srv/netbox/netbox/netbox
python3 manage.py migrate
python3 manage.py collectstatic
python3 manage.py remove_stale_contenttypes
python3 manage.py reindex --lazy
python3 manage.py clearsessions
service netbox start
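The symlink switch is the step where a typo in the version number hurts the most, since the old link is removed before the new one is created. A minimal sketch of a guarded variant follows. The function name `netbox_switch_release` and its arguments are hypothetical; it assumes the release directories are named `netbox-<version>` under the base directory, as in the example above.

```shell
# Hypothetical helper: point the "netbox" symlink at a given release.
# $1: base directory (e.g. /srv/netbox), $2: version (e.g. 3.7.0)
netbox_switch_release() {
    base=$1
    version=$2

    # Do nothing if the target release has not been extracted.
    if [ ! -d "$base/netbox-$version" ]; then
        echo "missing release directory: $base/netbox-$version" >&2
        return 1
    fi

    # Refuse to remove "netbox" if it is a real file or directory, not a symlink.
    if [ -e "$base/netbox" ] && [ ! -L "$base/netbox" ]; then
        echo "$base/netbox is not a symlink" >&2
        return 1
    fi

    # Same two steps as the manual procedure: remove the old link, create the new one.
    rm -f "$base/netbox"
    ln -s "netbox-$version" "$base/netbox"
}
```

With this, `netbox_switch_release /srv/netbox 3.7.0` would replace the `rm netbox` and `ln -s netbox-3.7.0 netbox` lines, and leaves the current link untouched when the version directory does not exist.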
To track configuration changes, you can use https://github.com/netbox-community/netbox/commits/develop/netbox/netbox/configuration_example.py
If psycopg can't build, you can build it with pip install "psycopg[c]". That requires the development headers for PostgreSQL, which are available on all FreeBSD servers where PostgreSQL is installed; for a Docker image based on Linux, look for a libpq-dev or libpq-devel package.
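The C build of psycopg locates the PostgreSQL development files through the `pg_config` program, so checking for it on the PATH is a quick way to tell whether the headers are installed. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: report whether pg_config is available on the PATH.
# pg_config ships with the PostgreSQL development files.
have_pg_config() {
    command -v pg_config >/dev/null 2>&1
}
```

For example, `have_pg_config && pip install "psycopg[c]"` only attempts the build when the prerequisite is present.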
The documentation needs to be generated before collectstatic. If it wasn't, run the collectstatic command again to copy the docs from the project-static directory to the static directory.
Troubleshoot
Grafana
Use the Grafana dashboard (https://grafana.nasqueron.org/d/O6v4rMpizda/netbox?orgId=1&refresh=10s) to check statistics.
As of 2024-08-04, all the rates are erroneous and heavily overestimated.
500 after server reboot
After the server is restarted, NetBox needs PostgreSQL and Redis to work.
- If PostgreSQL is down, the web UI shows no message and directly serves a 500 error code
- If PostgreSQL is up but Redis is down, an explicit error message says it can't connect to Redis
- When both services are up, the site should work again
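The three cases above can be checked from a shell before looking at the web UI. The sketch below is one possible way to do it, assuming `pg_isready` and `redis-cli` are installed on the server and reach the instances NetBox uses with their default connection settings. The function name and the `PG_PROBE` / `REDIS_PROBE` variables are hypothetical.

```shell
# Probe commands, overridable so the logic can be exercised without live services.
# Assumption: default host and port match the NetBox configuration.
: "${PG_PROBE:=pg_isready -q}"
: "${REDIS_PROBE:=redis-cli ping}"

# Hypothetical helper: print which symptom to expect from NetBox.
# Returns 0 when both services answer, 1 if PostgreSQL is down, 2 if only Redis is down.
netbox_dependencies_status() {
    # PostgreSQL is checked first: when it is down, NetBox serves a bare 500.
    if ! $PG_PROBE >/dev/null 2>&1; then
        echo "PostgreSQL is down: expect a 500 error without any message"
        return 1
    fi
    # With PostgreSQL up, a Redis outage shows up as an explicit error message.
    if ! $REDIS_PROBE >/dev/null 2>&1; then
        echo "Redis is down: expect an explicit Redis connection error"
        return 2
    fi
    echo "PostgreSQL and Redis are up: the site should work"
}
```

Running `netbox_dependencies_status` after a reboot tells which service to start first. The probes are checked in the same order as the list, so a PostgreSQL outage hides the state of Redis.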
Conventions
Devices
Device role colors
- Network appliances and similar devices (e.g. machines with a router role): gray
- Devices for humans to connect to and work from (e.g. shellserver, devserver): purple
- Devices hosting services (e.g. dbserver, paas-docker): pink