Author: Nejat Hakan (nejat.hakan@outlook.de)
Network/Server Monitoring LibreNMS
LibreNMS is a powerful, open-source, and feature-rich network monitoring system that automatically discovers, polls, and graphs data from a wide range of network hardware and operating systems. This guide provides a comprehensive walkthrough from basic setup to advanced customization, designed for university students and aspiring IT professionals who want to master self-hosted network monitoring. Each section includes theoretical knowledge followed by practical, hands-on workshops.
Introduction to LibreNMS
Welcome to the world of LibreNMS! Before we dive into the technical intricacies of installation and configuration, it's essential to understand what LibreNMS is, what it can do for you, and why it has become a popular choice for network monitoring in organizations of all sizes. This introductory section will lay the groundwork for your journey with LibreNMS.
What is LibreNMS?
LibreNMS is a fully featured, open-source network monitoring system (NMS) that provides a wealth of information about your network infrastructure and servers. It is a community-driven fork of Observium, and its development is active and robust. At its core, LibreNMS uses the Simple Network Management Protocol (SNMP) to poll devices, but it also supports other methods for data collection. It can monitor a vast array of device types, including routers, switches, firewalls, servers (Linux, Windows, etc.), printers, and even specialized hardware like UPS systems and environmental sensors.
The primary goal of LibreNMS is to provide a centralized platform for IT administrators and network engineers to gain visibility into the health, performance, and availability of their IT assets. It achieves this by collecting metrics, storing them in a time-series database (typically RRDtool), and presenting them through a user-friendly web interface with graphs, dashboards, and alerting capabilities.
Key characteristics of LibreNMS include:
- Auto-discovery: LibreNMS can automatically discover devices on your network based on various protocols like SNMP, CDP (Cisco Discovery Protocol), LLDP (Link Layer Discovery Protocol), OSPF, and BGP.
- Extensibility: It supports a wide range of operating systems and network hardware out-of-the-box and can be extended to support new devices and applications through custom MIBs (Management Information Bases) and scripts.
- Alerting System: A flexible alerting system allows you to define rules and receive notifications through various channels (email, Slack, Telegram, etc.) when specific conditions are met (e.g., a device is down, CPU usage is high).
- Distributed Polling: For larger networks, LibreNMS supports distributed pollers to scale monitoring capabilities across multiple locations or network segments.
- API Access: A comprehensive API allows for integration with other systems and automation of tasks.
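As a small illustration of the API access mentioned above, the following sketch wraps `curl` in a helper that sends a token-authenticated request to the `/api/v0/` base path. The server URL, the token value, and the helper name `librenms_api_get` are placeholders chosen for this example; the token itself would be created in the web interface once LibreNMS is installed.

```shell
#!/usr/bin/env bash
# Hypothetical values: replace with your own server URL and API token.
LIBRENMS_URL="${LIBRENMS_URL:-https://librenms.example.com}"
LIBRENMS_TOKEN="${LIBRENMS_TOKEN:-changeme}"

# Send an authenticated GET request to a LibreNMS API v0 path.
librenms_api_get() {
  local path="$1"
  curl --silent --fail \
    --header "X-Auth-Token: ${LIBRENMS_TOKEN}" \
    "${LIBRENMS_URL}/api/v0/${path}"
}

# Example: list all devices known to the server (JSON response).
# librenms_api_get devices
```

Keeping the URL and token in variables makes the helper easy to reuse from cron jobs or other automation scripts.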
Key Features and Benefits
LibreNMS offers an impressive suite of features that make it a compelling choice for network monitoring:
- Automatic Discovery: As mentioned, it can automatically discover your network devices, reducing manual configuration effort.
- Wide Device Support: Supports a vast range of network hardware and operating systems from numerous vendors. This includes common metrics like CPU, memory, storage, network interface traffic, temperature, and voltage.
- Customizable Dashboard: Allows users to create personalized dashboards displaying the most relevant information at a glance.
- Flexible Alerting: Highly configurable alerting system with support for various notification methods and customizable alert templates. You can set up alerts based on thresholds, device status, specific OIDs, and more.
- Graphing: Generates detailed historical graphs for collected metrics, enabling trend analysis and capacity planning. It primarily uses RRDtool for this purpose.
- Billing System: Includes a traffic accounting system that can be used to monitor and bill bandwidth usage for specific ports or customers.
- Integration: Integrates with other tools and services like Oxidized (for network device configuration backup), Smokeping (for latency monitoring), and various authentication backends (LDAP, RADIUS, Active Directory).
- Open Source and Community Driven: Being open source means it's free to use, modify, and distribute. The active community provides support, contributes new features, and ensures the software stays up-to-date.
- User-Friendly Web Interface: Provides an intuitive web UI for managing devices, viewing data, and configuring the system.
Benefits of using LibreNMS:
- Improved Network Visibility: Gain a clear understanding of what's happening on your network in real-time.
- Proactive Problem Detection: Identify and address issues before they impact users or services.
- Reduced Downtime: Faster detection and diagnosis of problems lead to quicker resolution and minimized downtime.
- Capacity Planning: Historical data and trend analysis help in planning for future growth and resource allocation.
- Cost-Effective: Being open source, it eliminates licensing fees associated with commercial NMS solutions.
- Customization: Tailor the monitoring setup to your specific needs and environment.
Architecture Overview
Understanding the basic architecture of LibreNMS helps in troubleshooting and extending its capabilities. A typical LibreNMS setup consists of several key components:
- LibreNMS Core Application: This is the PHP-based application that provides the web interface, device discovery logic, polling engine, and alerting system.
- Web Server: A web server like Nginx or Apache is required to serve the LibreNMS web interface. It processes HTTP requests and passes PHP requests to the PHP interpreter.
- PHP Interpreter: LibreNMS is written in PHP, so a PHP interpreter (PHP-FPM or mod_php) is essential to execute the application code.
- Database Server: LibreNMS uses a MySQL or MariaDB database to store device information, configuration settings, event logs, and some performance data (though most time-series data is in RRD files).
- SNMP Tools: Utilities like `snmpget` and `snmpwalk` (part of `net-snmp` or `snmp`) are used by the poller to query SNMP-enabled devices.
- RRDtool: This is a crucial component used to store and display time-series data (e.g., CPU usage, network traffic) in Round Robin Database (RRD) files. RRDtool is responsible for generating the graphs you see in LibreNMS.
- Poller (`poller.php`): A PHP script, typically run via cron, that queries devices for data at regular intervals (usually every 5 minutes).
- Discovery (`discovery.php`): Another PHP script, run via cron, that automatically discovers new devices and updates information about existing devices (e.g., new interfaces, changed IP addresses).
- Scheduler (`cronic` and `laravel-scheduler`): LibreNMS uses a scheduler to manage the execution of various background tasks, including polling, discovery, alerting, and housekeeping.
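To make the role of the SNMP tools more concrete, here is a minimal sketch of the kind of query the poller issues, wrapped in two helper functions. The function names, the address `192.0.2.10`, and the community string `public` are illustrative assumptions; the OIDs are the standard ones for the system description (sysDescr.0) and the interface description column (ifDescr).

```shell
#!/usr/bin/env bash
# Query a device's system description (sysDescr.0) over SNMP v2c.
# Arguments: host, community string (both are placeholders here).
snmp_sysdescr() {
  local host="$1" community="$2"
  snmpget -v2c -c "$community" "$host" 1.3.6.1.2.1.1.1.0
}

# Walk the interface description column (ifDescr) to list interface names.
snmp_interfaces() {
  local host="$1" community="$2"
  snmpwalk -v2c -c "$community" "$host" 1.3.6.1.2.1.2.2.1.2
}

# Example (documentation address and the common "public" community):
# snmp_sysdescr 192.0.2.10 public
# snmp_interfaces 192.0.2.10 public
```

If these commands return data when run by hand from the monitoring server, LibreNMS should be able to discover and poll the device with the same credentials.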
Data Flow:
- The discovery process finds devices and their capabilities.
- The poller queries these devices for performance metrics via SNMP (or other methods).
- Collected data is stored:
- Device metadata and some states go into the SQL database.
- Time-series performance data is stored in RRD files.
- The web server and PHP application present this data to the user through the web interface.
- The alerting system checks defined rules against the collected data and sends notifications if necessary.
In larger environments, Distributed Pollers can be set up. These are separate instances of the poller running on different machines, which then report data back to the central LibreNMS server. This helps distribute the polling load and allows monitoring of devices in remote or isolated network segments.
Why Self-Host LibreNMS?
While SaaS (Software as a Service) monitoring solutions exist, self-hosting LibreNMS offers several distinct advantages, particularly for those who want greater control, customization, and often, cost savings in the long run:
- Complete Control: You have full control over the server, data, and configuration. This means you can customize every aspect of the monitoring system to fit your exact needs, integrate it deeply with your existing infrastructure, and manage security according to your own policies.
- Data Privacy and Security: Your monitoring data, which can be sensitive, stays within your own infrastructure. This is crucial for organizations with strict data sovereignty or compliance requirements.
- No Vendor Lock-in: As an open-source solution, you are not tied to a specific vendor's roadmap, pricing changes, or terms of service. You have the freedom to modify the source code if needed.
- Cost-Effectiveness: While there's an initial setup effort and ongoing maintenance, self-hosting can be significantly cheaper than commercial SaaS solutions, especially as the number of monitored devices grows. There are no per-device or per-sensor licensing fees.
- Learning Opportunity: Self-hosting provides an invaluable learning experience in server administration, network management, database management, and the intricacies of monitoring protocols. This is particularly beneficial for students and IT professionals looking to deepen their skills.
- Customization and Extensibility: LibreNMS is highly extensible. You can write your own pollers, discovery modules, alert templates, and integrate with other internal tools. This level of customization is often not possible with SaaS offerings.
- Offline Access: If your monitoring needs are for an internal network without reliable internet access or if you prefer an air-gapped monitoring solution, self-hosting is the way to go.
However, self-hosting also comes with responsibilities:
- Setup and Configuration: You are responsible for installing and configuring all components.
- Maintenance: This includes OS updates, LibreNMS updates, database backups, and general system upkeep.
- Troubleshooting: You'll need to diagnose and fix any issues that arise.
For many, the benefits of control, customization, and cost savings outweigh these responsibilities, making self-hosted LibreNMS an attractive option.
Workshop Introduction to LibreNMS
Objective: To familiarize yourself with the LibreNMS project, its documentation, and community resources. This workshop does not involve any installation yet but focuses on exploration.
Prerequisites:
- A computer with internet access.
- A web browser.
Tasks:
- Explore the Official LibreNMS Website:
- Navigate to the official LibreNMS website: https://www.librenms.org/
- Read the "About" section to understand the project's philosophy.
- Browse the "Features" section. Identify three features that you find most compelling and think about how they could be useful in a real-world scenario (e.g., monitoring a university campus network or a small business IT infrastructure).
- Look at the "Screenshots" or "Demo" section to get a visual feel for the interface.
- Dive into the Documentation:
- Find the official LibreNMS documentation: https://docs.librenms.org/
- Locate the "Installation" section. Briefly look at the different installation methods available (e.g., Ubuntu, CentOS, Docker). Don't worry about understanding all the details yet.
- Find the "Supported Devices" page. Are there any devices or operating systems listed that you are familiar with or have access to? This will be helpful for later workshops.
- Skim through the "Alerting" section to get an idea of how alerts are configured.
- Check out Community Resources:
- LibreNMS has a strong community. Find links to their community forum (e.g., on community.librenms.org) or Discord/IRC channels.
- Visit the forum and browse a few recent topics. Notice the types of questions being asked and the support provided by the community. This will be a valuable resource if you encounter issues later.
- Explore the LibreNMS GitHub repository: https://github.com/librenms/librenms
- Look at the "Issues" tab. This can give you an idea of ongoing development, bug reports, and feature requests.
- Check the "Pull Requests" tab to see contributions being made to the project.
- Note the activity level (e.g., last commit date, number of contributors) to gauge the health of the project.
- Consider a Use Case:
- Think about a hypothetical (or real) environment you might want to monitor. This could be:
- Your home network (router, PCs, a Raspberry Pi).
- A small lab environment with a few virtual machines.
- A department's IT resources at a university.
- List 3-5 key things you would want to monitor in this environment (e.g., server uptime, router bandwidth, disk space on a file server). How might LibreNMS help you achieve this?
Deliverables/Reflection (for your own notes):
- A list of three compelling LibreNMS features and their potential applications.
- Notes on the general structure of the LibreNMS documentation and where to find key information.
- An impression of the LibreNMS community and its activity.
- A brief description of a potential use case for LibreNMS relevant to you.
This exploratory workshop will give you a solid conceptual foundation before you start with the hands-on installation and configuration in the subsequent sections. Understanding the "what" and "why" will make the "how" much more meaningful.
Basic LibreNMS Setup and Configuration
This section guides you through the essential first steps of getting a LibreNMS instance up and running. We will cover system requirements, different installation approaches, the initial web-based configuration, and finally, adding your very first device to monitor. By the end of this section, you will have a functional LibreNMS server polling data from at least one host.
1. System Requirements and Prerequisites
Before you begin installing LibreNMS, it's crucial to ensure your environment meets the necessary hardware and software requirements. Proper planning at this stage can save you a lot of trouble later on.
Hardware Recommendations
LibreNMS can run on a variety of hardware, from a Raspberry Pi (for very small setups) to powerful dedicated servers. The required resources depend heavily on the number of devices and sensors you plan to monitor.
- CPU:
- Small (1-50 devices): 1-2 CPU cores (e.g., modern Raspberry Pi 4, small VM).
- Medium (50-200 devices): 2-4 CPU cores. A reasonably modern dual-core or quad-core processor.
- Large (200+ devices): 4+ CPU cores. A few fast cores generally serve the poller better than many slow ones. For very large deployments (1000+ devices), consider 8+ cores and investigate distributed pollers.
- Memory (RAM):
- Small: At least 2GB RAM is recommended. While it might run on less, performance will suffer, especially with the web UI and database.
- Medium: 4GB - 8GB RAM.
- Large: 8GB+ RAM. More RAM helps with database caching and PHP processes.
- Storage:
- Type: SSDs (Solid State Drives) are highly recommended for the operating system, LibreNMS application, database, and especially for RRD files. RRD files involve frequent small writes, and SSDs significantly improve I/O performance, impacting graph generation and poller speed.
- Capacity: This depends on the number of devices, ports, sensors, and the data retention period for RRD files.
- OS and LibreNMS application: 20-30GB is usually sufficient.
- RRD files: This is the largest consumer. A rough estimate:
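- A device with many ports and sensors can occupy 50-200MB of RRD files. RRD files are allocated at a fixed size when they are created, so this footprint grows as devices, ports, and sensors are added rather than with every additional year of polling.
- For 100 devices, averaging 100MB each, you'd need about 10GB of RRD storage.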
- Start with at least 50-100GB for RRDs and monitor disk usage. It's easier to expand storage later than to run out.
- Database: The SQL database stores metadata, not the bulk time-series data. It typically grows much slower than RRD storage. 10-20GB is often ample for a long time for medium setups.
- Network: A stable network connection with sufficient bandwidth is required, especially if monitoring many devices or remote devices. A 1 Gbps network interface is standard.
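The rough RRD sizing estimate above is simple multiplication, which the following sketch turns into a reusable helper. The function name `estimate_rrd_gb` and the second set of numbers are made up for illustration; the result uses decimal gigabytes (1 GB = 1000 MB) and is rounded down.

```shell
#!/usr/bin/env bash
# Estimate RRD storage in GB (decimal, 1 GB = 1000 MB, rounded down).
# Arguments: number of devices, average RRD footprint per device in MB.
estimate_rrd_gb() {
  local devices="$1" mb_per_device="$2"
  echo $(( devices * mb_per_device / 1000 ))
}

estimate_rrd_gb 100 100   # the example from the text: prints 10
estimate_rrd_gb 250 150   # a larger hypothetical site: prints 37
```

Treat the output as a lower bound and add headroom, since the per-device average is itself only a guess until real data exists.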
Virtualization:
LibreNMS runs very well in a virtual machine (VM). This offers benefits like easy snapshots, resource scalability, and hardware independence. Ensure your hypervisor provides good I/O performance to the VM, especially for storage.
Software Dependencies
LibreNMS has several software dependencies. The exact versions might change, so always consult the official LibreNMS documentation for the most current requirements.
- Operating System:
- Primarily developed and tested on Linux. Recommended distributions:
- Ubuntu Server (LTS releases, e.g., 20.04, 22.04): Often preferred due to good community support and up-to-date packages.
- Debian (Stable releases): Similar to Ubuntu, very stable.
- CentOS Stream / RHEL / Rocky Linux / AlmaLinux: Also well-supported.
- It's crucial to use a 64-bit OS.
- A minimal server installation is generally preferred to avoid unnecessary software and potential conflicts.
- Web Server:
- Nginx: Highly recommended for performance and efficiency.
- Apache: Also supported.
- PHP:
- Specific PHP versions are required (e.g., PHP 7.4, 8.1, 8.2 – check docs for current).
- Numerous PHP extensions are needed: `gd`, `mysql`, `snmp`, `xml`, `mbstring`, `tokenizer`, `json`, `zip`, `curl`, `bcmath`, `intl`, `ldap` (optional), `opcache` (recommended for performance).
- Database Server:
- MariaDB (version 10.5+ recommended) or MySQL (version 5.7/8.0+ recommended). MariaDB is often favored in the LibreNMS community. Ensure `innodb_file_per_table` is enabled.
- SNMP Utilities:
- The `net-snmp` package (provides `snmpget`, `snmpwalk`, etc.) is essential for SNMP polling.
- `snmpd` (SNMP daemon) if you want to monitor the LibreNMS server itself via SNMP.
- Other Tools:
- `git`: For downloading and updating LibreNMS.
- `composer`: PHP dependency manager, used to install PHP libraries.
- `rrdtool`: For storing and graphing time-series data.
- `fping` or `ping`: For ICMP reachability checks. `fping` is generally preferred for its ability to ping multiple hosts efficiently.
- `python3-memcache` (and `memcached` server): Optional but highly recommended for caching, which improves performance.
- `cron`: For scheduling regular tasks like polling and discovery.
- `whois`: For some discovery features.
- `mtr-tiny`: For traceroute functionality.
- `imagemagick`: For some image manipulations.
- `nmap`: For host discovery and port scanning.
It's critical to install the correct versions of these dependencies, especially PHP and its extensions, as mismatches can lead to installation failures or runtime errors. The official LibreNMS installation guides usually provide precise commands to install all necessary dependencies for supported operating systems.
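One way to catch missing PHP extensions before they cause installer errors is to compare the list of loaded modules against the requirements. The sketch below does this with a hypothetical helper, `missing_extensions`, that takes the module list as its first argument so it can be fed the output of `php -m`. The extension names in the usage comment follow what `php -m` prints and are an assumption to check against the current official requirements.

```shell
#!/usr/bin/env bash
# Print every required extension that is missing from a module list.
# $1: newline-separated loaded modules (for example the output of `php -m`)
# remaining arguments: extension names to look for
missing_extensions() {
  local loaded ext
  loaded="$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]')"
  shift
  for ext in "$@"; do
    printf '%s\n' "$loaded" | grep -Fqx "$ext" || printf '%s\n' "$ext"
  done
}

# Example usage on a server with PHP installed. The names follow what
# `php -m` prints (the MySQL driver shows up as mysqli, not mysql).
# missing_extensions "$(php -m)" gd mysqli snmp xml mbstring curl zip bcmath intl
```

An empty result means every listed extension is loaded; any name that is printed still needs its package installed.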
Network Considerations
- Connectivity: The LibreNMS server must have network connectivity to all devices it intends to monitor. This means:
- IP reachability (can it ping the devices?).
- SNMP traffic (UDP port 161 by default) must be allowed from the LibreNMS server to the monitored devices. Firewalls on the devices themselves, or network firewalls between LibreNMS and the devices, must be configured accordingly.
- DNS: Reliable DNS resolution is important. The LibreNMS server should be able to resolve hostnames of monitored devices if you plan to add them by hostname. Conversely, devices might need to resolve the LibreNMS server's hostname for certain features (like syslog forwarding).
- NTP (Network Time Protocol): Ensure the LibreNMS server and all monitored devices have their time synchronized using NTP. Time discrepancies can cause issues with data correlation, graph display, and log analysis.
- IP Addressing: The LibreNMS server itself should have a static IP address or a DHCP reservation.
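The DNS and reachability points above can be checked from the monitoring server before a device is added. This is a minimal sketch under the assumption that `getent` and `fping` are available; the helper name `preflight` and the hostname in the usage comment are invented for the example.

```shell
#!/usr/bin/env bash
# Report whether a target resolves in DNS and answers ICMP echo requests.
# $1: hostname of the device to check (placeholder value).
preflight() {
  local target="$1"
  if getent hosts "$target" >/dev/null; then
    echo "dns: ok"
  else
    echo "dns: FAILED"
  fi
  # fping exits with status 0 only when the target is reachable.
  if fping -q "$target" >/dev/null 2>&1; then
    echo "icmp: ok"
  else
    echo "icmp: FAILED"
  fi
}

# Example: preflight switch01.example.net
```

A device that passes both checks can still be unreachable over SNMP if UDP port 161 is filtered, so an SNMP query remains the final test.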
Workshop System Preparation
Objective:
To prepare a virtual machine or physical server environment that meets the basic software prerequisites for a LibreNMS installation on Ubuntu Server 22.04 LTS. This workshop focuses on setting up the OS and initial packages, not LibreNMS itself.
Prerequisites:
- Ability to create a new Virtual Machine (using VirtualBox, VMware, Hyper-V, KVM, etc.) or access to a spare physical machine.
- An Ubuntu Server 22.04 LTS 64-bit ISO image.
- Internet access for the server.
Tasks:
- Install Ubuntu Server 22.04 LTS:
- Create a new VM with the following minimum specifications (you can increase these if you have more resources):
- CPU: 2 cores
- RAM: 4 GB
- Storage: 50 GB (SSD if possible)
- Network: Bridged or NAT with port forwarding (ensure it can access the internet and you can access it from your host machine).
- Boot from the Ubuntu Server 22.04 LTS ISO.
- Follow the installation prompts:
- Choose your language.
- Select keyboard layout.
- Choose "Ubuntu Server" (not "minimized").
- Configure networking (DHCP is fine for now, but note the IP address. For a real server, a static IP is recommended).
- Configure proxy if needed.
- Use default mirror.
- For partitioning, "Use an entire disk" and "Set up this disk as an LVM group" is a good default for flexibility. Accept the defaults or customize if you are familiar with partitioning.
- Confirm destructive action.
- Set up your profile: Your name, server's name (e.g., `librenms-server`), username, and a strong password.
- Important: Choose to "Install OpenSSH server" for remote access.
- Do not install any of the "Featured Server Snaps" at this stage (like Docker, Nextcloud, etc.). We will install components manually.
- Wait for the installation to complete, then reboot and remove the installation media.
- Initial Server Configuration:
- Log in to your new Ubuntu server, either directly or via SSH (using the IP address obtained during installation and the username/password you set).
- Update the package list and upgrade existing packages:
    sudo apt update && sudo apt upgrade -y
- Set the system timezone (replace `Your/Timezone` with your actual timezone, e.g., `Europe/Berlin` or `America/New_York`). You can list timezones with `timedatectl list-timezones`.
    sudo timedatectl set-timezone Your/Timezone
- Install basic utilities that are often useful:
    sudo apt install -y curl wget vim git software-properties-common apt-transport-https ca-certificates
- `curl`, `wget`: For downloading files.
- `vim`: A text editor (or use `nano` if you prefer).
- `git`: For version control, needed for LibreNMS.
- `software-properties-common`, `apt-transport-https`, `ca-certificates`: For managing software repositories, especially PPAs or third-party repos.
- Consider User Management (Optional but Recommended):
- The user you created during installation has `sudo` privileges. For LibreNMS, you'll typically run commands as this user or create a dedicated `librenms` user later. For now, using your sudo-enabled user is fine.
Verification:
- You should be able to log in via SSH.
- The system should be up-to-date (`sudo apt update && sudo apt list --upgradable` should show no or few upgrades).
- The timezone should be correctly set.
This prepared server is now a clean slate, ready for the specific dependencies LibreNMS requires, which we will cover in the installation workshop. Having a standardized, updated base OS is crucial for a smooth installation process.
2. Installation Methods
LibreNMS offers a few ways to get it installed on your prepared server. The most common and well-documented method is a manual installation, which gives you the most control and understanding of the components. Docker is another popular option for those comfortable with containerization.
Manual Installation Steps (e.g., on Ubuntu/Debian or CentOS/RHEL)
Manual installation involves installing each component (web server, PHP, database, LibreNMS code) step-by-step. While this might seem more involved, it provides a deeper understanding of how LibreNMS works and integrates with the system. The official LibreNMS documentation provides detailed, distribution-specific guides. Here's a conceptual overview of the typical steps involved for an Ubuntu/Debian based system. Always refer to the official documentation for the latest and most precise commands.
General Phases of Manual Installation:
- Install Dependencies:
- Web Server (Nginx or Apache): Install the chosen web server. Nginx is generally recommended.
- Database Server (MariaDB or MySQL): Install and secure the database server.
- PHP and Extensions: Install the required PHP version (e.g., `php8.1-fpm`) and all necessary PHP extensions listed in the LibreNMS requirements (`php-cli`, `php-mysql`, `php-snmp`, `php-xml`, `php-gd`, `php-mbstring`, `php-curl`, `php-zip`, etc.).
    # Example for PHP 8.1 and extensions on Ubuntu
    sudo add-apt-repository ppa:ondrej/php   # For latest PHP versions
    sudo apt update
    sudo apt install php8.1-fpm php8.1-cli php8.1-mysql php8.1-snmp php8.1-xml php8.1-gd php8.1-mbstring php8.1-curl php8.1-zip php8.1-bcmath php8.1-intl php8.1-gmp   # and others as per docs
- Other Tools: Install `snmp`, `rrdtool`, `fping`, `git`, `composer`, `imagemagick`, `mtr-tiny`, `nmap`, etc.
- Create LibreNMS User and Database:
- Create a dedicated system user for LibreNMS (e.g., `librenms`).
- Log into MariaDB/MySQL and create a database and user for LibreNMS. Grant appropriate privileges.
- Download LibreNMS Code:
- Clone the LibreNMS repository from GitHub into a directory like `/opt/librenms`.
- Set Permissions and Ownership:
- Set the correct ownership and permissions for the LibreNMS directories and files, as specified in the documentation. This typically involves giving the `librenms` user and the web server user (e.g., `www-data`) appropriate access.
    # Example, consult official docs for exact commands
    sudo chown -R librenms:librenms /opt/librenms
    sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
    sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
    # Specific permissions for web server user might also be needed
- Install PHP Dependencies:
- Run `composer install` within the LibreNMS directory to download required PHP libraries. This needs to be done as the `librenms` user or with appropriate permissions.
- Configure Web Server:
- Configure Nginx or Apache to serve the LibreNMS web interface. This involves creating a virtual host configuration file that points to the LibreNMS public directory (`/opt/librenms/html`) and configures PHP processing.
- The LibreNMS documentation provides sample configuration files for Nginx and Apache.
- Enable the new site configuration and restart the web server.
    # Example: copy provided nginx config, enable it, test, and reload nginx
    sudo cp /opt/librenms/misc/librenms.nonroot.nginx /etc/nginx/sites-available/librenms
    # Edit /etc/nginx/sites-available/librenms to set server_name and PHP socket path
    sudo ln -s /etc/nginx/sites-available/librenms /etc/nginx/sites-enabled/
    sudo rm /etc/nginx/sites-enabled/default   # Remove default site if it conflicts
    sudo nginx -t   # Test configuration
    sudo systemctl restart nginx
- Configure PHP-FPM (if using Nginx):
- Ensure PHP-FPM is configured correctly, particularly the user it runs as and the listen socket path, which must match the Nginx configuration. Adjust settings in `/etc/php/8.1/fpm/pool.d/www.conf` (or a dedicated LibreNMS pool).
- Common changes include setting `user` and `group` to `librenms`, and configuring the `listen` directive.
- Configure `snmpd` (Optional, for monitoring LibreNMS server itself):
- If you want LibreNMS to monitor the server it's running on, configure the local SNMP daemon (`snmpd`). Copy the provided `snmpd.conf` from LibreNMS, customize it (especially the community string), and restart `snmpd`.
- Set up Cron Jobs:
- LibreNMS relies on cron jobs for polling, discovery, and other background tasks. Copy the provided cron file (e.g., `/opt/librenms/librenms.cron`) to `/etc/cron.d/librenms`.
- This cron job typically runs every minute and uses LibreNMS's internal scheduler to manage tasks.
- Perform Web-Based Installation:
- Access the LibreNMS web interface in your browser (e.g., `http://your_server_ip_or_hostname`). This should redirect you to the installer (`install.php`).
- Follow the on-screen instructions, which will check prerequisites, configure database settings (you'll enter the database name, user, and password created earlier), and create an admin user.
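The "Create LibreNMS User and Database" phase above can be sketched as follows. The helper `librenms_db_sql` only prints the SQL statements, so they can be reviewed before being piped into the database client. The database name, user name, and password shown are assumptions for this example, and the password must not contain a single quote because it is placed inside a quoted SQL string.

```shell
#!/usr/bin/env bash
# Print the SQL that creates the LibreNMS database and its user.
# $1: password for the database user (placeholder; choose your own).
librenms_db_sql() {
  local password="$1"
  cat <<SQL
CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'librenms'@'localhost' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
SQL
}

# Example usage on the server (requires sudo):
# sudo useradd librenms -d /opt/librenms -M -r -s "$(which bash)"
# librenms_db_sql 'ChangeMe-Str0ng' | sudo mysql -u root
```

Whatever database name, user, and password are used here are the values the web installer asks for later.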
This manual process is detailed but gives you full insight. Always follow the official LibreNMS installation guide for your specific OS, as commands and paths can vary.
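For the PHP-FPM phase described above, one possible approach is to derive a dedicated pool file from the stock `www.conf` instead of editing it in place. The sketch below does this with `sed`. The helper name `make_librenms_pool` and the socket path `/run/php-fpm-librenms.sock` are assumptions for illustration, and whichever socket path is chosen must also appear in the Nginx configuration.

```shell
#!/usr/bin/env bash
# Derive a dedicated LibreNMS PHP-FPM pool file from the stock www.conf.
# $1: source pool file, $2: destination pool file
make_librenms_pool() {
  local src="$1" dst="$2"
  sed -e 's/^\[www]$/[librenms]/' \
      -e 's/^user = .*/user = librenms/' \
      -e 's/^group = .*/group = librenms/' \
      -e 's|^listen = .*|listen = /run/php-fpm-librenms.sock|' \
      "$src" > "$dst"
}

# Example usage on Ubuntu with PHP 8.1 (run as root, then restart PHP-FPM):
# make_librenms_pool /etc/php/8.1/fpm/pool.d/www.conf /etc/php/8.1/fpm/pool.d/librenms.conf
# systemctl restart php8.1-fpm
```

Only the pool name, `user`, `group`, and `listen` lines are rewritten; settings such as `listen.owner` are copied unchanged.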
Docker Installation (Brief Overview and Pros/Cons)
LibreNMS also provides official Docker images, which can simplify deployment, especially if you are already using Docker.
How it works:
The Docker setup typically involves using `docker-compose` to orchestrate multiple containers:
- A container for the LibreNMS application itself (with PHP and web server).
- A container for the MariaDB/MySQL database.
- Optionally, containers for `memcached`, `RRDcached`, or distributed pollers.
Persistent data (database files, RRD files, configuration) is usually stored in Docker volumes to survive container restarts.
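As a rough sketch of the workflow described above, the helper below starts the containers defined by a compose file and then lists their state. It assumes the newer `docker compose` plugin syntax (the standalone `docker-compose` binary accepts the same subcommands), and both the helper name `start_librenms_stack` and the directory in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Start the LibreNMS stack defined by a compose file and show its status.
# $1: directory that contains your docker-compose.yml (placeholder path)
start_librenms_stack() {
  local compose_dir="$1"
  (
    cd "$compose_dir" || exit 1
    docker compose up -d   # create and start all containers in the background
    docker compose ps      # list the containers and their current state
  )
}

# Example: start_librenms_stack "$HOME/librenms-docker"
```

Running the commands in a subshell keeps the caller's working directory unchanged.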
Pros of Docker Installation:
- Simplified Deployment: Fewer manual steps to install dependencies on the host system. Dependencies are managed within the containers.
- Isolation: LibreNMS and its components run in isolated environments, reducing conflicts with other applications on the host.
- Portability: Easier to move the LibreNMS setup between different Docker hosts.
- Reproducibility: Dockerfiles and `docker-compose.yml` define the environment, making setups more consistent.
- Easier Upgrades (Potentially): Upgrading can sometimes be as simple as pulling a new image version and restarting containers (though database migrations and data integrity still need care).
Cons of Docker Installation:
- Learning Curve: Requires familiarity with Docker and `docker-compose` concepts.
- Abstraction: Can make troubleshooting more complex if issues occur within a container, as you're dealing with an extra layer of abstraction.
- Resource Overhead: Docker itself introduces some resource overhead, though usually minimal.
- Networking Complexity: Docker networking can be tricky, especially when integrating with existing network infrastructure or setting up distributed pollers that need to communicate with devices outside the Docker network.
- Customization: While possible, customizing the Dockerized setup (e.g., adding specific PHP extensions not in the official image) might require building your own custom images.
The official LibreNMS Docker repository is at https://github.com/librenms/docker. It provides `docker-compose.yml` examples and instructions.
Which method to choose?
- Manual Installation: Recommended if you want maximum control, a deep understanding of the system, or if you are not yet comfortable with Docker. It's also the most thoroughly documented method in the main LibreNMS docs.
- Docker Installation: A good choice if you are experienced with Docker, want rapid deployment, or prefer containerized applications.
For this guide, we will focus on the manual installation method in the workshop to provide a foundational understanding.
Workshop Installing LibreNMS (Manual Method on Ubuntu 22.04)
Objective: To perform a manual installation of LibreNMS on the Ubuntu Server 22.04 LTS prepared in the previous workshop. We will use Nginx, MariaDB, and PHP 8.1.
Prerequisites:
- The Ubuntu Server 22.04 LTS system prepared in the "Workshop System Preparation."
- Root or sudo access to this server.
- Internet connectivity on the server.
- You should have noted the IP address of your server.
Important Note: These instructions are based on common practices and LibreNMS documentation at the time of writing. Always cross-reference with the latest official LibreNMS installation guide for Ubuntu/Debian, as package names, commands, or recommended versions might change: https://docs.librenms.org/Installation/Install-LibreNMS/
Steps:
- Install Essential Packages (Web Server, Database, PHP, Utilities):
- Log in to your Ubuntu server via SSH.
- Add the PHP PPA for up-to-date PHP versions (if not already done, though official Ubuntu 22.04 repos have PHP 8.1):
- Install Nginx, MariaDB, PHP 8.1 and its extensions, and other required tools. This is a comprehensive list; some might already be installed from the previous workshop.
sudo apt install -y acl curl composer fping git graphviz imagemagick mariadb-client mariadb-server mtr-tiny nginx-full nmap python3-pymysql python3-dotenv python3-redis python3-setuptools python3-pip python3-memcache rrdtool snmp snmpd whois unzip \
    php8.1-cli php8.1-fpm php8.1-curl php8.1-gd php8.1-gmp php8.1-intl php8.1-mbstring php8.1-mysql php8.1-snmp php8.1-xml php8.1-zip php8.1-memcached php8.1-bcmath
- acl: For file access control lists.
- composer: PHP dependency manager.
- fping: For quick ICMP checks.
- graphviz: For some graph rendering.
- imagemagick: Image processing.
- python3-pymysql, python3-dotenv, python3-redis, python3-setuptools, python3-pip, python3-memcache: Python components used by LibreNMS scripts or for integrations.
- Configure MariaDB:
- Start and enable MariaDB:
- Run the secure installation script (set a root password, remove anonymous users, disallow remote root login, remove test database): (Answer 'Y' to most questions. Set a strong root password when prompted.)
- Log in to MariaDB as root:
- Create the LibreNMS database and user. Replace StrongPasswordHere with a very strong, unique password.
- Modify MariaDB configuration for LibreNMS compatibility. Edit /etc/mysql/mariadb.conf.d/50-server.cnf (or a similar file like my.cnf). Under the [mysqld] section, add or modify these lines:
innodb_file_per_table=1
lower_case_table_names=0 # As per recent LibreNMS recommendations for consistency
sql_mode="" # Clear sql_mode to avoid strict mode issues
- innodb_file_per_table=1: Important for InnoDB performance and manageability.
- lower_case_table_names=0: LibreNMS generally expects case-sensitive table names.
- sql_mode="": Some older LibreNMS versions or specific OS defaults for SQL mode could cause issues; clearing it is often safer.
- Restart MariaDB:
- Create LibreNMS User and Download Code:
- Create the librenms system user:
- Clone LibreNMS into /opt/librenms:
- Set Permissions:
- This is a critical step. Follow the official documentation carefully.
sudo chown -R librenms:librenms /opt/librenms
sudo chmod 771 /opt/librenms
# Allow librenms user and group to write
sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
- These commands ensure the librenms user and its group can manage files, and also set default ACLs so new files/directories inherit correct permissions. The web server user will need to be added to the librenms group.
- Configure PHP and Install PHP Dependencies:
- Configure PHP Timezone. Find your PHP-FPM config (e.g., /etc/php/8.1/fpm/php.ini) and PHP-CLI config (e.g., /etc/php/8.1/cli/php.ini). In each file, find the date.timezone setting, uncomment it, and set it to your timezone (e.g., date.timezone = Europe/Berlin).
- Restart PHP-FPM:
- Install PHP dependencies using Composer (run as the librenms user or ensure correct permissions afterwards). This step can take a few minutes as it downloads many libraries.
cd /opt/librenms
# Run composer_wrapper.php as the librenms user:
sudo su - librenms -c './scripts/composer_wrapper.php install --no-dev'
# If the above fails due to sudo/path issues, you might need to run it as root and then fix ownership:
# sudo ./scripts/composer_wrapper.php install --no-dev
# sudo chown -R librenms:librenms /opt/librenms/vendor /opt/librenms/composer.json /opt/librenms/composer.lock
- Configure Web Server (Nginx):
- Add the web server user (www-data for Nginx/Apache on Debian/Ubuntu) to the librenms group:
- Configure the Nginx virtual host for LibreNMS:
sudo cp /opt/librenms/misc/librenms.nonroot.nginx /etc/nginx/sites-available/librenms
# Edit the file to set your server_name and check PHP socket:
sudo vim /etc/nginx/sites-available/librenms
- Change server_name librenms.example.com; to server_name your_server_ip_or_hostname; (e.g., server_name 192.168.1.100;).
- Ensure the fastcgi_pass directive points to the correct PHP-FPM socket, usually unix:/run/php/php8.1-fpm.sock.
- Enable the new site and remove the default:
- Test Nginx configuration and restart Nginx:
- Configure snmpd (for monitoring the LibreNMS server itself):
- Copy the example snmpd.conf:
- Edit the file and change RANDOMSTRINGGOESHERE to a secure, unique community string (e.g., librenmscommunity).
- Download the distro script which helps LibreNMS identify the OS:
- Restart and enable snmpd:
- Set up Cron Job:
- Copy the LibreNMS cron job file:
- Ensure the cron daemon is running:
- Perform Web-Based Installation:
- Open your web browser and navigate to http://your_server_ip_or_hostname. You should be redirected to install.php.
- Follow the on-screen instructions:
- Stage 0 (Checks): All checks should be green (or yellow for optional items). If there are red items, you must fix them before proceeding.
- Stage 1 (DB Connection):
- DB User: librenms
- DB Password: StrongPasswordHere (the one you set)
- DB Name: librenms
- Click "Next Stage". If successful, it will say "DB Schema: OK".
- Stage 2 (Create Admin User):
- Create your admin username, password, and email.
- Click "Finish install".
- You may see a message "The poller has not run recently." This is normal at first. The cron job will pick it up.
- You might be prompted to run ./scripts/database-schema.sh if the schema isn't fully up-to-date. If so, run it as the librenms user:
- The final step might involve running daily.sh once manually to ensure everything is set up:
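Several of the steps above end with a colon, but the commands themselves are not shown. The sketch below collects a typical command sequence for steps 2 to 7 on Ubuntu 22.04, wrapped in functions so that each step can be reviewed and run on its own. The function names are invented for this sketch. The systemd unit names (mariadb, php8.1-fpm, nginx, snmpd) and the useradd options are assumptions based on a stock Ubuntu install and the official installation guide, so compare them with the current documentation before use. The cron file copy in step 8 is left out because its source path differs between LibreNMS releases.

```shell
#!/usr/bin/env bash
# Sketch of typical commands for steps 2-7. Nothing runs until a function is called.

# Step 2: print the SQL that creates the LibreNMS database and user.
# The password must not contain single quotes.
librenms_db_sql() {          # usage: librenms_db_sql 'StrongPasswordHere'
    cat <<SQL
CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'librenms'@'localhost' IDENTIFIED BY '$1';
GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
SQL
}

step2_mariadb() {            # usage: step2_mariadb 'StrongPasswordHere'
    sudo systemctl enable --now mariadb
    sudo mysql_secure_installation                 # interactive prompts
    librenms_db_sql "$1" | sudo mysql -u root
    # After editing 50-server.cnf by hand: sudo systemctl restart mariadb
}

step3_user_and_code() {
    sudo useradd librenms -d /opt/librenms -M -r -s "$(command -v bash)"
    sudo git clone https://github.com/librenms/librenms.git /opt/librenms
}

# Step 5: set date.timezone in one php.ini file (relies on GNU sed).
set_php_timezone() {         # usage: set_php_timezone FILE ZONE
    sudo sed -i -E "s|^;?[[:space:]]*date\.timezone[[:space:]]*=.*|date.timezone = $2|" "$1"
}

step5_php() {                # usage: step5_php Europe/Berlin
    set_php_timezone /etc/php/8.1/fpm/php.ini "$1"
    set_php_timezone /etc/php/8.1/cli/php.ini "$1"
    sudo systemctl restart php8.1-fpm
}

# Step 6: run this only after the site file has been copied and edited.
step6_nginx() {
    sudo usermod -a -G librenms www-data
    sudo ln -s /etc/nginx/sites-available/librenms /etc/nginx/sites-enabled/librenms
    sudo rm -f /etc/nginx/sites-enabled/default
    sudo nginx -t && sudo systemctl restart nginx
}

# Step 7: replace the placeholder community string in an snmpd.conf file.
set_snmpd_community() {      # usage: set_snmpd_community FILE COMMUNITY
    sudo sed -i "s|RANDOMSTRINGGOESHERE|$2|" "$1"
}

step7_snmpd() {              # usage: step7_snmpd librenmscommunity
    sudo cp /opt/librenms/snmpd.conf.example /etc/snmp/snmpd.conf
    set_snmpd_community /etc/snmp/snmpd.conf "$1"
    sudo curl -o /usr/bin/distro https://raw.githubusercontent.com/librenms/librenms-agent/master/snmp/distro
    sudo chmod +x /usr/bin/distro
    sudo systemctl enable snmpd
    sudo systemctl restart snmpd
}
```

To use it, save the block to a file, source it in your shell, and call one function at a time (for example, step5_php Europe/Berlin), checking the result before moving on to the next step.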
Verification:
- You should be able to log in to the LibreNMS web interface with the admin credentials you created.
- Navigate to Gear Icon > Validate Config. It should show that your config is OK. If not, address the reported issues.
- Wait 5-10 minutes for the cron jobs to run. The message about pollers not running should disappear.
- The LibreNMS server itself (localhost) might be automatically added as a device if snmpd is correctly configured and discovery runs.
Congratulations! You have manually installed LibreNMS. This is a significant achievement and provides a solid foundation for network monitoring. The process is involved, but it ensures you understand each component's role.
3. Initial Configuration
Once LibreNMS is installed and you can access the web interface, there are several initial configuration steps to take to make it fully operational and tailored to your environment. This usually involves going through settings in the web UI.
Web-based Setup Wizard
As seen in the installation workshop, the very first interaction after the base files are in place is often the web-based setup wizard (install.php). This wizard guides you through:
- Pre-flight Checks: Verifies that all required PHP extensions are present, file permissions are correct, and other environmental prerequisites are met. It's crucial to resolve any errors reported at this stage.
- Database Configuration: Prompts for the database connection details (hostname, username, password, database name) that you created during the manual setup. It then attempts to connect to the database and set up the necessary schema.
- Admin User Creation: Allows you to create the first administrative user account for LibreNMS, which you'll use to log in and manage the system.
If you completed the manual installation workshop, you've already gone through this. If for some reason the web installer didn't run or needs to be re-run (e.g., after a fresh git clone before database setup), accessing http://your-librenms-host/install.php would typically trigger it.
Database Configuration
While the web wizard handles the initial connection, sometimes you need to review or modify the database configuration. The primary database settings are stored in the .env file in the root of your LibreNMS installation (e.g., /opt/librenms/.env).
This file is critical and contains sensitive information like your database password.
A typical .env file would have entries like:
APP_KEY=base64:someRandomString...
DB_HOST=localhost
DB_DATABASE=librenms
DB_USERNAME=librenms
DB_PASSWORD=your_strong_password
# ... other settings
- DB_HOST: Usually localhost if the database is on the same server.
- DB_DATABASE: The name of the LibreNMS database (e.g., librenms).
- DB_USERNAME: The database user for LibreNMS.
- DB_PASSWORD: The password for the database user.
If you ever need to change your database password or migrate the database to a different host, you would update this .env file and then potentially clear cached configurations:
# As librenms user or with appropriate permissions in /opt/librenms
php artisan config:clear
php artisan cache:clear
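When checking or scripting against these settings, it can help to read a single value out of the .env file instead of copying it by hand. The following is a minimal sketch; env_get is a hypothetical helper name, and it assumes simple KEY=value lines as in the example above (optionally wrapped in one pair of quotes).

```shell
#!/usr/bin/env bash
# Sketch: read one KEY=value setting from a LibreNMS .env file.

env_get() {                  # usage: env_get FILE KEY
    local file="$1" key="$2" line value
    # Take the last assignment of KEY; commented-out lines do not match.
    line="$(grep -E "^${key}=" "$file" | tail -n 1)"
    [ -n "$line" ] || return 1
    value="${line#*=}"
    # Strip one pair of surrounding double or single quotes, if present.
    case "$value" in
        \"*\") value="${value#\"}"; value="${value%\"}" ;;
        \'*\') value="${value#\'}"; value="${value%\'}" ;;
    esac
    printf '%s\n' "$value"
}

# Example (not run automatically; needs read access to the file and a mysql client):
#   env=/opt/librenms/.env
#   mysql -h "$(env_get "$env" DB_HOST)" -u "$(env_get "$env" DB_USERNAME)" \
#         -p"$(env_get "$env" DB_PASSWORD)" "$(env_get "$env" DB_DATABASE)" -e 'SELECT 1;'
```

Running the commented mysql check after a password change is a quick way to confirm that the value in .env and the database user still agree.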
Creating the First Admin User
This is typically done during the web-based setup wizard. If you need to create additional admin users or manage users, you can do so from within the LibreNMS web interface after logging in:
- Navigate to Gear Icon (Top Right) > Users.
- Here you can:
- Add User: Create new user accounts. You can assign them different access levels (e.g., Normal User, Global Read, Administrator).
- Manage Existing Users: Edit user details, change passwords, or delete users.
User levels define what a user can see and do within LibreNMS:
- Normal User: Can only see devices they are permitted to see (via device group permissions or if the device is marked as public). Can acknowledge alerts for permitted devices.
- Global Read: Can see all devices and most information but cannot make changes.
- Administrator: Full access to all settings and devices.
For initial setup, the admin user created during the wizard is sufficient.
Basic System Settings
After the initial installation, you should review and configure several global settings. Access these via Gear Icon (Top Right) > Global Settings.
Key areas to review:
- System Configuration (System > General):
- Base URL: Ensure this is correctly set to the URL you use to access LibreNMS (e.g., http://librenms.example.com). This is important for links in emails and integrations.
- fping / fping6 Location: Verify paths to fping and fping6 executables. These are usually auto-detected.
- RRDtool Version: Select the RRDtool version installed on your system.
- Enable Syslog: Enable this if you plan to send syslog messages from your devices to LibreNMS for centralized logging. This requires additional configuration on both LibreNMS and your devices.
- Polling Settings (System > Poller):
- Poller Modules: Enable or disable specific poller modules (e.g., OS updates, BGP, OSPF). Only enable modules relevant to your devices to save polling time and resources.
- Discovery Modules: Similar to poller modules, enable or disable discovery modules.
- SNMP Settings: Configure default SNMP version, port, and timeout. You can override these per device.
- Alerting Settings (Alerting > General):
- Default "from" email address: Set the email address from which alert notifications will be sent.
- Default contact email address: An email address to receive test alerts or general admin notifications.
- Authentication (System > Authentication):
- Configure authentication methods. By default, it uses local database users. You can integrate with LDAP, RADIUS, or Active Directory for centralized user management.
- Interface Settings (System > Web UI):
- Customize aspects of the web interface, such as default dashboard pages, themes, and visual elements.
- Distributed Poller Settings (if applicable):
- If you plan to use distributed pollers, this is where you configure poller groups and related settings. We will cover this in more detail in the advanced section.
Take your time to go through these settings. Many defaults are sensible, but customizing them to your environment from the start is good practice.
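The same settings can also be inspected from the command line with the lnms tool that ships with LibreNMS, which is handy for a quick check over SSH. The sketch below is an illustration only: show_librenms_settings is a made-up helper name, the install path /opt/librenms is taken from the installation workshop, and the default key names (base_url, fping, snmp.version) are assumptions that you should confirm against the configuration documentation for your version.

```shell
#!/usr/bin/env bash
# Sketch: print a few global settings via "lnms config:get".
# Assumption: LibreNMS is installed in /opt/librenms and runs as user librenms.

show_librenms_settings() {   # usage: show_librenms_settings [KEY...]
    local lnms=/opt/librenms/lnms key
    # Default keys are assumptions; pass your own keys to override them.
    [ "$#" -gt 0 ] || set -- base_url fping snmp.version
    for key in "$@"; do
        printf '%s = ' "$key"
        sudo -u librenms "$lnms" config:get "$key"
    done
}

# Usage (not run automatically):
#   show_librenms_settings
#   show_librenms_settings base_url
```

Reading values this way does not change anything, so it is a safe first step before editing settings in the web UI.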
Workshop First Steps with LibreNMS
Objective: To log in to your newly installed LibreNMS, explore basic global settings, and validate the installation.
Prerequisites:
- A successfully installed LibreNMS instance from the previous workshop.
- Web browser access to the LibreNMS UI.
- Admin credentials created during the web installation.
Tasks:
- Log In and Initial Exploration:
- Open your web browser and navigate to your LibreNMS URL (e.g., http://your_server_ip_or_hostname).
- Log in with the admin username and password you created.
- You will land on the main dashboard. It will likely be empty or show only "localhost" if snmpd was configured on the LibreNMS server and discovery has run.
- Validate Configuration:
- In the top right corner, click the Gear Icon.
- From the dropdown menu, select Validate Config.
- Review the output. Ideally, it should say:
====================================
Component | Version
--------- | -------
LibreNMS | x.x.x (e.g., 23.11.0)
DB Schema | xxx
PHP | 8.1.x
Python | 3.x.x
MySQL | 10.6.x-MariaDB
RRDTool | 1.7.x
SNMP | NET-SNMP 5.9.x
====================================
[OK] Composer Version: 2.x.x
[OK] Dependencies up-to-date.
[OK] Database connection successful
[OK] Database schema correct
[INFO] Detected Python Wrapper Version 31
- If there are any [FAIL] or [WARN] messages, they will usually provide information or links to documentation on how to resolve them. Common issues might relate to file permissions, PHP settings, or cron jobs not running correctly. Address these before proceeding further. For instance, if it complains about daily.sh not running, you might need to run it manually once:
- Review Global Settings:
- Click the Gear Icon > Global Settings.
- System > General:
- Verify Base URL for Web UI. If it's http://localhost and you access it via IP or hostname, change it to the correct URL (e.g., http://192.168.1.100). This is important for links in alerts.
- Note the Installation ID. This is unique to your install.
- System > Poller > SNMP Settings:
- Observe the default SNMP settings: version (e.g., v2c), port (161), default community strings (often public). We will use these when adding devices.
- System > Web UI > General:
- Explore options like Default front page. You might change this later once you have favorite dashboards.
- Alerting > General:
- Set Default "from" email address to something like librenms@yourdomain.com (even if you haven't configured mail sending yet, it's good practice).
- Set Default contact email address to your own email address.
- Click Save Settings at the bottom of any page where you make changes.
- Check System Health and Logs:
- Navigate to Health (Heart Icon in top menu) > Poller Performance. This shows statistics about your poller runs. Initially, it might not have much data.
- Navigate to Logs > Event Log. This log shows significant events in LibreNMS, such as devices being added, going down, or alerts being triggered.
- Navigate to Logs > LibreNMS Log. This provides more detailed logging from the LibreNMS application itself, useful for troubleshooting.
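The Validate Config check from the tasks above can also be run from the shell, which makes it easy to count problems at a glance. The sketch below reads validator output on standard input and counts the [FAIL] and [WARN] markers mentioned earlier. summarize_validate is a hypothetical helper; the paths /opt/librenms/validate.php and /opt/librenms/daily.sh in the usage comments assume the install location used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: count the result markers in Validate Config output read from stdin.

summarize_validate() {
    local out fails warns
    out="$(cat)"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    fails="$(printf '%s\n' "$out" | grep -c '\[FAIL\]' || true)"
    warns="$(printf '%s\n' "$out" | grep -c '\[WARN\]' || true)"
    echo "fail=${fails} warn=${warns}"
    [ "$fails" -eq 0 ]       # exit status: success only when nothing failed
}

# Usage (not run automatically):
#   sudo -u librenms /opt/librenms/validate.php | summarize_validate
#   sudo -u librenms /opt/librenms/daily.sh      # run the daily job once by hand
```

Because the function returns a non-zero status when any [FAIL] line is present, it can also serve as a simple gate in a post-install script.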
Deliverables/Reflection:
- Confirmation that your LibreNMS configuration is valid.
- Familiarity with the Global Settings menu and key options.
- Understanding where to find basic health information and logs.
This workshop ensures your LibreNMS installation is sound and configured with some basic, essential parameters. You are now ready to start adding devices and seeing LibreNMS in action.
4. Adding Your First Device
The core function of LibreNMS is to monitor devices. This subchapter will cover the fundamentals of SNMP, how to configure it on common operating systems, and then how to add a device to LibreNMS for monitoring.
Understanding SNMP (v1, v2c, v3)
Simple Network Management Protocol (SNMP) is the primary protocol LibreNMS uses to communicate with and gather data from network devices and servers. Understanding its basics is crucial.
- What is SNMP? SNMP is an application-layer protocol defined by the Internet Architecture Board (IAB) for exchanging management information between network devices. It is part of the TCP/IP protocol suite. SNMP enables network administrators to manage network performance, find and solve network problems, and plan for network growth.
- Key Components of SNMP:
- Managed Devices: These are network elements like routers, switches, servers, printers, or any device that runs SNMP agent software.
- SNMP Agent: Software that runs on managed devices. It collects and stores management information locally and responds to SNMP queries from an NMS.
- Network Management System (NMS): Software that runs on a management station (like your LibreNMS server). The NMS queries agents, receives responses, and presents data to users.
- Management Information Base (MIB): A hierarchical database of information that describes the manageable aspects of a device. Each piece of information (e.g., CPU load, interface traffic counter) is represented by an Object Identifier (OID). MIBs define these OIDs and their meaning, data type, and access permissions (read-only or read-write).
- Standard MIBs (e.g., MIB-II) are common across most devices.
- Vendor-specific (enterprise) MIBs provide information unique to a particular vendor's hardware or software. LibreNMS includes support for many common MIBs.
- Object Identifier (OID): A unique, numeric identifier used to specify a managed object in the MIB tree. For example, .1.3.6.1.2.1.1.1.0 is the OID for sysDescr.0 (system description).
- SNMP Operations:
- GET: The NMS retrieves the value of one or more OIDs from an agent.
- GETNEXT: The NMS retrieves the value of the OID following the one specified. This is used to "walk" a MIB tree.
- SET: The NMS changes the value of an OID on an agent (if it's read-write and permitted). LibreNMS primarily uses read operations.
- TRAP: Asynchronous notifications sent from an agent to the NMS to indicate a significant event (e.g., an interface going down, a device rebooting). LibreNMS can receive and process traps.
- SNMP Versions:
- SNMPv1: The original version. Simple but lacks strong security. Uses "community strings" (plain text passwords) for authentication. Prone to security risks.
- Community String: A shared password between the NMS and the agent.
- Read-Only (RO) Community: Allows the NMS to read data.
- Read-Write (RW) Community: Allows the NMS to read and modify data (use with extreme caution).
- SNMPv2c (Community-based SNMPv2): The most widely used version. It offers improvements over v1, such as enhanced error handling and new data types (e.g., 64-bit counters, important for high-speed interfaces). However, it still uses community strings for security, making it vulnerable if not properly managed (e.g., using weak or default community strings like "public" or "private").
- SNMPv3: The most secure version. It provides:
- Authentication: Verifies the identity of the sender (NMS or agent) using usernames and authentication protocols (MD5 or SHA).
- Encryption (Privacy): Encrypts SNMP messages to prevent eavesdropping, using protocols like DES, 3DES, or AES.
- Message Integrity: Ensures messages haven't been tampered with during transit.
SNMPv3 uses a User-based Security Model (USM) and a View-based Access Control Model (VACM). Configuration is more complex than v1/v2c but offers significantly better security. It's highly recommended for production environments.
LibreNMS supports all three versions. For initial learning and internal trusted networks, SNMPv2c is often used due to its simplicity. However, moving to SNMPv3 is a best practice for security.
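To make the MIB tree idea from the components list concrete, the sketch below classifies an OID by the subtree it falls under: the standard MIB-II branch (.1.3.6.1.2.1) or the vendor-specific enterprises branch (.1.3.6.1.4.1). It is plain string handling and sends no SNMP traffic; the function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: decide which well-known subtree a numeric OID belongs to.

oid_in_subtree() {           # usage: oid_in_subtree OID ROOT
    # Normalise both values to a single leading dot.
    local oid=".${1#.}" root=".${2#.}"
    # Inside the subtree if equal to the root, or if it starts with "ROOT.".
    [ "$oid" = "$root" ] || [ "${oid#"$root".}" != "$oid" ]
}

classify_oid() {             # usage: classify_oid OID
    if oid_in_subtree "$1" .1.3.6.1.2.1; then
        echo "standard (mib-2)"
    elif oid_in_subtree "$1" .1.3.6.1.4.1; then
        echo "vendor (enterprises)"
    else
        echo "other"
    fi
}

# Examples:
#   classify_oid .1.3.6.1.2.1.1.1.0         # sysDescr.0
#   classify_oid .1.3.6.1.4.1.2021.7890.1   # an enterprise-branch OID
# A real query for sysDescr.0 would look like:
#   snmpget -v2c -c <community> <host> .1.3.6.1.2.1.1.1.0
```

Matching on "ROOT." rather than on a bare prefix matters: .1.3.6.1.2.10 shares leading characters with .1.3.6.1.2.1 but is a different branch of the tree.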
Configuring SNMP on a Linux Host
To monitor a Linux server with LibreNMS, you need to install and configure an SNMP agent on it. The most common SNMP agent for Linux is net-snmp (package name might be snmpd or net-snmp).
Steps for Ubuntu/Debian:
- Install SNMP Agent:
- Configure snmpd:
- The main configuration file is typically /etc/snmp/snmpd.conf. It's recommended to back up the original before editing.
- Agent Address: By default, snmpd might only listen on 127.0.0.1 (localhost). You need it to listen on an IP address reachable by your LibreNMS server (e.g., all interfaces or a specific interface IP).
- Find the line starting with agentaddress or similar. If it's agentaddress 127.0.0.1,[::1], change it to listen on all IPv4 and IPv6 interfaces, or a specific IP:
- Configure Community String (for SNMPv2c):
- The rocommunity directive defines a read-only community string.
- Syntax: rocommunity <community_string> [source] [OID]
- For basic setup, you can define a community string and restrict it to your LibreNMS server's IP address for better security. It's good practice to use a non-default community string (i.e., not "public").
# Replace 'YourCommunityString' with a strong, unique string
# Replace 'librenms_server_ip' with the actual IP of your LibreNMS server
rocommunity YourCommunityString librenms_server_ip
# Example:
# rocommunity MySecretSNMPv2c 192.168.1.100
# If you want to allow any source (less secure, for testing only):
# rocommunity YourCommunityString default -V systemonly # Limits to system MIB group
# rocommunity YourCommunityString # Allows access to full MIB tree from anywhere (NOT RECOMMENDED for production)
- System Information: You can set system location and contact information, which LibreNMS can pick up.
- Extend snmpd (Optional, for more data): LibreNMS provides an snmpd.conf example that includes extend scripts for additional monitoring capabilities (like OS updates, disk I/O, etc.). You can download the distro script for OS detection:
sudo curl -o /usr/bin/distro https://raw.githubusercontent.com/librenms/librenms-agent/master/snmp/distro
sudo chmod +x /usr/bin/distro
And then add lines like this to your snmpd.conf:
# This allows LibreNMS to detect the OS distribution
extend .1.3.6.1.4.1.2021.7890.1 distro /usr/bin/distro
LibreNMS documentation often has more extend examples or specific agent setup scripts.
- Restart snmpd Service:
- Firewall Configuration:
- If you are using a firewall (like ufw on Ubuntu), you need to allow incoming SNMP traffic (UDP port 161) from your LibreNMS server.
- Test SNMP Locally and Remotely:
- Locally (on the Linux host being configured):
- Remotely (from the LibreNMS server):
Configuring SNMPv3 is more involved, requiring user creation, authentication, and privacy protocols. We will touch upon this in the advanced security section. For now, SNMPv2c is sufficient for learning.
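The Linux steps above can be sketched as a small generator for a minimal SNMPv2c configuration, with the surrounding commands listed as comments. This is an illustration under stated assumptions: render_snmpd_conf is a made-up helper, the example IP address and community string are placeholders, the agent listens on IPv4 only (udp:161), and a second rocommunity line for 127.0.0.1 is added so that the local test works even when access is restricted to the LibreNMS server.

```shell
#!/usr/bin/env bash
# Sketch: generate a minimal SNMPv2c snmpd.conf for a monitored Linux host.

render_snmpd_conf() {        # usage: render_snmpd_conf COMMUNITY SOURCE_IP LOCATION CONTACT
    cat <<EOF
# Listen on UDP 161 on all IPv4 interfaces
agentaddress udp:161
# Read-only access for the LibreNMS server
rocommunity $1 $2
# Read-only access from the host itself, for local testing
rocommunity $1 127.0.0.1
sysLocation $3
sysContact $4
# Lets LibreNMS detect the OS distribution
extend .1.3.6.1.4.1.2021.7890.1 distro /usr/bin/distro
EOF
}

# Typical surrounding commands on Ubuntu/Debian (not run automatically):
#   sudo apt install -y snmpd snmp
#   sudo cp /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.orig
#   render_snmpd_conf MySecretSNMPv2c 192.168.1.100 "Server Room" "admin@example.com" \
#       | sudo tee /etc/snmp/snmpd.conf >/dev/null
#   sudo systemctl restart snmpd
#   sudo ufw allow from 192.168.1.100 to any port 161 proto udp
#   snmpwalk -v2c -c MySecretSNMPv2c localhost .1.3.6.1.2.1.1      # on the host itself
#   snmpwalk -v2c -c MySecretSNMPv2c <host_ip> .1.3.6.1.2.1.1     # from the LibreNMS server
```

Generating the file from parameters keeps the community string and the allowed source address in one place, which reduces the chance of a mismatch between the agent and what you later enter in LibreNMS.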
Configuring SNMP on a Windows Host
Windows Servers also have an SNMP service that can be enabled.
Steps for Windows Server (e.g., 2016, 2019, 2022):
- Install SNMP Service Feature:
- Open Server Manager.
- Click Manage > Add Roles and Features.
- Proceed to the Features section.
- Select SNMP Service. You might also want SNMP WMI Provider if it's listed as an option.
- Click Add Features if prompted for management tools.
- Complete the installation.
- Configure SNMP Service:
- Open Services (services.msc).
- Find the SNMP Service, right-click, and select Properties.
- Agent Tab:
- Fill in Contact and Location if desired.
- Under Service, check all options (Physical, Applications, Datalink and subnetwork, Internet, End-to-end). This determines what MIB data is exposed.
- Traps Tab (Optional):
- If you want the Windows server to send SNMP traps to LibreNMS, you can configure trap destinations here. Enter a Community name and add your LibreNMS server's IP address to Trap destinations.
- Security Tab: This is the most important tab for LibreNMS polling.
- Accepted community names: Click Add...
- Choose the Community rights (e.g., READ ONLY).
- Enter your desired Community Name (e.g., YourWindowsCommunityString). This must match what you configure in LibreNMS.
- Click Add.
- Accept SNMP packets from these hosts:
- Select this option for better security.
- Click Add... and enter the IP address of your LibreNMS server.
- Click Add.
- (Alternatively, "Accept SNMP packets from any host" is less secure but can be used for initial testing).
- Click Apply and OK.
- Restart SNMP Service:
- In the Services console, right-click SNMP Service and select Restart.
- Ensure the service Startup Type is set to Automatic.
- Windows Firewall:
- Open Windows Defender Firewall with Advanced Security.
- Go to Inbound Rules.
- Find the rules for SNMP Service (UDP-In). There might be predefined rules.
- Ensure the rule is Enabled.
- Double-click the rule and go to the Scope tab.
- Under Remote IP address, you can restrict access to the IP address of your LibreNMS server for better security. Select "These IP addresses", click "Add...", and enter the LibreNMS server's IP.
- If no predefined rule exists, create a new Inbound Rule:
- Rule Type: Port
- Protocol and Ports: UDP, Specific local ports: 161
- Action: Allow the connection
- Profile: Select appropriate profiles (Domain, Private, Public - usually Domain and Private).
- Name: SNMP Allow (UDP 161 for LibreNMS)
- Optionally, scope it to the LibreNMS server IP.
- Test SNMP Remotely:
- From your LibreNMS server's command line:
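A minimal sketch of such a remote test, run on the LibreNMS server, is shown below. It fetches sysDescr.0 with the net-snmp snmpget tool and checks whether the answer mentions Windows, which is what the Windows SNMP service normally reports in that field. check_windows_snmp is a hypothetical helper name, and the host address and community string in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a Windows host answers SNMPv2c queries.

check_windows_snmp() {       # usage: check_windows_snmp HOST COMMUNITY
    local descr
    # -t 2 -r 1: two second timeout, one retry; -Oqv: print only the value.
    descr="$(snmpget -v2c -c "$2" -t 2 -r 1 -Oqv "$1" .1.3.6.1.2.1.1.1.0)" || {
        echo "no SNMP answer from $1" >&2
        return 1
    }
    echo "sysDescr: $descr"
    case "$descr" in
        *Windows*) echo "looks like the Windows SNMP service" ;;
        *)         echo "answered, but sysDescr does not mention Windows" ;;
    esac
}

# Usage (not run automatically):
#   check_windows_snmp 192.168.1.60 YourWindowsCommunityString
```

If the call times out, recheck the Security tab (community name and accepted hosts) and the inbound firewall rule for UDP 161 before changing anything in LibreNMS.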
Adding a Device via LibreNMS Web UI
Once SNMP is configured on the target device and reachable from the LibreNMS server, you can add it in the LibreNMS web interface.
- Log in to LibreNMS.
- Navigate to Devices > Add Device.
- Fill in the device details:
- Hostname or IP Address: Enter the IP address or resolvable hostname of the device you want to add (e.g., 192.168.1.50 or linux-server1.yourdomain.local).
- SNMP Version: Select the SNMP version configured on the device (e.g., v2c or v3).
- Port: Default is 161. Change if your device uses a non-standard SNMP port.
- Transport: Default is UDP.
- SNMP Community / Auth Details:
- If using SNMPv1 or v2c: Enter the SNMP Community string you configured on the device.
- If using SNMPv3: You'll need to provide:
- Auth Level (e.g., authPriv for authentication and encryption, authNoPriv for authentication only, noAuthNoPriv)
- Auth Username
- Auth Protocol (MD5 or SHA)
- Auth Password
- Privacy Protocol (DES, AES) - if using authPriv
- Privacy Password - if using authPriv
- Poller Group: (Advanced) For now, leave as default (group 0).
- Force add even if ICMP / SNMP check fails: Usually leave unchecked. If checked, LibreNMS will add the device even if it can't immediately ping it or get SNMP data, which can be useful if you know the device will come online later or has ICMP blocked.
- Attempt to use a pre-set SysDescr / OS: (Advanced) Usually leave as default.
- Click Add Device.
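Before submitting the form, it can be useful to try the same credentials with the net-snmp command-line tools. The sketch below maps the form fields above to the corresponding snmpget/snmpwalk options (-v, -c, -l, -u, -a, -A, -x, -X). snmp_auth_args is a made-up helper for illustration, and the user name and passwords in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: turn "Add Device" form values into net-snmp command-line options.

snmp_auth_args() {
    # usage: snmp_auth_args v1|v2c COMMUNITY
    #        snmp_auth_args v3 LEVEL USER [AUTH_PROTO AUTH_PASS [PRIV_PROTO PRIV_PASS]]
    case "$1" in
        v1|v2c)
            printf -- '-%s -c %s\n' "$1" "$2"
            ;;
        v3)
            local args="-v3 -l $2 -u $3"
            case "$2" in
                noAuthNoPriv) ;;
                authNoPriv)   args="$args -a $4 -A $5" ;;
                authPriv)     args="$args -a $4 -A $5 -x $6 -X $7" ;;
                *) echo "unknown auth level: $2" >&2; return 1 ;;
            esac
            printf '%s\n' "$args"
            ;;
        *)
            echo "unknown SNMP version: $1" >&2
            return 1
            ;;
    esac
}

# Usage (not run automatically; values without spaces, since the result is word-split):
#   snmpget $(snmp_auth_args v2c MySecretSNMPv2c) 192.168.1.50 .1.3.6.1.2.1.1.1.0
#   snmpget $(snmp_auth_args v3 authPriv monitor SHA authpass123 AES privpass123) \
#       192.168.1.50 .1.3.6.1.2.1.1.1.0
```

If the command line works but the web form does not, the difference is usually a typo in one field, so comparing the two side by side narrows the problem quickly.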
Verifying Device Addition and Data Polling
After clicking "Add Device":
- LibreNMS will attempt to contact the device using ICMP (ping) and then SNMP.
- If successful, the device will be added, and you'll be taken to its overview page. You should see a message like "Device added successfully."
- Initially, many graphs will be empty or show "No data." LibreNMS polls devices at regular intervals (typically every 5 minutes via the cron job). You need to wait for a few polling cycles (5-15 minutes) for data to start appearing.
- Check Device Overview: The device's overview page will start populating with information like System Uptime, OS, Hardware, CPU, Memory, and Storage details as data is collected.
- Check Graphs: Click on the "Graphs" tab for the device. You should see graphs for CPU, memory, network interfaces, etc., begin to show data points after a few polling cycles.
- Check Event Log: Go to Logs > Event Log. You should see entries related to the new device being discovered, and pollers collecting data. Search for the hostname or IP of the device you added.
- Look for messages like "Device snmp reachable" or "Polled in X.XX seconds".
- If you see errors like "SNMP error" or "Timeout: No Response from..." then there's a problem with SNMP communication (wrong community string, firewall blocking, SNMP agent not running or misconfigured on the device).
- Device Status: On the device overview page, the status should eventually show as "Up."
If the device doesn't appear or shows errors:
- Double-check the SNMP configuration on the target device (community string, listening address, allowed hosts/IPs).
- Verify firewall rules on the target device and any network firewalls between LibreNMS and the device.
- Use snmpwalk from the LibreNMS server's command line to test SNMP connectivity to the target device, using the same IP, version, and community string/credentials you entered in LibreNMS. This is the most direct way to troubleshoot SNMP issues.
- Check the LibreNMS logs (Logs > Event Log and Logs > LibreNMS Log) for more detailed error messages.
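The checklist above can be sketched as a two-layer triage run from the LibreNMS server: first ICMP, then an SNMPv2c query for sysDescr.0. diagnose_device is a hypothetical helper; the ping options assume the Linux iputils version of ping, and the messages are only hints about where to look next.

```shell
#!/usr/bin/env bash
# Sketch: check ICMP and SNMPv2c reachability of a device, layer by layer.

diagnose_device() {          # usage: diagnose_device HOST COMMUNITY
    local host="$1" community="$2"
    # 1. ICMP: one echo request with a two second wait (Linux iputils options).
    if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
        echo "icmp: ok"
    else
        # Keep going: a device may block ICMP and still answer SNMP.
        echo "icmp: no reply from $host (host down, wrong IP, or ICMP filtered)"
    fi
    # 2. SNMP: fetch sysDescr.0 with a short timeout and a single retry.
    if snmpget -v2c -c "$community" -t 2 -r 1 "$host" .1.3.6.1.2.1.1.1.0 >/dev/null 2>&1; then
        echo "snmp: ok"
    else
        echo "snmp: failed (check community string, agent address, allowed hosts, UDP 161 firewall rules)"
        return 1
    fi
}

# Usage (not run automatically):
#   diagnose_device 192.168.1.50 MySecretSNMPv2c
```

Separating the two checks tells you whether to look at basic network reachability or at the SNMP agent and its access rules.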
Workshop Monitoring Your First Linux Server
Objective: To configure SNMP on a Linux server (this could be the LibreNMS server itself or another Linux VM/host) and add it to LibreNMS for monitoring.
Prerequisites:
- Your LibreNMS installation is up and running.
- Access to a Linux server to be monitored. For simplicity, we can use the LibreNMS server itself, as we configured snmpd on it during the installation workshop. If you used a different community string for the LibreNMS server's own snmpd than the default "public", make a note of it. If you haven't configured snmpd on the LibreNMS server yet, follow the steps in "Configuring SNMP on a Linux Host" above for localhost.
- Alternatively, set up another Linux VM (e.g., Ubuntu Server) and configure snmpd on it as described above, ensuring it's reachable from your LibreNMS server and firewalls are adjusted.
Let's assume you will monitor the LibreNMS server itself (localhost).
Tasks:
- Verify snmpd on the LibreNMS Server (localhost):
- During the installation workshop, we configured snmpd on the LibreNMS server. The default community string in /opt/librenms/snmpd.conf.example is often RANDOMSTRINGGOESHERE. Let's assume you changed this to librenmscommunity (or use whatever you set).
- Log in to your LibreNMS server via SSH.
- Test snmpd locally: You should see output. If not, review /etc/snmp/snmpd.conf on the LibreNMS server, ensure the community string is correct, agentaddress is listening (e.g., udp:161), and restart snmpd (sudo systemctl restart snmpd).
- Add "localhost" (the LibreNMS server) to LibreNMS Monitoring:
- Open the LibreNMS web interface.
- Navigate to Devices > Add Device.
- Hostname: `localhost` (LibreNMS will resolve this to 127.0.0.1 for polling).
- SNMP Version: `v2c`.
- SNMP Community: `librenmscommunity` (or the community string you configured for `snmpd` on the LibreNMS server).
- Leave other fields at their default values unless you know you need to change them.
- Click Add Device.
- Observe Device Discovery and Polling:
- You should be redirected to the device page for `localhost`. Initially, it might say "Device added, awaiting first poll."
- Wait a few minutes (up to 5-10 minutes for the first poll and discovery to complete).
- Refresh the page. You should start seeing information populate:
- OS: (e.g., Linux)
- Hardware: (e.g., KVM, VMware Virtual Platform, or physical hardware info)
- System Uptime
- CPU, Memory, Storage graphs should begin to appear.
- Navigate to the Graphs tab for `localhost`. You should see various graphs. If they say "No data", wait a bit longer. Polling happens every 5 minutes by default.
- Check the Event Log (Logs > Event Log, filter by `localhost` if needed). You should see entries indicating successful polling.
- (Optional) Add Another Linux Server:
- If you have another Linux VM or physical server:
- Install and configure `snmpd` on it as described in "Configuring SNMP on a Linux Host."
- Use a unique community string (e.g., `anotherlinuxsecret`).
- Ensure the `agentaddress` allows connections from your LibreNMS server IP.
- Configure the firewall on that Linux server to allow UDP 161 from the LibreNMS server IP.
- Test with `snmpwalk` from the LibreNMS server's command line against this new Linux server's IP.
- If `snmpwalk` works, add it in LibreNMS (Devices > Add Device) using its IP address and the community string `anotherlinuxsecret`.
- Observe it being polled.
Verification:
- The `localhost` device (and any other Linux server you added) should show an "Up" status in LibreNMS.
- You should see data populating in the overview and graphs sections for the device(s).
- The Event Log should show successful polling events for the device(s).
Congratulations! You've successfully added your first device(s) to LibreNMS and are now collecting monitoring data. This is the foundational step for all further network monitoring activities. Explore the different tabs (Overview, Health, Graphs, Ports, etc.) for the device you added to see the wealth of information LibreNMS can collect.
Intermediate LibreNMS Management
With LibreNMS installed and your first devices added, it's time to delve deeper into its capabilities. This section covers navigating the interface effectively, managing your devices in more detail, setting up crucial alerts, and understanding how LibreNMS visualizes data. These skills will allow you to transform LibreNMS from a simple data collector into a proactive monitoring tool.
5. Exploring the LibreNMS Interface
The LibreNMS web interface is packed with information. Learning to navigate it efficiently is key to leveraging its power. The UI is generally intuitive, but knowing where to find specific details can save a lot of time.
Dashboard Overview
When you log in, you typically land on a dashboard. LibreNMS allows for multiple customizable dashboards.
- Default Dashboard: The initial dashboard usually provides a high-level overview:
- Device status counts (Up, Down, Ignored, Disabled).
- Recent events from the Event Log.
- System information about the LibreNMS server itself.
- Graphs showing overall network traffic, top interfaces, etc. (if configured).
- Creating Custom Dashboards:
- Navigate to Overview > Dashboards > Manage Dashboards.
- You can create new dashboards tailored to specific needs (e.g., a dashboard for critical servers, another for network core devices).
- When viewing a dashboard, click the "Edit Dashboard" (pencil icon) to add, remove, or rearrange widgets.
- Widgets: These are the building blocks of dashboards. LibreNMS offers a wide variety of widgets:
- Device status summaries
- Specific graphs from any device/port
- Alert tables
- Top X (e.g., top CPU users, top bandwidth consumers)
- Maps (if locations are set)
- Status indicators
- And many more.
- Switching Dashboards: Use the Overview > Dashboards menu to switch between your available dashboards.
Spend time experimenting with dashboards. A well-designed dashboard can give you an immediate snapshot of your network's health.
Navigating Devices, Ports, and Graphs
- All Devices Page (Devices > All Devices):
- This is your central list of all monitored devices.
- It's a sortable and searchable table showing hostname, IP address, OS, uptime, status, and often some key performance indicators.
- You can filter devices by various criteria (e.g., OS, device group, hardware type).
- Clicking on a device hostname takes you to its individual Device Overview page.
- Device Overview Page:
- This page provides a comprehensive summary for a single device.
- Header: Shows device name, IP, sysName, status, and icons for various actions (Edit, Delete, Rediscover, Graphs, Alerts, etc.).
- Tabs: The information is organized into tabs:
- Overview: General information, key metrics (CPU, Memory, Storage), recent events for this device.
- Health: Detailed health sensors (temperature, fans, power supplies, etc., if supported by the device).
- Graphs: A collection of all standard and custom graphs for this device. You can select specific metrics and time ranges.
- Ports: Lists all network interfaces on the device. For each port, you can see its status, speed, traffic counters, and access detailed graphs.
- VLANs, VRFs, IP Addresses: Network-specific information.
- Routing: BGP, OSPF, EIGRP information if the device is a router and these protocols are polled.
- Inventory: Hardware and software inventory details if supported (e.g., serial numbers, firmware versions).
- Wireless: Information for wireless access points or controllers.
- Load Balancer: Data for load balancing services.
- Applications: Data from application-specific pollers (e.g., Apache, MySQL).
- Logs: Event log, alert log, and syslog specific to this device.
- Alerts: Current active alerts and alert history for this device.
- Edit: Takes you to the device settings page where you can change SNMP community, polling settings, assign to groups, etc.
- Ports Page (Networking > Ports):
- Lists all monitored interfaces across all devices.
- Searchable and filterable (e.g., show all down ports, show all 10Gbps ports).
- Clicking a port name takes you to its detailed page with graphs, statistics, and settings.
- Graphs:
- Graphs are central to LibreNMS. They visualize time-series data stored in RRD files.
- Accessible from device pages, port pages, or directly via Graphs in the main menu (for aggregated or specific graphs).
- Time Range Selection: Most graphs allow you to select the time period (e.g., last hour, day, week, month, year, custom range).
- Zoom and Pan: Interactive graphs often allow zooming into specific periods.
- Data Aggregation: For longer time periods, RRDtool aggregates data (e.g., showing averages instead of raw 5-minute samples) to keep RRD file sizes manageable. This is why graphs might look "smoother" over longer periods.
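To make the aggregation idea concrete, here is a small worked calculation of how much history one round-robin archive covers. It is only a sketch: the helper name `rra_coverage_days` is made up, and the archive sizes are illustrative values rather than confirmed LibreNMS defaults.

```shell
#!/bin/sh
# rra_coverage_days STEP_SECONDS STEPS_PER_ROW ROWS
# Prints how many whole days a single archive covers:
# each stored row represents STEP_SECONDS * STEPS_PER_ROW seconds.
rra_coverage_days() {
    echo $(( $1 * $2 * $3 / 86400 ))
}

# Raw 5-minute samples, 2016 rows: 300 * 1 * 2016 seconds = 7 days
rra_coverage_days 300 1 2016

# 6 samples averaged into each row (30-minute points), 1440 rows = 30 days
rra_coverage_days 300 6 1440
```

With numbers like these, a one-week graph can be drawn from raw samples, while a one-month graph has to come from 30-minute averages, which is why short spikes flatten out on longer time ranges.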
Logs and Event Management
LibreNMS maintains several logs that are crucial for understanding system behavior and troubleshooting.
- Event Log (Logs > Event Log):
- This is a primary log for significant occurrences.
- Records events like:
- Device up/down status changes.
- Port up/down status changes.
- Alerts being triggered or cleared.
- Device discovery and polling successes/failures.
- Configuration changes.
- Filterable by device, event type, severity, and time range.
- Regularly reviewing the Event Log is good practice.
- Alert Log (Logs > Alert Log):
- Specifically lists triggered alerts and their history.
- Shows when an alert was raised, acknowledged, and cleared.
- Syslog (Logs > Syslog):
- If you configure devices to send syslog messages to LibreNMS, they will appear here.
- LibreNMS can parse these logs and even trigger alerts based on syslog content.
- Requires enabling the syslog receiver in LibreNMS global settings and configuring devices.
- LibreNMS Log (`/opt/librenms/logs/librenms.log` on the server):
- This is the application log file. It contains detailed debugging information, errors, and operational messages from the LibreNMS backend processes (poller, discovery, web UI).
- Accessible via the command line on the server or sometimes through a "LibreNMS Log" viewer in the UI (Logs > LibreNMS Log if enabled and accessible by the web user). This log is invaluable for deep troubleshooting.
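From the command line, a quick way to use the application log is to pull out only the most recent error lines. The helper below is a minimal sketch; the function name `recent_log_errors` and the simple "error" match are assumptions, and the default path is the one given above.

```shell
#!/bin/sh
# recent_log_errors [LOGFILE] [COUNT]
# Prints the last COUNT lines of LOGFILE that mention "error"
# (case-insensitive). Defaults: the LibreNMS log path and 20 lines.
recent_log_errors() {
    logfile="${1:-/opt/librenms/logs/librenms.log}"
    count="${2:-20}"
    if [ ! -r "$logfile" ]; then
        echo "Cannot read $logfile" >&2
        return 1
    fi
    grep -i 'error' "$logfile" | tail -n "$count"
}

# Example:
# recent_log_errors                        # default log, last 20 matches
# recent_log_errors /opt/librenms/logs/librenms.log 5
```

To watch the log live while you reproduce a problem, `tail -f` on the same file works as well.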
Understanding Polling and Discovery
These are two fundamental background processes in LibreNMS.
- Polling (`poller.php`):
- Responsible for actively collecting data from already known devices.
- Runs at regular intervals (default: every 5 minutes) via a cron job.
- For each device, it queries configured SNMP OIDs and other data sources.
- Updates RRD files with new data points.
- Checks device and port statuses.
- Evaluates alert rules based on the newly collected data.
- You can see poller performance statistics under Health > Poller Performance. This helps identify slow-polling devices or overloaded pollers.
- A single poller run should ideally complete within the polling interval (e.g., under 5 minutes). If it takes longer, you might need to optimize, add more pollers (distributed polling), or reduce the number of polled metrics.
- Discovery (`discovery.php`):
- Responsible for finding new devices and updating information about existing ones.
- Auto-discovery: Can scan network ranges (defined in global settings or via `config.php`) for new SNMP-enabled devices.
- Neighbor Discovery: Uses protocols like CDP, LLDP, OSPF, BGP to find connected devices based on information from already monitored devices.
- Updates: For existing devices, discovery checks for new interfaces, changes in system description, new hardware sensors, etc.
- Runs less frequently than the poller (default: every 6 hours, but this can be configured).
- You can manually trigger discovery for a specific device from its "Edit" page.
- Newly discovered devices are typically added automatically if auto-discovery is enabled and they respond to configured SNMP communities.
Understanding the distinction and timing of these processes is key to knowing when to expect new data or device updates.
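Both processes can also be run by hand for a single device, which is handy when you don't want to wait for the next scheduled cycle. The sketch below assumes the two scripts live in `/opt/librenms` and accept `-h <hostname>`; the function name `refresh_device` and the `DRY_RUN` switch are illustrative additions, and newer LibreNMS releases may prefer other commands, so check your version.

```shell
#!/bin/sh
LIBRENMS_DIR="${LIBRENMS_DIR:-/opt/librenms}"

# refresh_device HOSTNAME
# Runs discovery, then a poll, for one device as the librenms user.
# With DRY_RUN=1 the commands are printed instead of executed.
refresh_device() {
    host="$1"
    [ -n "$host" ] || { echo "usage: refresh_device HOSTNAME" >&2; return 2; }
    for script in discovery.php poller.php; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo -u librenms $LIBRENMS_DIR/$script -h $host"
        else
            sudo -u librenms "$LIBRENMS_DIR/$script" -h "$host" || return 1
        fi
    done
}

# Example:
# DRY_RUN=1; refresh_device localhost   # only show what would run
```

Discovery runs first so that any newly found interfaces or sensors already exist when the poll collects data for them.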
Workshop Navigating and Understanding LibreNMS UI
Objective: To become comfortable navigating the LibreNMS web interface, finding key information about devices, and understanding where to look for logs and system health.
Prerequisites:
- A running LibreNMS instance with at least one device monitored (e.g., `localhost` from the previous workshop).
- Admin access to LibreNMS.
Tasks:
- Explore the Dashboard:
- Log in to LibreNMS. Observe the default dashboard.
- Identify the "Device Summary" widget. What does it tell you?
- Locate the "Event Log" widget (if present). What kind of events do you see?
- Try to customize the dashboard:
- Click the Pencil Icon (Edit Dashboard) at the top right of the dashboard content area.
- Click Add Widget. Browse the available widgets.
- Add a "Device CPU Usage" graph widget. You might need to select your `localhost` device.
- Try adding a "Clock" widget.
- Drag and drop widgets to rearrange them.
- Click Stop Editing (Floppy Disk/Save icon or similar).
- Navigate to Device Details:
- Go to Devices > All Devices.
- Click on your `localhost` device (or any other device you have added).
- On the Device Overview page for `localhost`:
- Identify the OS, Hardware, Uptime.
- Look at the CPU, Memory, and Storage sections.
- Click the Graphs tab.
- Find the "CPU Usage" graph. Change the time range (e.g., "Last 6 hours", "Last 24 hours").
- Find a network interface graph (e.g., `lo` or your main ethernet interface). Observe the traffic.
- Click the Ports tab.
- Identify the network interfaces listed. Click on one (e.g., `eth0` or `ens18` or `lo`).
- Review the port details and its specific graphs.
- Explore other tabs like Health (if your device has sensors like temperature) and Inventory.
- Examine Logs:
- Go to Logs > Event Log.
- Look for events related to `localhost` being polled or discovered.
- Try to filter the log by "Device" and select `localhost`.
- Note the timestamps and messages.
- Go to Logs > Alert Log. This will likely be empty unless you've configured alerts and one has triggered.
- (If you configured syslog forwarding to LibreNMS and your device is sending syslogs) Go to Logs > Syslog and see if any messages appear.
- Check Poller and Discovery Information:
- Go to Health > Poller Performance.
- Observe the "Overall" poller statistics. Note the "Last Polled" and "Poll Duration."
- Look for your `localhost` device in the list of polled devices. How long did it take to poll?
- Go to Global Settings (Gear Icon) > System > Poller > Discovery settings.
- Note the "Discovery interval" (default is usually 21600 seconds = 6 hours).
- This tells you how often LibreNMS checks for new devices or updates to existing ones.
- Trigger a manual rediscovery for `localhost`:
- Go to the `localhost` device page.
- Click the Cog icon (Edit) on the device header (or find "Edit" in a menu).
- Under Device Settings, find a button or link for Rediscover Device or similar within the "Capture" or "Discovery" debug sections. (UI may vary slightly, sometimes it's under a "Capture" sub-menu for the device).
- Alternatively, from Devices > All Devices, you can select the device and choose "Rediscover" from an action menu.
- After triggering, check the Event Log for discovery messages related to `localhost`.
- Use the Search Function:
- At the top of the LibreNMS interface, there's a search bar.
- Try searching for `localhost`. It should lead you to the device page.
- If you had many devices, you could search by IP, hostname, or even parts of sysDescr.
Deliverables/Reflection:
- A custom dashboard with at least two new widgets.
- Ability to locate CPU and network traffic graphs for a specific device and change their time range.
- Understanding of where to find the Event Log and basic poller performance information.
- Successful manual rediscovery of a device.
This workshop aims to build your confidence in using the LibreNMS UI. The more familiar you are with it, the quicker you can diagnose issues and extract valuable insights.
6. Device Management
As your monitored environment grows, effective device management becomes crucial. This involves organizing devices, customizing their polling behavior, managing credentials securely, and efficiently adding devices in bulk.
Device Groups
Device groups allow you to categorize devices based on various criteria, such as location, function, OS type, or customer. This is useful for:
- Organization: Keeping your device list tidy and manageable.
- Permissions: Granting users access to specific groups of devices.
- Targeted Alerting: Creating alert rules that apply only to devices within a particular group.
- Reporting: Generating reports for specific device sets.
- Dashboard Filtering: Creating dashboards that show data only from certain groups.
Creating and Managing Device Groups:
- Navigate to Devices > Groups > Add Group.
- Name: Give the group a descriptive name (e.g., "Critical Servers," "Core Routers," "London Office").
- Description: Optional, provide more details about the group.
- Type (Dynamic Groups - Optional but Powerful):
- Static Groups: You manually assign devices to these groups.
- Dynamic Groups: Devices are automatically assigned to these groups based on rules you define. This is very powerful.
- Click "Add rule".
- Select a device attribute (e.g., `sysName`, `os`, `location`, `hardware`).
- Choose a condition (e.g., `contains`, `equals`, `matches regex`).
- Enter a value.
- Example: Create a dynamic group "Linux Servers" with a rule: `os equals linux`. All devices identified with the OS "linux" will automatically be added to this group.
- Example: Create a dynamic group "Cisco Routers" with rules: `hardware contains cisco` AND `sysDescr contains IOS`.
- Click Add Group.
Assigning Devices to Static Groups:
- Go to the device's Edit page.
- Find the "Groups" section.
- Select the desired group(s) from the list and save.
Viewing Devices by Group:
- Go to Devices > Groups. Click on a group name to see its members.
- The "All Devices" page can often be filtered by group.
Customizing Device Polling Settings
While global polling settings apply by default, you can override many of them on a per-device basis.
- Navigate to the device's Edit page.
- Go to the SNMP tab (or similar, depending on LibreNMS version structure):
- Override SysName: Use a custom name for the device in LibreNMS instead of the SNMP sysName.
- SNMP Version, Port, Timeout, Retries: You can change these if this specific device uses different SNMP parameters than the global defaults.
- SNMP Community/Auth Details: Update credentials if they change for this device.
- Go to the Modules or Polling tab:
- Enable/Disable specific poller modules: For example, if a server doesn't have BGP configured, you can disable the BGP poller module for that device to save polling time. Conversely, you might enable a specific application poller module only for relevant devices.
- Polling Interval: While global polling interval is set in `config.php` or via cron, some aspects or individual pollers might have configurable frequencies or enable/disable toggles here. Generally, the main 5-minute interval is system-wide.
- Discovery and Poller Debugging: Options to run discovery or poller for this device immediately and see the output.
Important considerations:
- Polling Load: Disabling unnecessary modules for devices reduces the load on the poller and the device itself.
- Consistency: While per-device customization is powerful, strive for consistency where possible to simplify management. Use device groups and dynamic rules to manage module assignments if patterns exist.
Managing Device Credentials
Securely managing SNMP community strings and SNMPv3 credentials is vital.
- SNMPv2c Community Strings:
- Uniqueness: Use unique community strings for different sets of devices or security zones. Avoid using "public" or "private" in production.
- Access Control: On the devices themselves, configure SNMP ACLs to restrict access to only the LibreNMS poller IP(s).
- LibreNMS Configuration: When adding/editing a device, you enter the community string. These are stored in the LibreNMS database.
- SNMPv3 Credentials:
- These include username, authentication protocol/password, and privacy protocol/password.
- Offer significantly better security than v2c.
- When adding/editing a device, select SNMPv3 and fill in all required fields.
- Global SNMP Credentials:
- In Global Settings > System > Poller > SNMP Settings, you can define a list of default SNMPv2c community strings and SNMPv3 credentials.
- When LibreNMS discovers a new device, it will try these default credentials in order. This can simplify adding devices if you use a few standard credential sets.
- However, for maximum security, it's often better to specify credentials explicitly when adding a device or use per-device settings.
Best Practices:
- Prefer SNMPv3 where supported.
- Use strong, unique community strings or SNMPv3 passphrases.
- Regularly review and rotate credentials.
- Limit SNMP access on devices to only the LibreNMS poller IP(s).
- Ensure your LibreNMS server itself is well-secured, as it stores these credentials.
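Before entering SNMPv3 credentials in LibreNMS, it helps to confirm them from the command line. The following is a minimal sketch using the Net-SNMP `snmpget` tool; the function name `snmpv3_check`, the choice of SHA and AES, and the example values are assumptions, so match them to how the device is actually configured.

```shell
#!/bin/sh
# snmpv3_check HOST USER AUTH_PASS PRIV_PASS
# Queries sysName.0 (1.3.6.1.2.1.1.5.0) using SNMPv3 with both
# authentication and encryption (security level authPriv).
snmpv3_check() {
    if [ "$#" -ne 4 ]; then
        echo "usage: snmpv3_check HOST USER AUTH_PASS PRIV_PASS" >&2
        return 2
    fi
    # -a/-A: authentication protocol and passphrase
    # -x/-X: privacy (encryption) protocol and passphrase
    snmpget -v3 -l authPriv -u "$2" \
        -a SHA -A "$3" \
        -x AES -X "$4" \
        "$1" 1.3.6.1.2.1.1.5.0
}

# Example (placeholder values):
# snmpv3_check 192.0.2.20 monitor 'auth-passphrase' 'priv-passphrase'
```

Keep in mind that passphrases given on the command line can show up in shell history and process listings, so use this for testing on a trusted host only.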
Bulk Device Import
Adding devices one by one via the UI is fine for a few devices, but for many devices, it's inefficient. LibreNMS offers ways to add devices in bulk:
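- Using the `addhost.php` CLI script:
- Located in `/opt/librenms/addhost.php`.
- Allows you to add a device from the command line.
- You can script this to add multiple devices from a list (e.g., a CSV file).
- Example usage: pass the hostname, community string, and SNMP version as arguments, in that order.
- To bulk add, you could write a simple shell script:

    #!/bin/bash
    # devices.csv format: hostname,community,version
    # e.g.:
    #   server1.example.com,comm1,v2c
    #   switch2.example.com,comm2,v2c

    INPUT_FILE="devices.csv"
    LIBRENMS_DIR="/opt/librenms"

    if [ ! -f "$INPUT_FILE" ]; then
        echo "Input file $INPUT_FILE not found!" >&2
        exit 1
    fi

    # The "|| [ -n "$host" ]" part also handles a last line without a trailing newline.
    while IFS=, read -r host comm ver || [ -n "$host" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$host" in ''|'#'*) continue ;; esac
        # The community string is deliberately not printed.
        echo "Adding device: $host (SNMP $ver)"
        # </dev/null stops the command from consuming the rest of the CSV on stdin.
        sudo -u librenms "$LIBRENMS_DIR/addhost.php" "$host" "$comm" "$ver" </dev/null
    done < "$INPUT_FILE"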
- Auto-Discovery with Network Scanning:
- Configure network ranges for discovery in Global Settings > System > Poller > Discovery settings > Networks to auto-discover.
- Provide a list of SNMP community strings to try in Global Settings > System > Poller > SNMP Settings.
- LibreNMS will scan these networks during its discovery cycle and attempt to add any responsive SNMP devices using the provided communities.
- This is good for initial population but can be less controlled.
- Using the API:
- LibreNMS has a powerful API that can be used to add devices programmatically. This is suitable for integration with automation tools or CMDBs.
- Requires generating an API token and using tools like `curl` or scripting languages (Python, PowerShell) to make API calls. (More on API in Advanced section).
When importing in bulk, ensure SNMP is already configured on the target devices and they are reachable from the LibreNMS server.
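As a taste of the API route, here is a minimal `curl` sketch for adding one device. It assumes the `POST /api/v0/devices` endpoint with an `X-Auth-Token` header and the JSON field names `hostname`, `version`, and `community`; treat those names as assumptions and verify them against the API documentation for your release. The function names, base URL, and token are placeholders.

```shell
#!/bin/sh
# device_payload HOSTNAME COMMUNITY [VERSION]
# Builds the JSON body for an "add device" request (version defaults to v2c).
# Values are inserted verbatim, so they must not contain quotes or backslashes.
device_payload() {
    printf '{"hostname":"%s","version":"%s","community":"%s"}\n' \
        "$1" "${3:-v2c}" "$2"
}

# api_add_device BASE_URL TOKEN HOSTNAME COMMUNITY [VERSION]
# Sends the payload to the LibreNMS API and prints the server's response.
api_add_device() {
    curl -sS -X POST \
        -H "X-Auth-Token: $2" \
        -H "Content-Type: application/json" \
        -d "$(device_payload "$3" "$4" "${5:-}")" \
        "$1/api/v0/devices"
}

# Example (placeholder URL and token):
# api_add_device https://librenms.example.com YOUR_API_TOKEN switch2.example.com comm2
```

Splitting payload construction from the HTTP call makes it easy to inspect the JSON first, for example with `device_payload switch2.example.com comm2`.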
Workshop Organizing Devices with Groups
Objective: To create both static and dynamic device groups and assign devices to them. This workshop assumes you have at least two devices added to LibreNMS (e.g., `localhost` and one other Linux VM, or you can add `localhost` twice with different hostnames if you only have one physical machine, just for grouping practice, though LibreNMS might de-duplicate). For better effect, try to have devices with slightly different characteristics (e.g., one identified as 'Linux', and another whose OS field you edit manually for this exercise).
Prerequisites:
- LibreNMS running with at least two devices. If you only have `localhost`, consider adding it again using `127.0.0.1` as the hostname so you have two entries to play with for grouping.
- Device 1: `localhost` (OS: Linux)
- Device 2: (Optional) Another Linux VM, or if not available, add `127.0.0.1` as a new device. You can edit this device's OS manually via Device Edit > Misc > Override OS to something like "Generic Device" or "TestOS" for the sake of creating different dynamic group conditions.
Tasks:
- Create a Static Device Group:
- Navigate to Devices > Groups.
- Click Add Group.
- Name: `Lab Servers`
- Description: `Servers used in the test lab environment.`
- Leave "Rules for dynamic group" empty.
- Click Add Group.
- Manually Assign a Device to the Static Group:
- Go to Devices > All Devices.
- Click on your `localhost` device.
- In the device header, click the Cog icon (Edit).
- Scroll down to the Groups section (or a similar section for group assignment).
- You should see "Lab Servers" in the list of available groups. Select it (it might be a multi-select box or checkboxes).
- Click Save Changes (or "Update Device").
- Verify: Go back to Devices > Groups, click on "Lab Servers". `localhost` should be listed as a member.
- Create a Dynamic Device Group for "Linux Devices":
- Navigate to Devices > Groups.
- Click Add Group.
- Name: `All Linux Machines`
- Description: `Automatically groups all devices running Linux OS.`
- In the "Rules for dynamic group" section:
- Click Add rule.
- First dropdown: `Device Attribute` -> Select `os`.
- Second dropdown: `Condition` -> Select `equals`.
- Text field: `Value` -> Type `linux` (lowercase, as LibreNMS usually standardizes OS names).
- Click Add Group.
- Verification:
- Go back to Devices > Groups. Click on "All Linux Machines".
- Any device LibreNMS has identified with the OS "linux" (like your `localhost`) should automatically appear in this group. This might take a few moments or until the next discovery/poller run that updates group memberships. You can try running `php ./build-base-dynamic-groups.php` as the librenms user from `/opt/librenms` to force an update.
- Create another Dynamic Device Group (Example: Based on Hostname):
- Navigate to Devices > Groups.
- Click Add Group.
- Name: `Localhost Devices`
- Description: `Devices with 'local' in their hostname.`
- In the "Rules for dynamic group" section:
- Click Add rule.
- `Device Attribute` -> Select `sysName` (or `hostname` if available, sysName is generally more reliable from SNMP).
- `Condition` -> Select `contains`.
- `Value` -> Type `local`.
- Click Add Group.
- Verification: Your `localhost` device should appear in this group. If you added `127.0.0.1` and its sysName is also 'localhost', it should appear too.
- Explore Device Group Usage:
- Go to Devices > All Devices.
- Look for a filter option for "Group." Try filtering by "Lab Servers" and then by "All Linux Machines."
- Consider how these groups could be used in dashboard widgets (many widgets allow filtering by device group) or in alert rules.
Deliverables/Reflection:
- At least one static device group created with a device manually assigned.
- At least one dynamic device group created (e.g., "All Linux Machines") that automatically populates based on device attributes.
- Understanding of how to view devices within a group and filter by group.
Device groups are a powerful organizational tool. Using dynamic groups effectively can save a lot of manual effort as your LibreNMS deployment grows.
7. Alerting and Notifications
Monitoring data is useful, but its true power comes from proactive alerting when things go wrong. LibreNMS has a flexible and powerful alerting system that can notify you of issues before they escalate or affect users.
Understanding Alert Rules
Alert rules are sets of conditions that, when met, trigger an alert. LibreNMS evaluates these rules against the data collected by pollers.
- Key Components of an Alert Rule:
- Name: A descriptive name for the alert rule (e.g., "Server Down," "High CPU Usage Critical," "Interface Errors").
- Severity: (Optional) Can categorize alerts (e.g., Critical, Warning, Info).
- Device/Entity Association: Rules can be associated with:
- All devices.
- Specific device groups.
- Specific devices.
- Specific entities on devices (e.g., a particular port, sensor, application).
- Conditions: The core logic. A rule triggers if "all" conditions are met OR if "any" condition is met (configurable).
- Each condition checks a metric (e.g., `devices.status`, `processors.processor_usage`, `mempools.mempool_perc`, `ports.ifInErrors_rate`).
- Against a comparison operator (e.g., `equals`, `greater than`, `less than`, `matches regex`, `not equal to`).
- And a value.
- Delay: How long the condition must be true before an alert is triggered (e.g., trigger if a device is down for 10 minutes, not just a brief blip). This helps reduce alert noise from transient issues.
- Interval: How often to re-notify if the alert condition persists (e.g., remind every 1 hour).
- Alert Transports: Where to send the notification (e.g., email, Slack, Telegram).
- Types of Alert Checks: LibreNMS comes with a rich set of built-in alert check types covering various aspects:
- Device Status: `device_down` (checks `devices.status`), `device_rebooted`.
- Sensor Data: `sensors.sensor_value` (for temperature, humidity, voltage, etc.).
- Resource Usage: CPU (`processors.processor_usage`), memory (`mempools.mempool_perc`), storage (`storage.storage_perc`).
- Port/Interface Metrics: `ports.ifOperStatus` (port down), `ports.ifInErrors_rate` (input errors), `ports.ifOutErrors_rate` (output errors), `ports.ifHCInOctets_rate` (traffic rate).
- BGP Sessions: `bgpPeers.bgpPeerAdminStatus`.
- Applications: Specific metrics for monitored applications.
- Syslog: `syslog.pri` or `syslog.msg` (trigger alerts based on syslog messages).
- And many more. You can see available metrics when building a rule.
Creating Basic Alert Rules
Let's walk through creating a common alert rule, for example, a "Device Down" alert.
- Navigate to Alerts > Alert Rules.
- Click Create alert rule.
- You'll be presented with options:
- "From an existing template": LibreNMS provides several useful pre-built templates. This is often the easiest way to start.
- "From scratch based on a device metric": If you want to build a custom rule not covered by templates.
Example: Creating a "Device Down" Alert from Scratch (or by adapting a template):
- Rule Name: `Device Unreachable (Ping/SNMP)`
- Severity (Optional): `Critical`
- Query builder / Advanced (Manual query): The query builder is more user-friendly.
- Device Association:
- "Associate with:" -> `All Devices` (or select a specific Device Group like "Critical Servers").
- Conditions: `Match all rules (AND)` or `Match any rule (OR)`. For device down, `AND` is typical if you check multiple things.
- Rule 1:
- Metric: `devices.status` (This indicates overall device reachability; 0 usually means down)
- Condition: `equals`
- Value: `0` (or sometimes `down` depending on the specific check type's output)
- (Alternatively, many use `macros.device_down` which is a pre-defined condition that is `true` if the device is considered down by LibreNMS.)
- Metric: `macros.device_down`
- Condition: `equals`
- Value: `1` (or `true`)
- Alert if condition is true for (Delay): `10 minutes` (This means the device must be down for 10 consecutive minutes before the alert fires. Adjust as needed. 0 means alert immediately).
- Send recovery notification: Check this if you want a notification when the device comes back up.
- Max alerts / Reminders:
- Max alerts to send: (e.g., 5, then stop sending for this occurrence until it recovers).
- Remind after X alerts: (e.g., remind every 1 alert, meaning send every interval).
- Interval between reminders: `1 hour` (If the device is still down, send another notification every hour).
- Transports: Select the notification transport (e.g., "default_email" or a specific transport you've configured).
- Enable rule: Ensure this is checked.
- Click Save Rule.
LibreNMS comes with many default alert rule templates that you can enable and customize. Look under Alerts > Alert Rules > Create alert rule > From an existing template. Examples:
- Device down
- Port down
- High CPU/Memory/Storage
- High Temperature
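To see how the delay, interval, and max-alerts settings combine, it can help to sketch the resulting notification timeline. The helper below is a deliberately simplified model of the settings as described in this section, not LibreNMS's actual scheduler: it ignores poller timing and assumes the condition never clears. The function name `notification_times` is made up for this example.

```shell
#!/bin/sh
# notification_times DELAY_MIN INTERVAL_MIN MAX_ALERTS
# Prints the minute offsets, counted from when the condition first became
# true, at which notifications would be sent under this simplified model.
notification_times() {
    delay="$1"
    interval="$2"
    max="$3"
    n=0
    while [ "$n" -lt "$max" ]; do
        echo $(( delay + n * interval ))
        n=$(( n + 1 ))
    done
}

# Delay 10 minutes, reminders every 60 minutes, at most 5 notifications:
notification_times 10 60 5
```

With the example values from the rule above, the first notification goes out after 10 minutes and reminders follow at 70, 130, 190, and 250 minutes, after which the rule stays silent until the device recovers.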
Configuring Alert Transports
Alert transports define how you receive notifications. LibreNMS supports numerous transports.
- Navigate to Alerts > Transports.
- Click Create alert transport.
- Transport Name: A descriptive name (e.g., "Admins Email Group," "Network Team Slack").
- Transport Type: Select the desired method from the dropdown:
- Mail: Standard email.
- Default contact: If checked, uses the default email from global settings.
- Email address(es): Enter one or more email addresses, comma-separated.
- API: Send a POST/GET request to a custom API endpoint.
- Boxcar, Discord, Flock, HipChat, IRC, Matrix, Microsoft Teams, PagerDuty, Pushover, Rocket.Chat, Slack, Syslog, Telegram, Twilio, Zammad, Opsgenie, VictorOps, Webhook, etc.
- Configuration Fields: Each transport type will have specific fields:
- Mail: Usually just the email address(es). Ensure your server is configured to send mail (e.g., Postfix installed and configured, or using an SMTP relay in `config.php`).
- Slack: Webhook URL, Channel, Icon Emoji, Bot Name.
- Telegram: Bot API Token, Chat ID.
- Follow the instructions provided for each transport type, which often involve getting API keys or webhook URLs from the respective services.
- Default Transport: You can mark one transport as the "default." If an alert rule doesn't specify a transport, it will use the default.
- Click Save Transport.
Testing Transports:
- After creating a transport, there's usually a Test Transport button. Use it!
- For email, ensure your LibreNMS server can send emails. This might involve installing and configuring a local MTA like Postfix or Exim, or configuring LibreNMS to use an external SMTP server (often done in `config.php`).
- Example `config.php` for SMTP relay:

```php
// Mail configuration
$config['email_backend'] = 'smtp'; // mail, sendmail, smtp
$config['email_from'] = 'librenms@yourdomain.com';
$config['email_smtp_host'] = 'smtp.example.com';
$config['email_smtp_port'] = 587; // 25, 465, 587
$config['email_smtp_timeout'] = 10;
$config['email_smtp_secure'] = 'tls'; // '', ssl, tls
$config['email_smtp_auth'] = true;
$config['email_smtp_username'] = 'smtp_user@example.com';
$config['email_smtp_password'] = 'smtp_password';
```
Alert Templates and Customization
Alert notifications can be customized using templates. These templates define the format and content of the message sent by transports.
- Navigate to Alerts > Templates.
- Click Create template.
- Name: A name for your template (e.g., "Detailed Device Down Email").
- Template Type: `Recovery` (for recovery messages) or `Issue` (for alert messages).
- Content: This is where you design the message body using HTML (for email) or plain text, along with special variables that LibreNMS will replace with actual data.
- LibreNMS uses the Blade templating engine (or a similar syntax).
- Common Variables:
  - `{{ $alert->title }}`: The title of the alert.
  - `{{ $alert->hostname }}`: The hostname of the device triggering the alert.
  - `{{ $alert_status }}`: "ALERT" or "RECOVERY".
  - `{{ $alert->name }}`: Name of the alert rule.
  - `{{ $alert->severity }}`: Severity of the alert.
  - `{{ $alert->timestamp }}`: When the alert was triggered.
  - `{{ $alert->state }}`: Current state values that triggered the alert.
  - `{{ $alert->message }}`: A pre-formatted message from the alert check.
  - `{{ $alert->location }}`: Device location.
  - `{{ $alert->faults }}`: An array/collection of fault data (e.g., for port down, this would list the port). You can loop over it in the template to print one line per fault.
- You can see a list of available variables when editing/creating a template, often in a sidebar or helper section.
- Assign to Alert Rules:
- After creating a template, you can assign it to specific alert rules. When editing an alert rule, there's usually a section to select the "Alert Template" and "Recovery Template."
- If no template is specified for a rule, a default system template is used.
Customizing templates allows you to provide more context, include links back to the device in LibreNMS, or format messages according to your organization's standards.
Workshop Setting Up Email Alerts for Down Devices
Objective: To configure an email transport and an alert rule to notify you via email when a monitored device goes down.
Prerequisites:
- LibreNMS installed and monitoring at least one device (e.g., `localhost`).
- Your LibreNMS server must be able to send emails. This is the trickiest prerequisite. You have a few options:
  - Install and configure a local MTA (Mail Transfer Agent) like Postfix or Exim. This is a more involved server administration task. Postfix can be configured to send directly or relay through an external SMTP server.
  - Use an external SMTP relay service (e.g., Gmail, SendGrid, Amazon SES, your ISP's SMTP server). This requires configuring SMTP settings in LibreNMS's `config.php` file.

For this workshop, we'll assume you can get one of these working. If setting up mail is too complex right now, you can still go through the steps of creating the transport and rule, but the test/actual notification won't work.

A simplified Postfix setup for local relay only (mail might not be deliverable to external addresses without further configuration such as SPF, DKIM, DMARC, and reverse DNS):

```bash
# On your LibreNMS server (Ubuntu)
sudo apt update
sudo apt install -y postfix mailutils
# During Postfix installation, select "Internet Site".
# System mail name: use your server's FQDN or `localhost` if it's just for local.
# Edit /etc/postfix/main.cf if needed, e.g., mynetworks to allow relay from localhost.
# sudo systemctl restart postfix
```

For an external SMTP server (like Gmail, which requires an app password and therefore 2-Step Verification on the account; the old "less secure app access" option is no longer available), edit `/opt/librenms/config.php` and add:

```php
<?php
// Existing config...

// Mail configuration
$config['email_backend'] = 'smtp';
$config['email_from'] = 'librenms-alerts@yourdomain.com'; // Use a real or desired from address
$config['email_smtp_host'] = 'smtp.gmail.com';
$config['email_smtp_port'] = 587;
$config['email_smtp_timeout'] = 10;
$config['email_smtp_secure'] = 'tls'; // For Gmail, STARTTLS
$config['email_smtp_auth'] = true;
$config['email_smtp_username'] = 'your_gmail_address@gmail.com';
$config['email_smtp_password'] = 'your_gmail_app_password';

// ... any other custom config
```

Replace placeholders with your actual credentials/settings. Restart PHP-FPM if you change `config.php` (adjust the PHP version to match your installation):

```bash
sudo systemctl restart php8.1-fpm
```
Tasks:
-
Configure an Email Alert Transport:
- In LibreNMS, navigate to Alerts > Transports.
- Click Create alert transport.
- Transport Name: `Admin Email Alerts`
- Transport Type: `Mail`
- Default transport: (Optional) Check if this should be your default.
- Email address(es): Enter your personal email address where you want to receive test/actual alerts.
- Click Save Transport.
- After saving, you should see your new transport in the list. Click the Test Transport button (often a paper airplane icon or similar).
- This will attempt to send a test email to the address you configured. Check your inbox (and spam folder).
- If you don't receive it, troubleshoot your mail server configuration on the LibreNMS host or your SMTP settings in `config.php`. Check `/opt/librenms/logs/librenms.log` for errors related to mail sending.
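The log check in the last step can be narrowed to mail-related lines. The helper below is a minimal sketch: the function name `recent_mail_log_lines` and the search pattern are assumptions for illustration, and the default path is the log file named above.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent log lines that mention mail or SMTP.
# $1 = log file (defaults to the LibreNMS log named in this workshop)
# $2 = maximum number of lines to print (defaults to 20)
recent_mail_log_lines() {
    log_file="${1:-/opt/librenms/logs/librenms.log}"
    max_lines="${2:-20}"
    # Case-insensitive match; tail keeps only the newest matches.
    grep -iE 'mail|smtp' "$log_file" | tail -n "$max_lines"
}
```

Calling `recent_mail_log_lines` right after pressing Test Transport shows only the newest matching entries instead of the whole log.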
-
Enable/Create a "Device Down" Alert Rule:
- Navigate to Alerts > Alert Rules.
- Look for a pre-existing rule template named "Device Down" or "Devices up/down". If it exists:
- Click the Toggle switch to enable it.
- Click the Edit icon (pencil) to review and customize it.
- If no suitable template exists, click Create alert rule.
- Select "From an existing template" and look for "Devices up/down" or similar. If found, select it. This will pre-fill many fields.
- If not, create from scratch:
- Rule Name: `Critical Device Down`
- Severity: `Critical`
- Device Association: `All Devices` (or a specific group if you prefer).
- Conditions:
  - Use Metric: `macros.device_down`, Condition: `equals`, Value: `1`.
- Alert if condition is true for (Delay): `1 minute` (for faster testing; in production, use 5-15 minutes).
- Send recovery notification: Check this box.
- Max alerts / Reminders: Max alerts to send: 3. Interval between reminders: `10 minutes`.
- Transports: Select `Admin Email Alerts` (the transport you just created).
- Enable rule: Ensure this is checked.
- Click Save Rule.
-
Test the Alert Rule:
- This requires making a monitored device "go down."
- Choose a test device: Your `localhost` device is a good candidate if it's the only one.
- Simulate "Down" Status:
  - The easiest way to make `localhost` appear down to SNMP is to stop the `snmpd` service on the LibreNMS server itself (for example, `sudo systemctl stop snmpd`).
- Wait for Polling and Alert Trigger:
  - LibreNMS polls every 5 minutes by default. The alert rule has a 1-minute delay. So, after stopping `snmpd`, you might need to wait up to 5-6 minutes for the alert to trigger.
  - Monitor Logs > Event Log. You should see `localhost` being marked as down (SNMP unreachable).
  - Monitor Alerts > Alerts. A new `Critical Device Down` alert for `localhost` should appear.
  - Check your email. You should receive an alert notification.
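The waiting time quoted above follows from simple addition. The sketch below assumes a simplified model (the failure happens right after a poll, and the rule delay starts counting once the next poll sees the device down); the function name is hypothetical, and the real scheduling in LibreNMS may add a little on top.

```shell
#!/bin/sh
# Worst-case seconds between a device failing and the alert firing,
# under the simplified model described in the lead-in.
worst_case_alert_seconds() {
    poll_interval="$1"   # seconds between polls (300 with the default 5-minute polling)
    rule_delay="$2"      # the rule's delay setting, in seconds
    echo $(( poll_interval + rule_delay ))
}

# Workshop rule: 5-minute polling plus a 1-minute delay.
worst_case_alert_seconds 300 60
```

With a production delay of 10 minutes the same calculation gives 900 seconds, which is worth keeping in mind when deciding how quickly you need to hear about an outage.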
-
Test Recovery Notification:
- Once you've received the "down" alert, bring the device "back up" by starting `snmpd` again (for example, `sudo systemctl start snmpd`).
- Wait for Polling and Recovery:
- Again, wait for the next polling cycle (up to 5 minutes).
- Monitor Logs > Event Log. `localhost` should be marked as up again.
- Monitor Alerts > Alerts. The active alert for `localhost` should clear (or move to history).
- Check your email. You should receive a recovery notification if you enabled it.
-
Review and Cleanup:
- If testing was successful, remember to:
  - Adjust the "Device Down" alert rule delay to a more reasonable production value (e.g., `10 minutes` or `15 minutes`).
  - Ensure `snmpd` is running and enabled on any devices you want to monitor.
  - Review the alert emails. Are they clear? Do they contain the information you need? If not, consider customizing the alert template later.
Deliverables/Reflection:
- A configured email alert transport that successfully sends test emails.
- An active "Device Down" alert rule associated with the email transport.
- Confirmation (via email and LibreNMS UI) that alerts are triggered when a device goes down and recovery notifications are sent when it comes back up.
- Understanding of the delay involved in alert triggering due to polling intervals and rule delays.
This workshop provides a critical capability: getting notified when issues arise. You can adapt these steps to create alerts for many other conditions (high CPU, low disk space, port errors, etc.).
8. Graphing and Data Visualization
LibreNMS excels at collecting vast amounts of time-series data. Its graphing capabilities, primarily powered by RRDtool, allow you to visualize this data, identify trends, troubleshoot issues, and plan for capacity.
Standard Graphs (CPU, Memory, Traffic, etc.)
Out-of-the-box, LibreNMS automatically generates a wide range of standard graphs for most devices it supports, provided the device exposes the necessary data via SNMP.
-
Accessing Graphs:
- Device Level: Navigate to a device's page, then click the Graphs tab. This shows all available graphs for that specific device.
- Port Level: Navigate to a device's Ports tab, click on an interface. The port details page will have its own set of graphs (traffic, errors, discards, etc.).
- Global Graphs: The main Graphs menu item provides access to aggregated graphs or specific graph types across all devices (e.g., "Top CPU Usage," "Total Bandwidth - All Ports").
-
Common Standard Graphs:
- System Uptime: Shows how long the device has been running.
- CPU Usage: Percentage of CPU utilization, often per core or an aggregate.
- Memory Usage: RAM and swap utilization (e.g., total, used, free, cached, buffered). Represented as percentages or absolute values.
- Storage Usage: Disk space utilization for each filesystem/partition (e.g., total, used, free, percentage used).
- Network Interface Traffic (`ifOctets` or `ifHCOctets`): Bits per second (bps) or Bytes per second (Bps) for input and output traffic on each monitored port. "HC" (High Capacity) counters are 64-bit. A 32-bit octet counter wraps after 2^32 bytes (about 4.3 GB); with the default 5-minute polling interval that happens within a single interval once sustained traffic exceeds roughly 115 Mbps, so HC counters are essential on fast interfaces to keep counter wrapping from corrupting the graphs.
- Network Interface Errors/Discards: Counts or rates of input/output errors and discards on interfaces. High error rates can indicate physical layer problems, duplex mismatches, or congestion.
- Ping Response Time: Round-trip time for ICMP pings to the device.
- SNMP Response Time: How long it takes to get an SNMP response from the device.
- Health Sensors: Temperature, fan speed, voltage, humidity, etc., if the device has these sensors and LibreNMS supports polling them.
- Specific Application Metrics: If application polling modules are enabled (e.g., Apache requests/sec, MySQL queries/sec).
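The counter-wrapping concern in the traffic entry above comes down to arithmetic on the size of a 32-bit octet counter. A minimal sketch, assuming a shell with 64-bit integer arithmetic (true for bash and dash on 64-bit systems); the function names are made up for this example.

```shell
#!/bin/sh
# A 32-bit octet counter can count 2^32 bytes = 34359738368 bits before it wraps.
COUNTER32_BITS=34359738368

# Seconds until a 32-bit octet counter wraps at a sustained rate.
# $1 = traffic rate in bits per second
counter32_wrap_seconds() {
    echo $(( COUNTER32_BITS / $1 ))
}

# Sustained rate (bits per second) at which the counter covers its whole range
# within one polling interval. $1 = polling interval in seconds
counter32_max_bps() {
    echo $(( COUNTER32_BITS / $1 ))
}

counter32_wrap_seconds 1000000000   # fully loaded 1 Gbps link
counter32_max_bps 300               # 5-minute polling
```

A fully loaded 1 Gbps link wraps a 32-bit counter roughly every 34 seconds, far more often than a 5-minute poll can detect, which is why the 64-bit `ifHC` counters matter.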
-
Graph Features:
- Time Range Selection: Crucial for analysis. You can view data from the last hour up to several years (depending on RRD configuration and data retention). Common presets: 6h, 24h, 2d, 1w, 1m, 3m, 6m, 1y, 2y. Custom ranges are also possible.
- Interactive Elements: Some graphs allow hovering over data points to see exact values and timestamps. Zooming capabilities are often present.
- Data Aggregation: RRDtool uses Round Robin Archives (RRAs) to store data at different resolutions. For recent data (e.g., last 24 hours), you might see 5-minute averages. For older data (e.g., last year), you might see hourly or daily averages. This is how RRDtool keeps file sizes fixed and predictable.
- Graph Types: Most are line graphs. Stacked graphs are used for components of a total (e.g., used/free memory). Area graphs are also common.
Customizing Graphs
While standard graphs cover many needs, you can customize their appearance and sometimes what data they display.
- Graph Settings (Per User or Global):
- Some UI settings might allow users to choose default graph styles or color palettes, though deep customization of standard graph rendering often requires code changes or custom graph definitions.
- Device-Specific Graph Overrides:
- For certain graphs on a device, you might find options in the device's edit page or via `config.php` to tweak parameters (e.g., scaling, legends).
- Creating Custom Graphs (Advanced):
- If LibreNMS doesn't graph a specific OID you need, and it's not part of a standard MIB that LibreNMS auto-detects, you might need to:
- Ensure the OID is being polled: This might involve adding custom OIDs to device OS definitions or creating custom poller modules (covered in Advanced).
- Define a custom graph: This usually involves creating a YAML or PHP graph definition file that tells LibreNMS how to find the RRD data and how to render the graph (title, labels, colors, data sources). This is an advanced topic.
- The community forums and documentation are good resources for examples of custom graph definitions.
Understanding RRDtool and Data Storage
RRDtool (Round Robin Database tool) is the backbone of LibreNMS's graphing. Understanding its basics helps interpret graphs correctly.
- Round Robin Database (RRD):
- RRD is a system to store and display time-series data. "Round robin" refers to its fixed-size nature. Old data is consolidated or overwritten to keep the database from growing indefinitely.
- Each metric (e.g., CPU_usage, ifInOctets_for_port1) is typically stored in its own RRD file (e.g., `/opt/librenms/rrd/<device_hostname>/cpu-XX.rrd` or `port-ifInOctets-XX.rrd`).
- RRD File Structure: An RRD file contains:
- Data Sources (DS): Defines the type of data being stored (e.g., `GAUGE` for values like temperature or CPU %, `COUNTER` for continuously increasing values like traffic counters, `DERIVE` for rates of change). LibreNMS handles DS definitions based on MIB info.
- Round Robin Archives (RRA): Defines how data is stored at different resolutions and for how long. An RRD file can have multiple RRAs.
  - Example RRA:
    - Store 5-minute averages for 1 day.
    - Store 30-minute averages for 1 week.
    - Store 2-hour averages for 1 month.
    - Store 1-day averages for 1 year.
  - When you request a graph for a specific time range, RRDtool selects the most appropriate RRA to fetch data from.
- Data Consolidation: Each incoming sample is consolidated into every RRA at update time. The highest-resolution RRA stores it as is, while the coarser RRAs combine several samples (e.g., by averaging) into a single row. When an RRA is full, its oldest rows are overwritten. This is why older data appears less granular.
- Graph Generation: When LibreNMS needs to display a graph, it calls `rrdtool graph` with parameters specifying the RRD file(s), time range, labels, colors, etc. RRDtool then generates an image (usually PNG).
- `rrdcached` (RRD Caching Daemon):
  - Polling many devices generates frequent writes to RRD files. This can be I/O intensive, especially on slower disks.
  - `rrdcached` is a daemon that acts as a caching layer for RRD updates. Pollers write data to `rrdcached`, which then flushes these updates to disk in batches, improving performance and reducing disk I/O load.
  - LibreNMS highly recommends using `rrdcached`, and it's often set up during installation. Configuration is in `/etc/default/rrdcached` (or similar) and LibreNMS's `config.php` (e.g., `$config['rrdcached'] = "unix:/var/run/rrdcached.sock";`).
Implications of RRDtool:
- Fixed Size: You know how much disk space RRDs will take over time.
- No Raw Data for Old Points: You can't get the original 5-minute sample from a year ago if it's only stored as a daily average in the RRA.
- "NaN" or Gaps: If a device is down or SNMP fails, no data is written to RRDs for that period. This appears as gaps (or "NaN" - Not a Number) in graphs.
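The example RRA layout described in this section can be turned into the `RRA:AVERAGE:xff:steps:rows` strings that `rrdtool create` expects. This is a sketch under stated assumptions: a 300-second base step, "1 month" taken as 30 days, "1 year" as 365 days, and a helper name (`rra_average`) invented here. It is not the layout LibreNMS itself ships with.

```shell
#!/bin/sh
STEP=300   # base step in seconds (5-minute polling)

# Print one RRA definition.
# $1 = seconds covered by each stored row, $2 = seconds of history to keep
rra_average() {
    steps=$(( $1 / STEP ))   # base steps consolidated into one row
    rows=$(( $2 / $1 ))      # rows needed to cover the retention period
    echo "RRA:AVERAGE:0.5:${steps}:${rows}"
}

rra_average 300   86400      # 5-minute averages for 1 day
rra_average 1800  604800     # 30-minute averages for 1 week
rra_average 7200  2592000    # 2-hour averages for 30 days
rra_average 86400 31536000   # 1-day averages for 365 days
```

The four printed strings could be appended to a command such as `rrdtool create example.rrd --step 300 DS:value:GAUGE:600:0:U ...`. Because the row counts are fixed at creation time, the file size never changes afterwards.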
Using Smokeping Integration (if applicable)
Smokeping is a separate open-source tool specialized in network latency measurement and visualization. LibreNMS can integrate with an existing Smokeping installation.
- What Smokeping Does:
- Sends out probe packets (ICMP echo, DNS queries, HTTP requests, etc.) to target hosts at regular intervals.
- Measures round-trip time (RTT) and packet loss.
- Generates detailed graphs showing latency distribution, median RTT, and loss over time. This is excellent for visualizing network stability and jitter.
- LibreNMS Integration:
- If you have Smokeping installed and configured to monitor hosts that are also in LibreNMS:
- You can configure LibreNMS (in `config.php`) to point to your Smokeping installation.
- LibreNMS will then display relevant Smokeping graphs directly within the LibreNMS device overview page for matched hosts.
- This provides a convenient way to see detailed latency information alongside other device metrics.
- Benefits: Adds a layer of latency-focused monitoring that complements LibreNMS's broader SNMP-based data collection.
Setting up Smokeping itself is outside the scope of this LibreNMS guide, but if you already use it or plan to, the integration is straightforward.
Workshop Analyzing Network Traffic Patterns
Objective: To use LibreNMS graphs to analyze network traffic patterns for a specific device interface, understand different time scales, and identify peak usage.
Prerequisites:
- LibreNMS running with at least one device that has network interfaces generating some traffic (e.g., your `localhost` device, or better, a router/switch if you have one monitored).
- The device should have been monitored for at least a few hours, preferably a day or more, to have some historical data.
Tasks:
-
Identify a Target Interface:
- Log in to LibreNMS.
- Navigate to Devices > All Devices. Select a device.
- Go to the Ports tab for that device.
- Identify an active network interface that is likely to have some traffic (e.g., `eth0`, `ens18` on a server, or a WAN/LAN port on a router). Avoid loopback (`lo`) unless it's your only option and has some traffic from local services.
- Click on the name of the selected interface to go to its detail page.
-
Examine Basic Traffic Graphs:
- On the port detail page, you should see graphs for "Traffic" (bits per second or Bytes per second) and possibly "Packets" (packets per second).
- Focus on the main "Traffic" graph (often labeled with `ifHCOctets` or `ifOctets`). It usually shows Inbound and Outbound traffic.
- Observe the default time range (e.g., last 24 hours).
-
Analyze Different Time Scales:
- Short Term (e.g., Last 6 Hours):
- Select "6 hour" from the time range options for the graph.
- What do you observe? Are there any spikes? Is the traffic bursty or steady?
- Hover your mouse over different points on the graph lines. What are the approximate Inbound and Outbound traffic rates at those points?
- Medium Term (e.g., Last 24 Hours or Last 48 Hours):
- Select "24 hour" or "2 day".
- Can you identify daily patterns? For example, is traffic higher during business hours and lower at night? Are there regular peaks?
- Note the Y-axis scale. Does it change as you change the time range? Why? (Because RRDtool might be using different RRAs with different consolidation, or the peak values differ).
- Long Term (e.g., Last Week or Last Month):
- Select "1 week" or "1 month" (if you have enough data).
- Can you see weekly patterns? (e.g., higher traffic on weekdays vs. weekends).
- Are there any overall trends, like gradually increasing traffic over the month?
- Notice how the graph lines might appear "smoother" over longer periods. This is due to RRDtool's data aggregation.
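The smoothing effect noted in the last point can be reproduced with a few made-up numbers. In the sketch below, six hypothetical 5-minute samples (in Mbps) contain one short burst; averaging them into a single 30-minute point, as a coarser RRA would, hides the burst. The function names and sample values are illustrative only.

```shell
#!/bin/sh
# Average of whitespace-separated numbers read from stdin (integer result).
average() {
    awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
         END { if (n) printf "%d\n", sum / n }'
}

# Largest of whitespace-separated numbers read from stdin.
peak() {
    awk '{ for (i = 1; i <= NF; i++) { v = $i + 0; if (n++ == 0 || v > max) max = v } }
         END { if (n) print max }'
}

samples="10 10 10 100 10 10"   # six 5-minute samples in Mbps, one burst
echo "$samples" | peak         # what the fine-grained graph shows
echo "$samples" | average      # what one 30-minute averaged point shows
```

The burst of 100 Mbps is visible at 5-minute resolution but shrinks to 25 Mbps once averaged, so short peaks should always be read from the shortest time range that still contains them.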
-
Identify Peak Usage:
- Using the different time scales, try to identify the approximate peak Inbound and Outbound traffic rates for this interface.
- When did these peaks occur?
- What is the unit of the Y-axis (e.g., Mbps, Gbps, kbps)?
- If this interface has a known speed (e.g., 1 Gbps), what percentage of its capacity was used during the peak? (e.g., if peak is 500 Mbps on a 1 Gbps link, that's 50% utilization).
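The utilization calculation from the last step is a single division. A minimal sketch with a hypothetical helper name; both arguments are in bits per second and the result is truncated to a whole percent.

```shell
#!/bin/sh
# Percentage of link capacity in use.
# $1 = observed peak rate in bits per second, $2 = link speed in bits per second
utilization_pct() {
    echo $(( $1 * 100 / $2 ))
}

utilization_pct 500000000 1000000000   # 500 Mbps peak on a 1 Gbps link
```

Convert the Y-axis reading to bits per second first (for example, 500 Mbps is 500000000) so that both arguments use the same unit.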
-
Examine Error/Discard Graphs (if data exists):
- On the same port detail page, look for graphs related to "Errors" and "Discards" (e.g., `ifInErrors`, `ifOutErrors`, `ifInDiscards`, `ifOutDiscards`).
- Is there any significant number of errors or discards?
- Constant errors can indicate a physical layer issue (bad cable, faulty SFP), a duplex mismatch, or other problems.
-
(Optional) Compare with another interface or device:
- If you have other interfaces or devices monitored, repeat some of these steps for them. How do their traffic patterns differ?
Deliverables/Reflection (for your own notes):
- A brief description of the traffic pattern for the chosen interface over 24 hours (e.g., "Traffic peaks between 9 AM and 5 PM, with an average of X Mbps and a peak of Y Mbps. Low traffic overnight.").
- The approximate peak Inbound and Outbound bandwidth usage observed and when it occurred.
- An observation about how graph granularity changes with different time scales.
- A note on whether any significant interface errors or discards were observed.
This workshop helps you practice interpreting the visual data LibreNMS provides, which is a fundamental skill for network monitoring and management. Understanding these graphs allows you to assess network performance, plan capacity, and spot potential problems.
Advanced LibreNMS Customization and Optimization
Once you have mastered the basics and intermediate features of LibreNMS, you can explore its more advanced capabilities. This section covers extending LibreNMS with custom definitions and pollers, optimizing its performance for larger environments, implementing robust security practices, and effectively troubleshooting common issues.
9. Extending LibreNMS
LibreNMS is highly extensible, allowing you to tailor it to monitor virtually any device or application that exposes data. This often involves understanding MIBs, OIDs, and sometimes writing small scripts or configuration snippets.
Custom Device OS Definitions
LibreNMS uses "OS Definitions" to determine how to discover, poll, and interpret data from different types of devices. While it supports a vast number of OS types out-of-the-box, you might encounter a device that isn't fully recognized or for which you want to poll additional, specific MIBs.
-
Structure of OS Definitions:
- OS definitions are typically YAML files located in `/opt/librenms/includes/definitions/` (the location has changed between releases, so check your installation; `/opt/librenms/LibreNMS/OS/` holds the PHP classes with OS-specific code rather than the YAML files).
- Each YAML file defines:
  - `os`: The short name for the OS (e.g., `linux`, `cisco-ios`, `mycustomos`).
  - `text`: A human-readable description (e.g., "Linux", "Cisco IOS", "My Custom Device").
  - `type`: General category (e.g., `server`, `network`, `firewall`, `wireless`).
  - `icon`: An icon to display in the UI.
  - `discovery`:
    - `sysObjectID`: A list of SNMP `sysObjectID` values that identify this OS. LibreNMS uses this during discovery to match a device to an OS definition.
    - `sysDescr`: A list of regex patterns to match against the device's `sysDescr` string.
    - `modules`: A list of discovery modules to run for this OS.
  - `mib_dir`: (Optional) A directory containing custom MIB files specific to this OS. LibreNMS will attempt to load MIBs from here.
  - `poller_modules` / `global_modules`: Specifies which poller modules should be enabled or disabled by default for this OS.
  - `graphs`: A list of custom graph definitions to be made available for this OS.
  - `bad_ifType`, `bad_ifName_regexp`, `good_ifAlias_regexp`: Rules to filter out unwanted interfaces.
  - `os_group`: Assigns the OS to a group for easier management.
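The `sysObjectID` matching described above is essentially a prefix comparison. The sketch below illustrates only that idea and is not LibreNMS's own code: the function name and the `99999` enterprise prefix for `mycustomos` are placeholders, while `.1.3.6.1.4.1.8072.3.2.10` is the identifier Net-SNMP reports on Linux.

```shell
#!/bin/sh
# Map a sysObjectID to an OS short name by prefix match (illustrative table only).
match_os() {
    case "$1" in
        .1.3.6.1.4.1.99999.1.*)    echo "mycustomos" ;;   # placeholder vendor prefix
        .1.3.6.1.4.1.8072.3.2.10*) echo "linux" ;;        # Net-SNMP agent on Linux
        *)                         echo "generic" ;;      # fallback when nothing matches
    esac
}

match_os ".1.3.6.1.4.1.8072.3.2.10"
```

Listing the most specific prefix first matters in a table like this, because the first matching pattern wins.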
-
Adding Custom MIBs:
- If your device uses vendor-specific MIBs not included with LibreNMS or `net-snmp`, you need to obtain the MIB files (usually from the vendor).
- Place these MIB files in a directory (e.g., `/opt/librenms/mibs/custom/`).
- Add this directory to the MIB search path in LibreNMS's `config.php`.
- You may also need to configure `net-snmp` itself to be aware of these MIBs if you are using `snmpwalk` or other tools outside of LibreNMS that need to translate OIDs.
-
Creating or Modifying an OS Definition:
- Identify the device: Use `snmpget -v2c -c YOUR_COMMUNITY YOUR_DEVICE_IP sysObjectID.0 sysDescr.0` to get the `sysObjectID` and `sysDescr` (`snmpwalk` accepts only a single OID per call, so `snmpget` is the simpler choice for fetching both).
- Copy an existing YAML: Find an OS definition YAML file for a device similar to yours in the definitions directory (e.g., `/opt/librenms/includes/definitions/`) and copy it to create a new file (e.g., `mycustomos.yaml`).
- Edit the YAML:
  - Change `os`, `text`, `sysObjectID`, and `sysDescr` to match your device.
  - Adjust poller modules, discovery modules, and graphs as needed.
  - If you want to poll specific OIDs not covered by standard modules, you might need to look into creating custom graph definitions or even custom poller modules.
- Test: After saving, run discovery for the device (`./discovery.php -h <hostname> -d -m os`) and then polling (`./poller.php -h <hostname> -d -r -f`) to see if it's correctly identified and if new data/graphs appear.
This process can be iterative. The LibreNMS community forums are a good place to ask for help or find examples for specific devices.
Application Monitoring (e.g., Apache, Nginx, MySQL)
LibreNMS can monitor specific applications running on your servers, not just OS-level metrics. This is typically achieved using "Application Pollers."
-
How it Works:
- Application pollers run on the LibreNMS server as part of each polling cycle.
- They query the target server (often through SNMP `extend` scripts, NRPE, or custom agent scripts deployed on the monitored host; these agent-side scripts are usually Bash or Python) to gather application-specific metrics.
- The data is then returned to LibreNMS and stored in RRD files, similar to other SNMP data.
- LibreNMS comes with pre-built application pollers for many common services like Apache, Nginx, MySQL, BIND, Memcached, NTP, Postfix, etc.
-
Enabling Application Monitoring:
- Agent Setup (on the monitored host):
- For many applications, you need to install an agent script on the host running the application. These scripts are often provided by LibreNMS (in `/opt/librenms/scripts/agent-local/` or via the `librenms-agent` repository).
- Example: For MySQL, you might deploy a script that queries MySQL status variables.
- These scripts need to be executable by the `snmpd` user or callable via another mechanism like NRPE.
- Configure `snmpd` on the monitored host to expose the output of these scripts via an `extend` directive in `snmpd.conf`. This makes the script's output available via a specific OID that LibreNMS knows how to query.
- LibreNMS Configuration (on the LibreNMS server):
- In the LibreNMS web UI, go to the Device Edit page for the server running the application.
- Go to the Applications tab (or Modules tab, then find "Applications").
- Enable the specific application poller (e.g., "MySQL," "Apache").
- Some applications might require additional configuration (e.g., database credentials for MySQL, status URL for Apache/Nginx). These are usually configured directly in the agent script on the monitored host or sometimes passed via SNMP SETs if the agent supports it (less common).
-
Viewing Application Data:
- Once configured and polled, application-specific metrics and graphs will appear under the Applications tab on the device's overview page in LibreNMS.
-
Creating Custom Application Pollers:
- If LibreNMS doesn't have a poller for an application you need:
- Write a script that can collect the desired metrics from your application and output them in a simple key:value format (e.g., `metric1:value1\nmetric2:value2`).
- Deploy this script on the monitored host and configure `snmpd` (or another agent mechanism) to expose its output.
- In LibreNMS, you'll need to create a new application poller (in current releases, a PHP file under `/opt/librenms/includes/polling/applications/`). It defines what to query (from your `extend` setup) and how to parse the data, and matching graph definitions determine which graphs are created. This is a more involved development task.
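The agent-side half of these steps can be sketched as a small script. Everything specific here is hypothetical: the application, the spool path `/var/spool/myapp`, the metric names, and the function name `emit_metrics`. It only demonstrates the one-metric-per-line key:value output described above.

```shell
#!/bin/sh
# Hypothetical agent script for an imaginary application "myapp".
# It could be exposed through snmpd.conf on the monitored host with a line like:
#   extend myapp /usr/local/bin/myapp-stats.sh
# (the name "myapp" and the script path are placeholders).
emit_metrics() {
    spool_dir="${1:-/var/spool/myapp}"   # placeholder spool directory
    if [ -d "$spool_dir" ]; then spool_present=1; else spool_present=0; fi
    # Count regular files waiting in the spool; a missing directory counts as 0.
    queue_depth=$(find "$spool_dir" -type f 2>/dev/null | wc -l | tr -d ' ')
    echo "spool_present:${spool_present}"
    echo "queue_depth:${queue_depth}"
}

emit_metrics "$@"
```

Keeping the output to plain `name:value` lines makes the parsing side trivial, at the cost of having to agree on metric names between the script and the poller.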
Writing Custom Pollers and Discovery Modules (Introduction)
For highly specialized devices or data sources not covered by existing mechanisms, you might need to write custom poller or discovery modules in PHP. This is an advanced topic requiring PHP programming skills and a good understanding of the LibreNMS architecture.
-
Discovery Modules (`includes/discovery/`):
- These PHP scripts are responsible for identifying device characteristics, hardware components, sensors, etc.
- They are executed during the discovery phase.
- If you have a device with unique hardware components that LibreNMS doesn't recognize, you might write a discovery module to query specific MIBs and register these components (e.g., new types of sensors, power supplies, fans).
-
Poller Modules (`includes/polling/`):
- These PHP scripts are responsible for collecting time-series data for specific metrics.
- They are executed during the polling phase.
- If you need to graph data from custom OIDs or a non-SNMP data source, you would write a poller module.
- The module would fetch the data, format it, and then use LibreNMS's datastore helpers (e.g., `app('Datastore')->put()` in current releases) to store it in RRD files.
- You'd also need corresponding graph definitions to visualize this data.
-
General Process:
- Understand the Data: Know the OIDs or API endpoints to get the data.
- PHP Scripting: Write PHP code to fetch, parse, and process the data.
- Integration: Place your script in the appropriate LibreNMS directory and potentially update OS definitions or `config.php` to call your module.
- Graph Definitions: Create YAML graph definitions to display the data collected by your poller module.
Developing these modules requires careful study of existing LibreNMS modules and the internal API. The LibreNMS development documentation and community are vital resources.
Using the LibreNMS API
LibreNMS provides a comprehensive RESTful API that allows you to interact with it programmatically. This is useful for:
- Automation (adding/deleting devices, managing users).
- Integration with other systems (CMDBs, ticketing systems, custom dashboards).
- Extracting data for custom reporting or analysis.

Enabling API Access:
- In the LibreNMS web UI, navigate to Gear Icon > API Settings > API Access.
- Click Create API access token.
- Give the token a description, assign it to a user (permissions will be based on this user), and set an expiration if desired.
- The generated token will be displayed once – copy it immediately and store it securely. You won't be able to see it again.

API Endpoints:
- The API documentation is usually available directly from your LibreNMS instance at http://your-librenms-host/api/v0 (or a link from Gear Icon > API Settings).
- It lists all available endpoints, expected parameters, and example responses.
- Common endpoints include:
  - /devices (GET, POST, DELETE for managing devices)
  - /ports (GET for port information)
  - /health (GET for sensor data)
  - /alerts (GET for alert information)
  - /services (GET for monitored services/applications)
  - And many more.

Authentication:
- API requests must include the API token in the X-Auth-Token HTTP header.

Example: Adding a device via API using curl:

    API_TOKEN="YOUR_API_TOKEN_HERE"
    LIBRENMS_URL="http://your-librenms-host"
    HOSTNAME_TO_ADD="new-server.example.com"
    COMMUNITY="public"
    VERSION="v2c"

    # The add-device endpoint expects the keys "hostname", "community" and "version".
    curl -X POST -H "X-Auth-Token: $API_TOKEN" \
         -H "Content-Type: application/json" \
         -d '{ "hostname": "'"$HOSTNAME_TO_ADD"'", "community": "'"$COMMUNITY"'", "version": "'"$VERSION"'" }' \
         "$LIBRENMS_URL/api/v0/devices"

Rate Limiting: Be aware of API rate limits (configurable in LibreNMS global settings or config.php) to prevent abuse.
The API is a powerful tool for advanced users and developers looking to integrate LibreNMS into broader workflows.
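When scripting against the API, it can help to separate building the request body from sending it, so the payload can be inspected before anything reaches the server. The following is a minimal sketch, not an official client: the function names are made up for illustration, the JSON keys (hostname, community, version) follow the add-device endpoint as documented by LibreNMS and should be verified against your instance, and the values are assumed to contain no quotes or backslashes.

```shell
#!/bin/sh
# Sketch: build the add-device payload and (optionally) send it.
# Assumes hostnames and communities contain no quotes or backslashes.

# Print the JSON body for POST /api/v0/devices.
build_add_device_payload() {
    # $1 = hostname, $2 = SNMP community, $3 = SNMP version (v1 or v2c)
    printf '{"hostname":"%s","community":"%s","version":"%s"}\n' "$1" "$2" "$3"
}

# Send the payload; LIBRENMS_URL and API_TOKEN must be set by the caller.
add_device() {
    build_add_device_payload "$1" "$2" "$3" |
        curl --silent --show-error --fail \
             -X POST \
             -H "X-Auth-Token: $API_TOKEN" \
             -H "Content-Type: application/json" \
             --data @- \
             "$LIBRENMS_URL/api/v0/devices"
}

# Dry run: inspect the payload without contacting the server.
build_add_device_payload "new-server.example.com" "public" "v2c"
```

Calling add_device with the same three arguments sends the request. The --fail option makes curl return a non-zero exit status on HTTP errors, which is convenient in automation.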
Workshop Adding a Custom Application Monitor (Simple Example)
Objective: To create a very simple custom application monitor that checks if a specific TCP port is open on a server. This will involve a local agent script, SNMP extend, and a basic application definition in LibreNMS.
Prerequisites:
- LibreNMS installed and running.
- A Linux server to monitor (can be the LibreNMS server itself or another VM). Let's call it target-server.
- SSH access to target-server.
- net-tools or iproute2 installed on target-server (for the netstat or ss command).
- We'll check if, for example, TCP port 80 (HTTP) is listening on target-server.
Tasks:
Part 1: Create and Deploy the Agent Script on target-server
- Create the script:
  - Log in to target-server.
  - Create a script, for example, /usr/local/bin/check_port_80.sh.
  - Add the following content:

        #!/bin/bash
        # Script to check if TCP port 80 is listening.
        # Use ss or netstat; ss is generally preferred if available.
        if command -v ss &> /dev/null; then
            # -H: no header, -l: listening sockets, -t: TCP, -n: numeric ports
            ss -Hltn 'sport = :80' | grep -q LISTEN
        elif command -v netstat &> /dev/null; then
            netstat -tlpn | grep -q ':80 '   # Adjust grep if needed for your netstat output
        else
            echo "port_80_status:-1"         # Indicate error if no tool found
            exit 1
        fi

        # $? holds the exit status of the grep that ran above.
        if [ $? -eq 0 ]; then
            echo "port_80_status:1"          # Port is listening
        else
            echo "port_80_status:0"          # Port is not listening
        fi
        exit 0

  - Make the script executable: sudo chmod +x /usr/local/bin/check_port_80.sh
  - Test the script:
    - If a web server IS listening on port 80, sudo /usr/local/bin/check_port_80.sh should output port_80_status:1.
    - If NOT, it should output port_80_status:0.
    - You can temporarily start a simple listener for testing: sudo python3 -m http.server 80 (and stop with Ctrl+C after the test).
- Configure snmpd on target-server to expose this script:
  - Edit /etc/snmp/snmpd.conf.
  - Add an extend line. The directive has the form extend [OID] NAME COMMAND. An explicit base OID is optional; the simplest form gives only a name, and snmpd then publishes the output under nsExtendObjects (.1.3.6.1.4.1.8072.1.3.2), indexed by that name. Let's use this simpler named extend:

        # Custom check for TCP port 80 status
        extend port80_check /usr/local/bin/check_port_80.sh

  - Restart snmpd (for example, sudo systemctl restart snmpd on systemd-based distributions).
  - Test the SNMP query from the LibreNMS server (or locally on target-server if snmpwalk is installed there). Replace YOUR_COMMUNITY and TARGET_SERVER_IP:

        snmpwalk -v2c -c YOUR_COMMUNITY TARGET_SERVER_IP NET-SNMP-EXTEND-MIB::nsExtendOutput1Line
        # You should see output similar to:
        # NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."port80_check" = STRING: port_80_status:1

  - Note the exact OID that returns the STRING value. For extend port80_check ..., the numeric OID of the output line is .1.3.6.1.4.1.8072.1.3.2.3.1.1 followed by the index 12.112.111.114.116.56.48.95.99.104.101.99.107: the length of the name (12 for port80_check), then the ASCII code of each character. It is easier to discover this by running the snmpwalk above with the -On option, which prints numeric OIDs, and finding the port80_check entry.
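The numeric OID of a named extend follows a fixed rule: the base OID, then the length of the name, then the ASCII code of each character. The sketch below shows that rule in portable shell. The function name extend_index is invented for this example, and the base OID is the nsExtendOutput1Line prefix used in the text above.

```shell
#!/bin/sh
# Sketch: derive the numeric OID index for a NET-SNMP extend name.

# Print "<length>.<ascii>.<ascii>..." for the given name.
extend_index() {
    name=$1
    index=${#name}                      # first sub-identifier: string length
    while [ -n "$name" ]; do
        rest=${name#?}                  # name without its first character
        char=${name%"$rest"}            # the first character itself
        # printf with a leading quote yields the character's numeric code
        index="$index.$(printf '%d' "'$char")"
        name=$rest
    done
    printf '%s\n' "$index"
}

# Base OID of nsExtendOutput1Line, as used in the text above.
NS_EXTEND_OUTPUT_1LINE=".1.3.6.1.4.1.8072.1.3.2.3.1.1"

printf '%s.%s\n' "$NS_EXTEND_OUTPUT_1LINE" "$(extend_index port80_check)"
```

The printed OID can be passed to snmpget directly, which is useful when the MIB files are not installed on the querying host.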
Part 2: Configure LibreNMS to Poll and Graph This Metric
- Create an Application Definition YAML:
  - On your LibreNMS server, locate the directory that holds application definitions. The location depends on your LibreNMS version, so check your installation; this example assumes includes/definitions/applications/ under /opt/librenms. Stock LibreNMS implements most application pollers as PHP files under includes/polling/applications/, so treat the YAML below as an illustrative sketch and confirm the supported format in the developer documentation for your version.
  - Create a YAML file, e.g., custom_port_check.yaml.
  - LibreNMS application pollers generally expect direct numerical output or a specific format, while our script prints port_80_status:X, which would have to be parsed. To keep the example simple, change the agent script so that it outputs just 0 or 1:

        #!/bin/bash
        # Script to check if TCP port 80 is listening - outputs 0 or 1
        if command -v ss &> /dev/null; then
            ss -Hltn 'sport = :80' | grep -q LISTEN
        elif command -v netstat &> /dev/null; then
            netstat -tlpn | grep -q ':80 '
        else
            echo "-1"   # Error: no tool found
            exit 1
        fi

        if [ $? -eq 0 ]; then
            echo "1"    # Port is listening
        else
            echo "0"    # Port is not listening
        fi
        exit 0

  - Update the script on target-server and re-run snmpwalk. The output should now be just STRING: "1" or STRING: "0".
  - Now add the following content to /opt/librenms/includes/definitions/applications/custom_port_check.yaml:

        app: custom_port_check               # Unique identifier for this app poller
        name: "Custom TCP Port 80 Check"     # Human-readable name
        version_oid:                         # Optional: OID to get app version, not needed here
        data:
          - graph: port_80_status            # Name of the RRD file and graph definition
            # OID for the script output. It should return the raw value (0 or 1) as a string.
            # Numeric form: .1.3.6.1.4.1.8072.1.3.2.3.1.2.12.112.111.114.116.56.48.95.99.104.101.99.107
            oid: 'NET-SNMP-EXTEND-MIB::nsExtendOutputFull."port80_check"'
            ds_name: status                  # Data source name within the RRD
            type: GAUGE                      # Data source type
            descr: "Port 80 Listening Status (1=Listening, 0=Not Listening)"
        graphs:                              # Define how to graph this data
          - port_80_status                   # Matches 'graph' name above

  - Note: nsExtendOutputFull provides the full output (possibly multi-line), while nsExtendOutput1Line provides only the first line. Since our script prints a single line, either works. The MIB name NET-SNMP-EXTEND-MIB::nsExtendOutputFull."port80_check" is generally preferred if LibreNMS can resolve it. If not, use the numeric OID.
- Enable the Application Monitor in LibreNMS UI:
  - Go to the device page for target-server in LibreNMS.
  - Click Edit (Cog Icon), then go to the Applications tab.
  - You should see "Custom TCP Port 80 Check" in the list of available applications. Enable it.
  - Click Save Changes.
- Wait for Polling and Check Data:
  - LibreNMS will poll this new application data during its next regular polling cycle for the device.
  - After 5-10 minutes, go to the target-server device page in LibreNMS.
  - Click on the Applications tab. You should see an entry for "Custom TCP Port 80 Check."
  - Click on it to see the graph for "Port 80 Listening Status." It should show a line at 1 (if listening) or 0 (if not).
  - You can now create an alert rule based on this application metric (e.g., if applications.app_custom_port_check_port_80_status.status equals 0).
Troubleshooting:
- If data doesn't appear, use poller debug for the application and look for output related to custom_port_check and any errors:

      # On LibreNMS server
      cd /opt/librenms
      sudo -u librenms ./poller.php -h TARGET_SERVER_HOSTNAME -d -m applications

- Verify OIDs carefully with snmpwalk.
- Ensure the agent script on target-server is working correctly and snmpd is restarted after config changes.
- Check /opt/librenms/logs/librenms.log for errors.
Deliverables/Reflection:
- A working agent script on a target server that reports a specific status (port listening).
- snmpd configured with an extend directive to expose the script's output.
- A custom application YAML definition in LibreNMS.
- The custom application monitor enabled for the target device in LibreNMS UI.
- A graph appearing in LibreNMS showing the status reported by your custom script.
This workshop, while simplified, demonstrates the fundamental process of extending LibreNMS to monitor custom metrics via SNMP extend scripts and application pollers. This pattern can be adapted for many types of custom checks.
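One way to adapt the pattern to other ports is to separate the parsing of the socket table from the command that produces it, which also makes the check testable without a live listener. The sketch below assumes the column layout that ss -Hltn prints (state in the first field, local address and port in the fourth). The function names are illustrative and not part of LibreNMS.

```shell
#!/bin/sh
# Sketch: decide whether a TCP port is listening, based on `ss -Hltn` output.

# stdin: output of `ss -Hltn`; $1 = port number. Exit status 0 if listening.
listening_on_port() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")      # local address is field 4, port is the last piece
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Print 1 or 0, the format used by the simplified agent script.
port_status() {
    if listening_on_port "$1"; then echo 1; else echo 0; fi
}

# Sample input (two listeners: SSH on 22 and HTTP on 80 over IPv6).
sample='LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 511 [::]:80 [::]:*'

printf '%s\n' "$sample" | port_status 80
printf '%s\n' "$sample" | port_status 443

# On a real host: ss -Hltn | port_status 80
```

Matching on the last colon-separated piece avoids the false positive that a plain grep for ':80' would give for a listener on port 8080.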
10. Performance Tuning and Scaling
As the number of monitored devices and services grows, LibreNMS performance can become a concern. Optimizing various components and potentially scaling out with distributed pollers are key to maintaining a responsive and reliable monitoring system.
Optimizing Database Performance (MySQL/MariaDB tuning)
The database is a critical component. Slow database queries can impact the web UI, poller performance, and alert processing.
- Hardware:
- SSDs: Use SSDs for your database storage. This is one of the most significant improvements.
- RAM: Ensure sufficient RAM for the database server. MariaDB/MySQL use RAM for caching (e.g., InnoDB buffer pool). More RAM means more data can be served from cache, reducing disk I/O.
- MariaDB/MySQL Configuration (my.cnf or 50-server.cnf):
  - innodb_buffer_pool_size: This is the most important setting for InnoDB tables (which LibreNMS uses). It defines the size of the memory cache for table data and indexes.
    - A common recommendation is to set this to 50-70% of available system RAM if the database server is dedicated. If it shares resources (like on the LibreNMS server itself), be more conservative.
    - Example: On a server with 8GB RAM dedicated to DB, innodb_buffer_pool_size = 4G (50%) up to roughly 5600M (70%) might be appropriate.
    - Monitor the buffer pool hit rate (derived from the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads status variables) to see if it's effective.
  - innodb_log_file_size: Size of the redo logs. Larger logs can improve write performance but increase recovery time. A common starting point is 256M or 512M. Recent MySQL and MariaDB versions resize the redo logs automatically after a clean shutdown and restart; only old versions (before MySQL 5.6.8) require you to stop mysqld, remove the old log files, and restart.
  - innodb_flush_log_at_trx_commit: Controls durability vs. performance.
    - 1 (default): Fully ACID compliant, flushes log to disk at each transaction commit (safest, but can be slower).
    - 2: Flushes log to OS cache at commit, flushes to disk once per second (good balance, much faster writes, small risk of 1-second data loss on OS crash). Often recommended for LibreNMS if minor data loss on crash is acceptable for performance.
    - 0: Writes and flushes the log once per second (fastest, but up to a second of transactions can be lost even if only mysqld crashes).
  - query_cache_size / query_cache_type: The query cache is deprecated (disabled by default in MariaDB, removed in MySQL 8.0) as it can cause contention issues. For most LibreNMS workloads, it's better left disabled or set to a very small size if enabled.
  - max_connections: Maximum number of concurrent client connections. Default is often 151. LibreNMS pollers and web UI can open multiple connections. Monitor the Threads_connected and Max_used_connections status variables. Increase if you're hitting the limit, but ensure the server has resources.
  - tmp_table_size and max_heap_table_size: Affect temporary tables created for complex queries. If you see many on-disk temporary tables, increasing these (within RAM limits) can help.
  - join_buffer_size, sort_buffer_size, read_rnd_buffer_size: Per-session buffers. Increasing them globally can consume a lot of RAM. Adjust cautiously, or consider setting them per-session for problematic queries if identified.
- Tools for Tuning:
- MySQLTuner-perl: A script that analyzes your database server's configuration and status variables and provides recommendations. Run it after the server has been active for at least 24-48 hours under normal load.
- Percona Toolkit: Includes tools like pt-query-digest for analyzing slow query logs.
- Slow Query Log: Enable the slow query log in MariaDB/MySQL to identify queries that are taking too long.

      # In my.cnf
      slow_query_log = 1
      slow_query_log_file = /var/log/mysql/mysql-slow.log
      long_query_time = 2   # Log queries taking longer than 2 seconds
      # log_queries_not_using_indexes = 1   # Optional, logs queries not using indexes

  Analyze this log to find bottlenecks. Sometimes, adding an index or rewriting a query (if it's from custom code) can help. For LibreNMS core queries, ensure your schema is up-to-date (./scripts/database-schema.sh).
- Regular Maintenance:
  - Run OPTIMIZE TABLE on frequently updated tables periodically (e.g., eventlog, syslog). This can reclaim space and reduce fragmentation. LibreNMS daily.sh might handle some of this.
  - Ensure your LibreNMS database schema is up-to-date via ./validate.php or ./scripts/database-schema.sh.
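The buffer pool guidance above reduces to two small calculations: a size derived from total RAM, and a hit rate derived from two status counters. The sketch below shows both. The function names are illustrative, the 50% figure is the lower end of the range mentioned in the text, and the counter values are made-up sample numbers standing in for Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads from SHOW GLOBAL STATUS.

```shell
#!/bin/sh
# Sketch: derive InnoDB buffer pool figures for tuning.

# Print a buffer pool size in MB for a given RAM total (MB) and percentage.
suggest_buffer_pool_mb() {
    # $1 = total RAM in MB, $2 = percentage of RAM to dedicate (e.g. 50)
    echo $(( $1 * $2 / 100 ))
}

# Print the buffer pool hit rate in percent.
# $1 = Innodb_buffer_pool_read_requests (logical reads)
# $2 = Innodb_buffer_pool_reads (reads that had to go to disk)
buffer_pool_hit_rate() {
    awk -v requests="$1" -v disk="$2" 'BEGIN {
        if (requests == 0) { print "n/a"; exit }
        printf "%.2f\n", (1 - disk / requests) * 100
    }'
}

# Example: 8 GB server dedicated to the database, 50% for the buffer pool.
printf 'innodb_buffer_pool_size = %sM\n' "$(suggest_buffer_pool_mb 8192 50)"

# Example counters (sample values, not from a real server).
buffer_pool_hit_rate 1000000 5000
```

A hit rate that stays well below 99% on a busy server usually suggests the pool is too small for the working set, though the right threshold depends on the workload.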
Distributed Polling
When monitoring a large number of devices (hundreds or thousands) or devices across different geographical locations or isolated network segments, a single LibreNMS poller can become overwhelmed. Distributed polling allows you to scale out the polling load.
- Concept:
- One central LibreNMS web server and database.
- Multiple poller instances running on different servers (these are the "distributed pollers").
- Pollers are organized into "poller groups."
- Devices are assigned to a specific poller group. The pollers in that group are then responsible for polling those devices.
- Distributed pollers communicate with the main LibreNMS database to get their list of devices and write back polling data (RRD files are typically still written by the distributed pollers locally and then synced, or rrdcached on the main server is used by all).
- Components:
- Main LibreNMS Server: Hosts the web UI, database, and central configuration. It may or may not do polling itself.
- Poller Servers: Lightweight servers (can be VMs) running the LibreNMS poller code (poller-wrapper.py or poller.php). They need network connectivity to the devices they poll AND to the central LibreNMS database and rrdcached (if used centrally).
- rrdcached: Can be run on the main LibreNMS server and accessed by all pollers, or each poller group can have its own rrdcached that syncs RRDs back to a central store. Central rrdcached is common.
- File Synchronization (e.g., rsync): RRD files generated by distributed pollers need to be available to the web UI on the main server for graphing. rsync is commonly used to synchronize RRDs from pollers to the main server if rrdcached isn't handling all writes centrally.
- Setup Steps (High-Level):
  - Prepare Poller Servers: Install OS, basic dependencies (PHP, snmp tools, python for poller-wrapper.py). No full web server or DB needed on these.
  - Install LibreNMS Code: Clone the LibreNMS git repository onto each poller server (same version as the main server).
  - Configure config.php on Pollers: Point them to the central database and central rrdcached.
  - Poller Groups: In LibreNMS UI (Global Settings > Polling > Distributed Pollers or Gear Icon > Pollers > Poller Groups):
    - Define poller groups (e.g., "US-East-Pollers," "Datacenter1-Pollers"). Default is group 0.
    - Each poller server needs to be configured with a unique poller name (by default its hostname) and assigned to a poller group in its config.php or via environment variables.
  - Assign Devices to Poller Groups: In the device settings (Edit device > Modules/Polling), assign the device to the appropriate poller group.
  - Set up Cron Jobs on Pollers: Each poller server runs the standard LibreNMS cron jobs (discovery, poller, etc.), but they will only act on devices assigned to their group.
  - RRD Synchronization: If RRDs are written locally by pollers, set up rsync jobs to copy RRD files from each poller's /opt/librenms/rrd directory to the main server's /opt/librenms/rrd directory. This needs to be done carefully to avoid conflicts (e.g., pollers for different groups should write to distinct subdirectories if syncing to a common RRD path, or ensure devices are strictly partitioned by group). The most robust method is often using a central rrdcached instance that all pollers write to.
  - Time Synchronization: Crucial. All poller servers and the main server must have their time synchronized via NTP.
- Benefits:
  - Improved poller performance and reduced polling cycle times.
  - Ability to monitor devices in isolated networks (poller placed within that network).
  - Increased redundancy (if one poller in a group fails, others can potentially take over if configured for HA, though this is more complex).
Distributed polling adds complexity but is essential for large-scale deployments.
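For setups that rely on rsync rather than a central rrdcached, it is worth reviewing the exact commands before scheduling them. The sketch below only prints the rsync command for each poller instead of running it. The poller host names are placeholders, the function name is illustrative, and the pull direction (main server fetching from pollers) is one possible design rather than a requirement.

```shell
#!/bin/sh
# Sketch: print the rsync commands that pull RRD files from each poller.
# Poller host names below are placeholders.

RRD_DIR="/opt/librenms/rrd"

# Print the rsync command for one poller (nothing is executed here).
rrd_sync_command() {
    # $1 = poller host name
    # -a keeps permissions and timestamps, -z compresses in transit,
    # --update skips files that are newer on the receiving side.
    printf 'rsync -az --update librenms@%s:%s/ %s/\n' "$1" "$RRD_DIR" "$RRD_DIR"
}

# Review the commands first; pipe the output to sh to actually run them.
for poller in poller-east.example.com poller-west.example.com; do
    rrd_sync_command "$poller"
done
```

Because each device belongs to exactly one poller group, the per-device RRD directories from different pollers should not collide, but that assumption is worth verifying before the first real run.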
Web Server Optimization (Nginx/Apache)
The web server serving the LibreNMS UI can also be a bottleneck, especially with many concurrent users or frequent API calls.
- Nginx (Recommended):
  - PHP-FPM Tuning: The number of PHP-FPM child processes and their configuration is critical. In /etc/php/VERSION/fpm/pool.d/www.conf (or a dedicated librenms.conf pool):
    - pm: Process manager. dynamic or ondemand are common. static can be used if you know exact needs.
    - pm.max_children: Max number of concurrent PHP requests. Depends on RAM per child and total available RAM. If too low, users see delays/errors. If too high, server can run out of memory.
    - pm.start_servers, pm.min_spare_servers, pm.max_spare_servers (for dynamic PM).
    - pm.process_idle_timeout (for ondemand PM).
    - listen: Ensure it matches the fastcgi_pass directive in your Nginx config.
    - Ensure PHP-FPM runs as the librenms user for correct file permissions.
  - Nginx Worker Processes: In /etc/nginx/nginx.conf:
    - worker_processes: Typically set to the number of CPU cores, or auto.
    - worker_connections: Max connections per worker.
  - Caching:
    - Enable browser caching for static assets (JS, CSS, images) using expires headers in Nginx.
    - Consider fastcgi_cache for caching PHP responses for frequently accessed, non-dynamic pages (use with caution for a dynamic UI like LibreNMS, might be better for API).
  - Keepalive Connections: Enable HTTP keepalive to reduce connection overhead.
  - Gzip Compression: Enable gzip for text-based content (HTML, CSS, JS, JSON) to reduce bandwidth.
- Apache:
  - MPM Module: Choose the right Multi-Processing Module (MPM):
    - mpm_event (default in newer Apache): Good for high concurrency, uses threads.
    - mpm_worker: Also threaded, older than event.
    - mpm_prefork: Uses processes, less memory efficient but sometimes considered more stable for non-thread-safe PHP modules (though mod_php with prefork is less common now than PHP-FPM). When using PHP-FPM with Apache (via mod_proxy_fcgi), Apache's MPM choice is less critical for PHP performance itself, but mpm_event is still generally preferred for Apache's own efficiency.
  - PHP Handler: Use PHP-FPM with mod_proxy_fcgi for better performance and flexibility than mod_php.
  - MaxRequestWorkers / ServerLimit (for event/worker MPMs): Similar to pm.max_children for PHP-FPM, controls concurrent requests Apache can handle.
  - KeepAlive, Gzip, Caching: Similar principles as Nginx.
- General Web Server Tips:
  - Use HTTPS (SSL/TLS) for security. Modern CPUs have AES acceleration, so performance impact is often minimal. HTTP/2 can also improve performance.
  - Monitor web server logs and PHP-FPM logs for errors or performance issues.
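Since pm.max_children depends on RAM per child and total available RAM, a first estimate is simply the memory budget divided by the average size of one PHP-FPM worker. The sketch below shows that arithmetic. The function name and the example figures (2048 MB budget, 64 MB per child) are assumptions for illustration, and the PHP-FPM process name in the final comment varies with the installed PHP version.

```shell
#!/bin/sh
# Sketch: estimate pm.max_children from a memory budget.

# $1 = RAM reserved for PHP-FPM in MB, $2 = average memory per child in MB
fpm_max_children() {
    if [ "$2" -le 0 ]; then
        echo "per-child memory must be positive" >&2
        return 1
    fi
    echo $(( $1 / $2 ))     # integer division rounds down, the safe direction
}

# Example: 2048 MB set aside for PHP, about 64 MB per child (assumed values).
printf 'pm.max_children = %s\n' "$(fpm_max_children 2048 64)"

# The average resident size of running workers can be measured with something like
# (adjust the process name to your PHP version):
#   ps -o rss= -C php-fpm8.2 | awk '{ s += $1; n++ } END { if (n) print s / n / 1024 " MB" }'
```

Treat the result as an upper bound and leave headroom for the database, rrdcached, and the pollers if they share the host.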
Managing rrdcached
As mentioned, rrdcached is crucial for RRD I/O performance.
- Configuration:
  - Usually configured in /etc/default/rrdcached (Debian/Ubuntu) or via systemd unit.
  - Key options (OPTS holds the command-line options for rrdcached):
    - -l unix:/var/run/rrdcached.sock: Listen on a Unix socket (common). Or -l IP:PORT to listen on a TCP socket (needed if pollers are on different hosts).
    - -w <timeout>: Write timeout (e.g., 1800s). How long data can stay in cache before being written to disk.
    - -f <timeout>: Flush timeout (e.g., 3600s). How often rrdcached searches the cache for old values (for example from files that no longer receive updates) and writes them to disk.
    - -p <pidfile>: Path to PID file.
    - -j <journal_dir>: Directory for RRD journal files (for recovery if rrdcached crashes).
    - -B: Only permit writes inside the base directory given with -b.
    - -R: Allow recursive directory creation (check permissions).
    - -t <num_threads>: Number of write threads.
  - Ensure the Unix socket or TCP port is accessible by LibreNMS pollers (and the web UI if it also writes/reads through rrdcached).
  - Permissions: The rrdcached process must have write access to its journal directory and to the RRD files/directory (/opt/librenms/rrd). The librenms user typically owns RRDs. rrdcached might run as its own user or the librenms user.
- LibreNMS config.php: Set $config['rrdcached'] to the address given with -l, for example unix:/var/run/rrdcached.sock, or HOST:PORT for a TCP listener.
- Monitoring rrdcached:
  - Send the STATS command to the daemon's socket to get statistics (queue length, number of flushes, etc.), for example with socat or nc -U.
  - Monitor its log files if configured.
- Sizing Cache: rrdcached primarily caches writes. The actual "cache" size isn't configured like a database buffer pool; it's more about managing write queues and journal files. Ensure sufficient disk space for the journal if enabled.
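The queue length is the statistic most worth tracking, since a steadily growing queue means rrdcached cannot write as fast as the pollers submit updates. The sketch below extracts a single counter from the daemon's STATS reply. The function name is illustrative, the sample reply follows the "Name: value" layout described in the rrdcached manual, and the socket path in the final comment is an assumption that must match your -l option.

```shell
#!/bin/sh
# Sketch: read one counter from rrdcached STATS output.

# Read STATS output on stdin and print the value of the named counter.
rrdcached_stat() {
    # $1 = counter name, e.g. QueueLength
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Sample reply: a status line followed by "Name: value" lines.
sample_stats='9 Statistics follow
QueueLength: 0
UpdatesReceived: 30
FlushesReceived: 2
UpdatesWritten: 13
DataSetsWritten: 390
TreeNodesNumber: 13
TreeDepth: 4
JournalBytes: 190
JournalRotate: 0'

printf '%s\n' "$sample_stats" | rrdcached_stat QueueLength

# Against a live daemon (socket path assumed), the input could come from:
#   printf 'STATS\nQUIT\n' | socat - UNIX-CONNECT:/var/run/rrdcached.sock | rrdcached_stat QueueLength
```

The same function works for any other counter in the reply, such as UpdatesWritten or JournalBytes.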
Workshop Setting up a Distributed Poller (Conceptual or Simplified)
Objective:
To understand the configuration steps for a distributed poller. A full multi-VM setup can be complex for a workshop. We'll outline the key steps and, if possible, simulate a second "poller" on the same machine for conceptual understanding (not a production setup).
Scenario A: Full Conceptual Outline (Ideal but resource-intensive for a workshop)
- VM1: Main LibreNMS Server:
  - Already installed and running LibreNMS, database, web UI.
  - rrdcached running and configured to listen on a TCP socket (e.g., 0.0.0.0:42217).
  - In UI: Gear Icon > Pollers > Settings: Ensure poller-wrapper.py is selected if you intend to use it.
  - In UI: Gear Icon > Pollers > Poller Groups: Create a new group, e.g., Group ID 1, Name RemoteSitePollers.
- VM2: Distributed Poller Server:
  - Fresh OS (e.g., Ubuntu Server).
  - Install dependencies: git, python3 (for poller-wrapper.py), php-cli, php-snmp, snmp, fping, etc. (core poller dependencies, no web server or DB needed).
  - Clone LibreNMS: sudo git clone https://github.com/librenms/librenms.git /opt/librenms
  - Create librenms user and set permissions on /opt/librenms.
  - Create /opt/librenms/config.php with:

        <?php
        // Database config - point to VM1's database
        $config['db_host'] = 'IP_OF_VM1_MAIN_LIBRENMS';
        $config['db_user'] = 'librenms';
        $config['db_pass'] = 'your_db_password';
        $config['db_name'] = 'librenms';

        $config['user'] = 'librenms';  // User LibreNMS runs as

        // Distributed Poller Config
        $config['distributed_poller'] = true;
        $config['distributed_poller_name'] = 'poller-vm2';       // Unique name for this poller
        $config['distributed_poller_group'] = 1;                 // Assign to group 1 created on main server
        $config['rrdcached'] = 'IP_OF_VM1_MAIN_LIBRENMS:42217';  // Point to central rrdcached

        $config['install_dir'] = '/opt/librenms';

        // Add any other necessary base configs usually found in config.php
        $config['snmp']['community'] = array("public", "your_other_communities");
        // ...

  - Copy cron job: sudo cp /opt/librenms/librenms.cron /etc/cron.d/librenms (or the poller-wrapper.py cron setup).
  - RRD directory: The RRDs will be written via rrdcached to VM1. Ensure /opt/librenms/rrd exists on VM2, but it might not be heavily used if all writes go through central rrdcached.
  - Ensure VM2 can reach VM1's database (port 3306) and rrdcached (port 42217). Adjust firewalls.
  - Database user librenms on VM1 must be configured to allow connections from VM2's IP (e.g., GRANT ALL ON librenms.* TO 'librenms'@'IP_OF_VM2' IDENTIFIED BY 'your_db_password';). On MySQL 8.0 and later, GRANT no longer accepts IDENTIFIED BY, so create the user with CREATE USER first and then run the GRANT without that clause.
- On Main LibreNMS Server (VM1):
  - Assign a device (or several) to Poller Group 1 via the device's Edit page.
  - In UI: Gear Icon > Pollers > Pollers: After VM2's poller service starts and communicates, poller-vm2 should appear in the list, associated with group 1.
- Verify:
  - Monitor poller logs on VM2 (/opt/librenms/logs/librenms.log).
  - Check poller_perf in the database or UI to see if poller-vm2 is polling its assigned devices.
  - Graphs for devices in group 1 should update.
Scenario B: Simplified Conceptual Simulation (Single Machine - NOT for Production)
This is to understand file structure and config, but poller-wrapper.py handles multi-poller on one host better. This is more for a very basic illustration of separate config for a "different" poller.
- On your existing LibreNMS server:
  - In UI: Create Poller Group 1.
  - Assign one of your test devices to Poller Group 1.
- Simulate a second poller instance configuration:
  - Make a copy of your /opt/librenms directory (e.g., /opt/librenms_poller2). This is messy and not recommended for real use. A better way is to use the same codebase but a different config file for a separate poller process. poller-wrapper.py is designed for this: its cron job can spawn multiple poller threads/processes based on CPU cores and configuration. Let's focus on the poller-wrapper.py approach, which is standard.
  - Using poller-wrapper.py and its native multi-poller capabilities:
    - The LibreNMS poller-wrapper.py script (run by cron) can manage multiple poller processes/threads on a single machine.
    - Edit /opt/librenms/config.php to define the poller group for the current instance if it's not the default 0, or rely on poller_id in the pollers table.
    - poller-wrapper.py itself handles running pollers for devices assigned to groups that this poller instance is part of.
    - To truly have a "distributed" poller, it needs to be on a separate host.
  - Conceptual Configuration for a Separate Poller (if it were on another machine): If you had another machine, you'd set its /opt/librenms/config.php to point to the central DB and rrdcached, and assign it a unique distributed_poller_name and distributed_poller_group. Then its cron job would run pollers for devices in group 1.

Key Takeaway for the Workshop:
The core is understanding that:
- Distributed pollers are separate LibreNMS poller instances (code + cron).
- They connect to a central database and often a central rrdcached.
- They are assigned a poller group (distributed_poller_group) in their config.php.
- Devices in the UI are assigned to these poller groups.
- Firewall rules must allow communication between pollers and the central services.
For practical experience without multiple VMs:
- Ensure poller-wrapper.py is used by your cron job (this is default in newer installs).
- Go to Gear Icon > Pollers > Poller Groups and create a group (e.g., group 1).
- Edit one of your devices and assign it to this new poller group 1.
- Go to Gear Icon > Pollers > Pollers. Your main poller (e.g., your server's hostname) will be listed, likely associated with several groups including the default (e.g., -1 or 0) and any new ones you add devices to if no specific poller is configured for that group.
- If you were to add another poller server, it would register itself with its name and group, and then only pollers in that specific group would handle devices assigned to it.
This conceptual understanding is more feasible for a typical workshop environment than a full multi-VM setup unless dedicated lab resources are available. The main point is to grasp the configuration and data flow.
11. Security Best Practices
Securing your LibreNMS installation is paramount, as it contains sensitive information about your network infrastructure and credentials to access monitored devices.
Securing the Web Interface (HTTPS, Authentication)
- HTTPS (SSL/TLS):
  - Always use HTTPS for the LibreNMS web interface to encrypt traffic between users and the server. This protects login credentials and all viewed data.
  - Obtain an SSL Certificate:
    - Let's Encrypt (Recommended for public servers): Free, automated certificates. Use tools like certbot.
    - Commercial Certificates: Purchase from a Certificate Authority (CA).
    - Self-Signed Certificates: Can be used for internal-only access, but browsers will show warnings. Not recommended if users access from outside a trusted zone.
  - Configure Web Server for HTTPS:
    - Nginx: certbot usually handles this. Manually, you'd modify your Nginx virtual host:

          listen 443 ssl http2;
          listen [::]:443 ssl http2;
          ssl_certificate /etc/letsencrypt/live/your.librenms.domain.com/fullchain.pem;
          ssl_certificate_key /etc/letsencrypt/live/your.librenms.domain.com/privkey.pem;
          # Add other SSL hardening options (protocols, ciphers, HSTS)
          ssl_protocols TLSv1.2 TLSv1.3;
          ssl_prefer_server_ciphers off; # Or on with a strong cipher list
          ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384';
          add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

    - Apache: Similar configuration using SSLEngine on and specifying certificate paths.
  - HTTP to HTTPS Redirection: Configure your web server to automatically redirect all HTTP requests to HTTPS.
- Strong Authentication:
- User Passwords: Enforce strong, unique passwords for all LibreNMS user accounts.
- Two-Factor Authentication (2FA): Highly recommended. LibreNMS supports 2FA (e.g., TOTP with Google Authenticator). Users can enable it in their profile settings. Admins can enforce it globally via config.php.
- Centralized Authentication (LDAP/RADIUS/SAML):
- Integrate with existing identity providers like Active Directory (LDAP), FreeIPA (LDAP), or SAML providers.
- This centralizes user management, password policies, and can enforce organizational security standards.
- Configuration is done in Global Settings > Authentication.
- Access Control:
- Use LibreNMS user roles (Administrator, Normal User, Global Read) appropriately.
- Limit administrator access to only those who absolutely need it.
- Use device groups to restrict what normal users can see and manage.
- Web Application Firewall (WAF):
- Consider placing a WAF (like ModSecurity with Nginx/Apache, or a cloud WAF) in front of LibreNMS to protect against common web attacks (SQL injection, XSS), although LibreNMS itself is generally well-written to prevent these.
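The HTTP to HTTPS redirection described above can be checked from any machine with curl. The following is a minimal sketch, assuming curl is installed; the helper names is_https_redirect and check_site and the example hostname are illustrative and not part of LibreNMS.

```shell
#!/bin/sh
# Succeeds when an HTTP status/Location pair describes a redirect to HTTPS.
# $1 = HTTP status code, $2 = redirect target (may be empty)
is_https_redirect() {
  case "$1" in
    301|302|307|308) ;;
    *) return 1 ;;
  esac
  case "$2" in
    https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Request the plain-HTTP URL without following redirects and classify the answer.
# $1 = hostname (hypothetical example: librenms.yourdomain.com)
check_site() {
  result="$(curl -s -o /dev/null -w '%{http_code} %{redirect_url}' "http://$1/")"
  status="${result%% *}"   # text before the first space
  target="${result#* }"    # text after the first space (empty if no redirect)
  if is_https_redirect "$status" "$target"; then
    echo "OK: http://$1/ redirects to $target"
  else
    echo "WARN: http://$1/ answered $status without an HTTPS redirect"
    return 1
  fi
}
```

Calling check_site librenms.yourdomain.com after changing the web server configuration gives a quick pass/fail answer without opening a browser.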
Hardening the Operating System
The underlying OS of your LibreNMS server (and any distributed pollers) must be secured.
- Minimize Attack Surface:
- Install only necessary packages. Start with a minimal server install.
- Disable or remove unused services and daemons.
- Close unneeded network ports using a firewall.
- Firewall:
- Use a host-based firewall (e.g., ufw on Ubuntu, firewalld on CentOS/RHEL).
- Allow only essential inbound traffic:
- SSH (TCP 22) - preferably restricted to trusted IPs.
- HTTP (TCP 80) - if used for Let's Encrypt validation or redirecting to HTTPS.
- HTTPS (TCP 443) - for the web UI.
- SNMP (UDP 161) - if monitoring the LibreNMS server itself.
- SNMP Traps (UDP 162) - if LibreNMS is configured to receive traps.
- Database port (TCP 3306) - if accessed by remote pollers.
- rrdcached port - if accessed by remote pollers.
- Syslog port (UDP/TCP 514) - if receiving syslog.
- Regular Updates:
- Keep the OS and all installed packages up-to-date with security patches:
    sudo apt update && sudo apt upgrade -y   # Ubuntu/Debian
- Configure automatic updates for security patches if appropriate for your policy.
- Secure SSH:
- Disable root login: PermitRootLogin no in /etc/ssh/sshd_config.
- Use key-based authentication instead of passwords.
- Change the default SSH port (security by obscurity, but can reduce automated scans).
- Use Fail2Ban or sshguard to block IPs that attempt brute-force SSH logins.
- User Accounts:
- Use non-root users with sudo for administration.
- Ensure the librenms user has minimal necessary privileges and a non-login shell if it doesn't need to log in directly (sudo usermod -s /usr/sbin/nologin librenms). Note: some scripts might require a shell, so test carefully; /bin/bash is often set for the librenms user.
- Intrusion Detection/Prevention (IDS/IPS):
- Consider host-based IDS like AIDE (Advanced Intrusion Detection Environment) for file integrity monitoring, or OSSEC/Wazuh.
- Logging and Auditing:
- Ensure system logs (/var/log/syslog, /var/log/auth.log) are regularly monitored or forwarded to a central SIEM.
- Enable auditd for more detailed system call auditing if needed.
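The inbound port list from the Firewall item can be turned into ufw rules. The sketch below only prints the commands so they can be reviewed first; it assumes ufw on Ubuntu, and the ADMIN_NET value and the firewall_rules helper are hypothetical. Lines for services you do not run (traps, syslog) should be dropped, and rules for remote pollers (database, rrdcached) added if needed.

```shell
#!/bin/sh
# Prints, but does not execute, ufw commands for the inbound services listed above.
# ADMIN_NET is a hypothetical trusted network allowed to reach SSH.
ADMIN_NET="${ADMIN_NET:-192.0.2.0/24}"

firewall_rules() {
  cat <<EOF
ufw default deny incoming
ufw default allow outgoing
ufw allow from $ADMIN_NET to any port 22 proto tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow 162/udp
ufw allow 514/udp
EOF
}

firewall_rules
```

After reviewing the output, the rules could be applied with firewall_rules | sudo sh followed by sudo ufw enable. Keeping the SSH rule in the list before enabling the firewall avoids locking yourself out.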
Database Security
- Strong Passwords: Use strong, unique passwords for the MariaDB/MySQL root user and the librenms database user.
- Network Access:
- By default, MariaDB/MySQL often binds to 127.0.0.1 (bind-address = 127.0.0.1 in my.cnf). If your database is on the same server as LibreNMS and no remote pollers need access, this is the most secure.
- If remote pollers need database access, bind to the specific internal IP address and use firewall rules to restrict access to only the IPs of your LibreNMS server and poller machines.
- User Privileges:
- The librenms database user should only have privileges on the librenms database itself (GRANT ALL PRIVILEGES ON librenms.* TO ...). Avoid granting global privileges.
- Encryption:
- Consider enabling SSL/TLS for connections to the database server if it's accessed over an untrusted network (e.g., between remote pollers and the central DB).
- Explore MariaDB/MySQL Transparent Data Encryption (TDE) for data-at-rest encryption if required by compliance.
- Regular Backups: Essential (covered below).
- Remove Test Database and Anonymous Users: mysql_secure_installation usually handles this.
SNMPv3 Configuration for Enhanced Security
SNMPv1 and v2c rely on community strings (plain text passwords), which are insecure. SNMPv3 provides authentication and encryption.
- Benefits of SNMPv3:
- Authentication: Verifies the identity of the NMS and the agent.
- Encryption (Privacy): Protects data in transit from eavesdropping.
- Message Integrity: Ensures messages are not tampered with.
- Configuring SNMPv3 on a Managed Device (e.g., Linux snmpd):
- Stop snmpd service.
- Create SNMPv3 User: Use the net-snmp-create-v3-user utility or manually edit snmpd.conf and /var/lib/snmp/snmpd.conf (or /var/net-snmp/snmpd.conf).
- Example using net-snmp-create-v3-user (this command might not be available on all systems or might be part of a sub-package):
    # This command often modifies /var/lib/snmp/snmpd.conf directly
    sudo net-snmp-create-v3-user -ro -A YourAuthPassword -X YourPrivPassword -a SHA -x AES snmpv3user
    # -ro: read-only user
    # -A: Authentication password
    # -X: Privacy (encryption) password
    # -a: Authentication protocol (SHA or MD5)
    # -x: Privacy protocol (AES or DES)
    # snmpv3user: the username
- Manual snmpd.conf configuration: In /etc/snmp/snmpd.conf:
    # First, if you don't have a master agentx socket, you may need this line
    # master agentx
    # Create a read-only user with authentication (SHA) and privacy (AES)
    # Replace YourAuthPassword and YourPrivPassword with strong, unique passwords
    # Minimum password length is 8 characters for net-snmp.
    createUser snmpv3user SHA "YourAuthPassword" AES "YourPrivPassword"
    # Grant read-only access to this user for the entire MIB tree
    rouser snmpv3user priv .1
    # 'priv' means both authentication and privacy are required.
    # 'auth' would mean only authentication is required (no encryption).
    # '.1' gives access to the entire MIB tree starting from .iso(1)
Note: When snmpd starts, it processes the createUser directive and stores the resulting user, with hashed keys, in its persistent configuration, usually /var/lib/snmp/snmpd.conf or /var/net-snmp/snmpd.conf. After the first start, snmpd reads the user from that persistent file. Remove the createUser line from /etc/snmp/snmpd.conf once the user exists: it keeps both passwords in clear text and may cause errors on subsequent restarts. Alternatively, manage users directly in the persistent file.
- Start snmpd service.
- Test SNMPv3 from LibreNMS server:
    snmpwalk -v3 -l authPriv -u snmpv3user -a SHA -A YourAuthPassword -x AES -X YourPrivPassword TARGET_DEVICE_IP system
    # -l authPriv: Security level (authentication and privacy)
    # -u snmpv3user: Username
    # -a SHA: Authentication protocol
    # -A YourAuthPassword: Authentication password
    # -x AES: Privacy protocol
    # -X YourPrivPassword: Privacy password
- Adding SNMPv3 Device in LibreNMS:
- When adding or editing a device in LibreNMS UI:
- Select SNMP Version: v3.
- Auth Level: Choose authPriv, authNoPriv, or noAuthNoPriv.
- Auth Username: snmpv3user.
- Auth Algo: SHA (or MD5).
- Auth Password: YourAuthPassword.
- Crypto Algo: AES (or DES).
- Crypto Password: YourPrivPassword.
Transitioning to SNMPv3 significantly improves the security of your monitoring data. It's more complex to set up but highly recommended.
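Because net-snmp rejects passphrases shorter than 8 characters, it can help to validate them before touching snmpd.conf. The sketch below prints the two configuration lines from the manual example for a given user; the snmpv3_user_lines helper is hypothetical, and it assumes passphrases that contain no double quotes.

```shell
#!/bin/sh
# Print the snmpd.conf lines for a read-only SNMPv3 user (SHA auth, AES privacy).
# Refuses passphrases shorter than 8 characters, the net-snmp minimum.
# $1 = username, $2 = auth passphrase, $3 = privacy passphrase
snmpv3_user_lines() {
  user="$1"; auth_pass="$2"; priv_pass="$3"
  if [ "${#auth_pass}" -lt 8 ] || [ "${#priv_pass}" -lt 8 ]; then
    echo "error: passphrases must be at least 8 characters" >&2
    return 1
  fi
  printf 'createUser %s SHA "%s" AES "%s"\n' "$user" "$auth_pass" "$priv_pass"
  printf 'rouser %s priv .1\n' "$user"
}

# Example with the placeholder values used in this section.
snmpv3_user_lines snmpv3user 'YourAuthPassword' 'YourPrivPassword'
```

The output can be pasted into /etc/snmp/snmpd.conf while snmpd is stopped, which keeps the username consistent between the createUser and rouser lines.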
Regular Backups and Disaster Recovery
Data loss can be catastrophic. Regular backups of your LibreNMS server are essential.
- What to Back Up:
- LibreNMS Database: This contains device configurations, alert rules, user accounts, event history, etc.
- RRD Files: These store all your historical graph data.
- Located in /opt/librenms/rrd/.
- Can be backed up using rsync, tar, or filesystem snapshots. RRD files can be numerous and take up space.
    sudo rsync -avz /opt/librenms/rrd/ /backup_path/rrd_backup/
    # Or tar:
    # sudo tar -czvf /backup_path/librenms_rrd_backup_$(date +%Y%m%d).tar.gz /opt/librenms/rrd
Note: Backing up live RRD files can sometimes lead to slightly inconsistent files if writes occur during backup. Stopping rrdcached and pollers during RRD backup is safest but causes a monitoring gap. Alternatively, if using LVM, take an LVM snapshot and back up from the snapshot.
- LibreNMS Configuration Files:
- /opt/librenms/.env: Contains database credentials and other critical settings.
- /opt/librenms/config.php: Your main custom configuration.
- Custom OS definitions, MIBs, poller/discovery scripts, application monitor definitions if you created them.
- Web server configuration (Nginx/Apache virtual hosts).
- PHP configuration.
- Cron jobs (/etc/cron.d/librenms).
- The LibreNMS Application Code (/opt/librenms/): While this can be re-cloned from Git, backing up your specific version along with local modifications (if any, though not recommended for core files) can be useful.
- Backup Strategy:
- Frequency:
- Database: Daily or more frequently for critical systems.
- RRDs: Daily or weekly (depends on how much historical graph data loss you can tolerate). RRDs are less critical than the DB for immediate operational recovery but vital for historical trends.
- Configuration files: After every significant change, and regularly (e.g., daily).
- Retention: Keep multiple backup versions (e.g., daily for a week, weekly for a month, monthly for a year).
- Location: Store backups on a separate physical server, NAS, or cloud storage (e.g., S3). Follow the 3-2-1 backup rule (3 copies, 2 different media, 1 offsite).
- Automation: Use cron jobs and scripting to automate backups.
- Testing: Regularly test your backup restoration process to ensure backups are valid and you know how to recover. This is the most crucial and often overlooked step.
- Disaster Recovery Plan:
- Document the steps to rebuild your LibreNMS server from scratch using your backups.
- Include OS installation, dependency setup, restoring database, RRDs, and configurations.
- Consider how long the recovery process will take (Recovery Time Objective - RTO) and how much data loss is acceptable (Recovery Point Objective - RPO).
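The backup items and the retention idea above can be combined into one small script. This is a sketch under stated assumptions rather than a finished tool: BACKUP_DIR, KEEP_DAYS, and the helper names are hypothetical, the database is assumed to be called librenms, and mysqldump is assumed to find its credentials in the invoking user's ~/.my.cnf.

```shell
#!/bin/sh
# Hypothetical settings; adjust to your environment.
BACKUP_DIR="${BACKUP_DIR:-/backup_path}"
KEEP_DAYS="${KEEP_DAYS:-7}"

# Build a dated archive name such as librenms_db_20240131.sql.gz.
# $1 = label, $2 = extension, $3 = optional YYYYMMDD stamp (defaults to today)
backup_name() {
  printf 'librenms_%s_%s.%s\n' "$1" "${3:-$(date +%Y%m%d)}" "$2"
}

# Delete librenms_* archives in directory $1 that are older than KEEP_DAYS days.
prune_old() {
  find "$1" -type f -name 'librenms_*' -mtime +"$KEEP_DAYS" -exec rm -f {} +
}

# Dump the database, archive RRDs and key config files, then apply retention.
# Not called automatically; run it from cron as a user that can read all paths.
run_backup() {
  mysqldump --single-transaction librenms | gzip > "$BACKUP_DIR/$(backup_name db sql.gz)"
  tar -czf "$BACKUP_DIR/$(backup_name rrd tar.gz)" -C /opt/librenms rrd
  tar -czf "$BACKUP_DIR/$(backup_name conf tar.gz)" -C /opt/librenms .env config.php
  prune_old "$BACKUP_DIR"
}
```

A daily cron entry calling run_backup covers the database and configuration frequency suggested above. Copying BACKUP_DIR to a second machine or object storage is still needed to satisfy the 3-2-1 rule, and a restore test remains the only proof that the archives are usable.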
Workshop Implementing HTTPS and SNMPv3
Objective:
To secure the LibreNMS web interface with a Let's Encrypt HTTPS certificate and reconfigure one monitored device (e.g., localhost) to use SNMPv3.
Prerequisites:
- LibreNMS installed and accessible via HTTP.
- The LibreNMS server must be reachable from the internet on port 80 for the HTTP-01 challenge (or on port 443 for TLS-ALPN-01), unless you use the DNS-01 challenge, which is more complex to set up. If your server is not public, you can only do the SNMPv3 part or use a self-signed certificate for HTTPS (which will generate browser warnings).
- A registered domain name pointing to your LibreNMS server's public IP address (e.g., librenms.yourdomain.com).
- certbot installed (as shown in theory section).
Part 1: Securing Web Interface with Let's Encrypt HTTPS
(Skip this part if your server is not publicly accessible or you don't have a domain name. You can still learn from the steps.)
- Ensure DNS is Set Up:
- Your domain (e.g., librenms.yourdomain.com) must resolve to the public IP address of your LibreNMS server.
- Install Certbot (if not already done), assuming Nginx: install the certbot package together with its Nginx plugin.
- Stop Nginx Temporarily (Optional): This is only needed for Certbot's standalone mode, which binds port 80 itself. The Nginx plugin works with Nginx running, so leave it up in that case.
- Obtain and Install Certificate:
- Run Certbot with the Nginx plugin, replacing librenms.yourdomain.com with your actual domain and providing your email.
- If Certbot asks whether to redirect HTTP traffic to HTTPS, choose option 2 (Redirect).
- If successful, Certbot will configure Nginx to use the SSL certificate and set up automatic renewal.
- Restart Nginx (if you stopped it manually; Certbot may already have reloaded it).
- Verify HTTPS:
- Open your web browser and navigate to https://librenms.yourdomain.com.
- You should see a padlock icon indicating a secure connection.
- Try accessing via http://librenms.yourdomain.com; it should automatically redirect to HTTPS.
- Check Auto-Renewal:
- Certbot sets up a cron job or systemd timer for renewal. Test it with a dry run: sudo certbot renew --dry-run
- This should complete without errors.
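The Part 1 steps can be sketched as one command sequence. The script below assumes Ubuntu/Debian with Nginx and the python3-certbot-nginx plugin package; DOMAIN, EMAIL, DRY_RUN, and the run and obtain_certificate helpers are illustrative names. By default it only prints the commands, so nothing changes until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical values; replace with your own domain and contact address.
DOMAIN="${DOMAIN:-librenms.yourdomain.com}"
EMAIL="${EMAIL:-admin@yourdomain.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

obtain_certificate() {
  run sudo apt install -y certbot python3-certbot-nginx
  # --redirect asks Certbot to add the HTTP to HTTPS redirect itself.
  run sudo certbot --nginx -d "$DOMAIN" -m "$EMAIL" --agree-tos --redirect
  run sudo nginx -t
  run sudo systemctl reload nginx
  run sudo certbot renew --dry-run
}

obtain_certificate
```

Running the script once with the defaults shows exactly what would be executed; DRY_RUN=0 DOMAIN=... EMAIL=... then performs the real request.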
Part 2: Reconfiguring localhost (LibreNMS Server) for SNMPv3
- Configure snmpd on LibreNMS Server for SNMPv3:
- Log in to your LibreNMS server via SSH.
- Stop snmpd: sudo systemctl stop snmpd
- Edit /etc/snmp/snmpd.conf. Remove or comment out existing rocommunity or rwcommunity lines for localhost if they conflict. Add the following (replace passwords with your own strong ones, min 8 chars):
    # master agentx # Uncomment if needed and not already present
    createUser librenms_v3user SHA "Str0ngAuthP@sswOrd" AES "Str0ngPrivP@sswOrd"
    rouser librenms_v3user priv .1
(As noted before, after snmpd starts once and creates the user in its persistent store (e.g., /var/lib/snmp/snmpd.conf), you might remove/comment the createUser line from /etc/snmp/snmpd.conf to prevent errors on subsequent restarts, or simply ignore the startup warnings if they occur. The user will persist.)
- Start snmpd: sudo systemctl start snmpd
- Test SNMPv3 Locally:
    snmpwalk -v3 -l authPriv -u librenms_v3user -a SHA -A "Str0ngAuthP@sswOrd" -x AES -X "Str0ngPrivP@sswOrd" localhost system
You should get SNMP output. If not, troubleshoot snmpd.conf, passwords, or check journalctl -u snmpd for errors.
- Update localhost Device in LibreNMS UI for SNMPv3:
- Navigate to your LibreNMS instance (now via HTTPS if you did Part 1).
- Go to Devices > All Devices, click on localhost.
- Click the Edit icon (Cog).
- Go to the SNMP tab (or section).
- SNMP Version: Select v3.
- Auth Level: authPriv.
- Auth Username: librenms_v3user.
- Auth Algo: SHA.
- Auth Password: Str0ngAuthP@sswOrd.
- Crypto Algo: AES. (Ensure it's AES-128, which is typical. If snmpd uses AES-192 or AES-256, ensure the LibreNMS choice matches. The default AES in net-snmp is AES-128.)
- Crypto Password: Str0ngPrivP@sswOrd.
- SNMP Port / Timeout / Retries / etc.: Leave as default unless changed on snmpd.
- Click Save Changes (or "Update Device").
- Verify Polling:
- Wait for the next polling cycle (up to 5 minutes).
- The localhost device page should continue to update with new data.
- Check the Event Log for localhost. You should see successful SNMPv3 polling events. If you see errors, double-check all SNMPv3 parameters in LibreNMS and in the snmpd configuration. Pay close attention to passwords and Auth/Crypto algorithms.
Deliverables/Reflection:
- LibreNMS web interface accessible via HTTPS with a valid certificate (if Part 1 was feasible).
- The localhost device successfully monitored by LibreNMS using SNMPv3.
- Understanding of the steps to configure SNMPv3 on a Linux host and update device settings in LibreNMS.
This workshop enhances the security of your LibreNMS setup significantly by encrypting web traffic and SNMP communication for at least one device. You can apply the SNMPv3 process to other monitored devices as well.
12. Troubleshooting LibreNMS
Even with careful setup, you may encounter issues with LibreNMS. Knowing how to troubleshoot common problems is a vital skill. This involves understanding logs, using built-in debugging tools, and approaching problems systematically.
Common Installation Issues
- PHP Version/Extension Mismatches:
- Symptom: Web installer fails, white screen of death in the browser, errors in web server logs referencing undefined PHP functions or classes.
- Cause: LibreNMS requires a specific minimum PHP version and a set of PHP extensions. If the installed PHP version is too old, or if critical extensions are missing or not enabled, the application will fail to run.
- Troubleshooting:
- Verify PHP Version: Open a terminal on your LibreNMS server and run php -v. Compare this version with the required version stated in the official LibreNMS installation documentation.
- Check Loaded PHP Extensions: Run php -m. This lists all compiled and loaded PHP modules. Cross-reference this list with the required extensions in the LibreNMS documentation (e.g., gd, mysqli or pdo_mysql, snmp, xml, mbstring, tokenizer, json, curl, zip, bcmath, gmp, intl). The legacy mysql extension was removed in PHP 7, so it will not appear on current systems.
- Check Web Server PHP Configuration: Ensure that the PHP version and extensions used by your web server (Nginx with PHP-FPM, or Apache with mod_php or PHP-FPM) are the same as the CLI. Sometimes, different php.ini files are used for CLI and FPM/web.
- For PHP-FPM, check the FPM pool configuration (e.g., /etc/php/YOUR_VERSION/fpm/pool.d/www.conf or a dedicated librenms.conf) and the main php.ini for FPM (e.g., /etc/php/YOUR_VERSION/fpm/php.ini).
- Examine Web Server Error Logs: These are invaluable.
- Nginx: Typically /var/log/nginx/error.log (and possibly a site-specific error log like /var/log/nginx/librenms.error.log).
- Apache: Typically /var/log/apache2/error.log or /var/log/httpd/error_log. These logs will often contain specific PHP errors pointing to the missing function or extension.
- Fix:
- If PHP version is incorrect, upgrade/downgrade PHP using your OS package manager (you might need to use third-party repositories like ppa:ondrej/php for Ubuntu to get specific versions).
- If extensions are missing, install them using your package manager (e.g., sudo apt install phpYOUR_VERSION-snmp phpYOUR_VERSION-gd).
- After installing/enabling extensions or changing PHP versions, restart PHP-FPM (e.g., sudo systemctl restart phpYOUR_VERSION-fpm) and your web server (e.g., sudo systemctl restart nginx).
- Database Connection Failures:
- Symptom: Web installer cannot connect to the database; errors during ./validate.php execution; the LibreNMS UI shows "Database connection error" or similar messages; pollers fail with database errors in librenms.log.
- Cause: Incorrect database credentials in .env, database server not running, firewall blocking the database port (default 3306 for MySQL/MariaDB), incorrect bind-address in the database server configuration preventing connections from the LibreNMS host, or the database user lacks permissions.
- Troubleshooting:
- Verify Database Credentials: Check the /opt/librenms/.env file. Ensure DB_HOST, DB_DATABASE, DB_USERNAME, and DB_PASSWORD are correct.
- Check Database Service Status: Ensure MariaDB/MySQL is running: sudo systemctl status mariadb (or mysql). If not running, try to start it: sudo systemctl start mariadb. Check its logs (journalctl -u mariadb or MySQL error log) if it fails to start.
- Test Manual Database Connection: From the LibreNMS server command line, try to connect to the database using the mysql client with the same credentials (mysql -h DB_HOST -u DB_USERNAME -p DB_DATABASE, substituting the values from .env). If this fails, the issue is likely with credentials, host, or user permissions.
- Check bind-address: In your MariaDB/MySQL configuration file (e.g., /etc/mysql/mariadb.conf.d/50-server.cnf), the bind-address directive controls which network interfaces the database server listens on.
- If it's 127.0.0.1, it only accepts connections from localhost. This is fine if LibreNMS and the database are on the same server.
- If LibreNMS (or distributed pollers) are on different hosts, bind-address must be 0.0.0.0 (listen on all interfaces) or the specific IP address of the interface LibreNMS connects through.
- Firewall Rules: Ensure your firewall (on the DB server if separate, or on the LibreNMS server if it's local) allows connections to the database port (e.g., TCP 3306) from the LibreNMS application server/pollers.
- Database User Permissions: Verify that the librenms database user has the necessary privileges on the librenms database from the correct host(s). Log into MySQL as root:
    SHOW GRANTS FOR 'librenms'@'localhost';
    -- If connecting from another host:
    -- SHOW GRANTS FOR 'librenms'@'your_librenms_app_server_ip';
It should have ALL PRIVILEGES on the librenms database. If not, grant them:
    GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
    FLUSH PRIVILEGES;
- Fix: Correct credentials in .env, start the DB server, adjust bind-address and firewall, or fix DB user grants. Restart relevant services.
- File Permissions Issues:
- Symptom: Web UI shows errors about being unable to write to directories (logs, RRDs, cache); graphs are not updating; images/CSS/JS fail to load; ./validate.php reports permission errors.
- Cause: Incorrect ownership or permissions for LibreNMS directories (/opt/librenms/rrd, /opt/librenms/logs, /opt/librenms/bootstrap/cache/, /opt/librenms/storage/). The librenms user and the web server user (e.g., www-data or nginx) need appropriate access.
- Troubleshooting:
- Run ./validate.php from /opt/librenms/. It often detects and suggests fixes for permission issues.
- Carefully review the ownership and permissions steps in the official LibreNMS installation guide for your OS. Common commands involve chown -R librenms:librenms /opt/librenms and setfacl to grant group write access to specific subdirectories for the web server user (which should be in the librenms group).
    # Example ownership/permissions (consult official docs for exact, up-to-date commands)
    sudo chown -R librenms:librenms /opt/librenms
    sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
    sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
    # Ensure web server user (e.g., www-data) is in the librenms group:
    # sudo usermod -a -G librenms www-data
- Fix: Apply the correct ownership and permissions as per the documentation. You might need to clear cache: sudo -u librenms php artisan cache:clear and sudo -u librenms php artisan config:clear.
- Web Server Configuration Errors (Nginx/Apache):
- Symptom: 403 Forbidden, 404 Not Found, 500 Internal Server Error when accessing LibreNMS UI; PHP code displayed instead of being executed.
- Cause: Incorrect web server virtual host configuration (wrong root directory, incorrect PHP processing setup, alias issues, rewrite rules missing).
- Troubleshooting:
- Compare your Nginx/Apache site configuration file for LibreNMS with the example provided in the official LibreNMS documentation. Pay close attention to:
- root directive (should point to /opt/librenms/html).
- server_name.
- PHP block (location ~ \.php$ for Nginx, or SetHandler/ProxyPassMatch for Apache with PHP-FPM) ensuring it correctly passes PHP files to the PHP-FPM socket or mod_php.
- index directive (should include index.php).
- Any rewrite rules specified by LibreNMS.
- Test web server configuration: sudo nginx -t or sudo apachectl configtest.
- Check web server error logs and access logs for more details.
- Fix: Correct the web server configuration file, then restart the web server.
- Cron Job Not Running or Misconfigured:
- Symptom: Devices not being polled (graphs flatline, "Last Polled" time doesn't update); new devices not discovered; alerts not triggering; daily.sh tasks not running. ./validate.php might warn about pollers or daily.sh not running.
- Cause: The LibreNMS cron entry (/etc/cron.d/librenms) is missing, commented out, has incorrect syntax, or the cron daemon itself is not running. The user specified in the cron job might not have permissions to execute LibreNMS scripts.
- Troubleshooting:
- Verify the cron file exists: cat /etc/cron.d/librenms. It should contain lines similar to:
    # Example, check official docs
    */5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
    33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
    @daily librenms /opt/librenms/daily.sh >> /dev/null 2>&1
    # ... and others
(Note: Recent LibreNMS releases also rely on a scheduler that must be triggered every minute, typically through the librenms-scheduler systemd timer shipped with LibreNMS. See the official installation docs for the current setup.)
- Check cron daemon status: sudo systemctl status cron (crond on CentOS/RHEL).
- Check system logs (e.g., /var/log/syslog or journalctl -u cron) for messages related to cron jobs being run (or failing). Search for "librenms" or the script names.
- Ensure the user in the cron file (usually librenms) can execute the scripts and has the correct environment (e.g., PATH).
- Manually run the scripts as the librenms user to see if they execute without error (e.g., sudo -u librenms /opt/librenms/daily.sh).
- Fix: Create or correct the /etc/cron.d/librenms file using the example from LibreNMS documentation. Ensure cron daemon is running. Fix script permissions if necessary.
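The extension check from the PHP item above lends itself to a small script. The sketch below compares the output of php -m against a list of required extensions; the REQUIRED list is a hypothetical subset and should be replaced with the list from the official installation documentation, and missing_extensions is an illustrative helper name.

```shell
#!/bin/sh
# Hypothetical subset of required extensions; take the real list from the official docs.
REQUIRED="${REQUIRED:-curl gd gmp mbstring pdo_mysql snmp xml zip}"

# Reads `php -m` output on stdin and prints every required extension that is absent.
missing_extensions() {
  loaded="$(tr '[:upper:]' '[:lower:]')"   # normalise case, e.g. "PDO" -> "pdo"
  for ext in $REQUIRED; do
    printf '%s\n' "$loaded" | grep -qx "$ext" || printf '%s\n' "$ext"
  done
}

# Only runs the check when a PHP CLI is actually installed.
if command -v php >/dev/null 2>&1; then
  php -m | missing_extensions
fi
```

An empty result means every listed extension is loaded in the CLI. Since PHP-FPM can use a different php.ini, the same comparison is worth repeating against the module list shown by a phpinfo() page served through the web server.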
Device Polling/Discovery Failures
- Symptom: Device status is "Down (SNMP)" or "Down (Ping)"; no data appears for a newly added device; device information (interfaces, sensors) is incomplete or outdated.
- Cause:
- SNMP misconfiguration on the target device (wrong community string, SNMP agent not running, version mismatch, ACLs blocking LibreNMS IP).
- Firewall blocking SNMP (UDP 161) or ICMP (ping) between LibreNMS and the target device.
- Network connectivity issues.
- LibreNMS poller/discovery modules for that OS/device are disabled or have errors.
- Device added with incorrect SNMP parameters in LibreNMS.
- Troubleshooting:
- Basic Connectivity (Ping): From the LibreNMS server, try to ping the target device (ping -c 4 TARGET_DEVICE_IP). If ping fails, resolve network connectivity or firewall issues first.
- SNMP Walk: This is the most crucial test. From the LibreNMS server command line, attempt an snmpwalk to the target device using the exact same credentials and version configured in LibreNMS for that device:
- For SNMPv2c:
    snmpwalk -v2c -c YOUR_COMMUNITY TARGET_DEVICE_IP system
- For SNMPv3:
    snmpwalk -v3 -l authPriv -u YOUR_USER -a SHA -A 'YOUR_AUTH_PASS' -x AES -X 'YOUR_PRIV_PASS' TARGET_DEVICE_IP system
- If snmpwalk fails (timeout, no response, authentication failure):
- Verify SNMP agent is running on the target device.
- Check SNMP configuration on the target device (community string, v3 user credentials, allowed IPs/ACLs for SNMP).
- Check firewalls on the target device and any intermediate network firewalls.
- LibreNMS Poller Debug: Run the poller for the specific device in debug mode (e.g., sudo -u librenms /opt/librenms/poller.php -h HOSTNAME -d). Look for errors related to SNMP, specific MIBs, or module execution.
- LibreNMS Discovery Debug: Run discovery for the specific device in debug mode (e.g., sudo -u librenms /opt/librenms/discovery.php -h HOSTNAME -d). This will show what information discovery is gathering (or failing to gather).
- Check Device Settings in LibreNMS: Go to the device's Edit page in LibreNMS. Verify SNMP version, community/credentials, port, and selected poller/discovery modules are correct.
- Event Log: Check Logs > Event Log in LibreNMS for messages related to the device. Filter by the device hostname.
- librenms.log: Check /opt/librenms/logs/librenms.log for more detailed error messages from backend processes.
- Fix: Correct SNMP settings on the device or in LibreNMS, adjust firewalls, fix network issues, or address module-specific problems identified in debug output.
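The first two troubleshooting steps (ping, then SNMP) can be wrapped in a small triage helper. This is a sketch assuming Linux ping and the net-snmp tools; PING_CMD, SNMP_CMD, the public community, and the diagnose function are placeholders to adapt, and the numeric OID .1.3.6.1.2.1.1.3.0 is sysUpTime.0.

```shell
#!/bin/sh
# Commands are kept in variables so they can be swapped (e.g. for SNMPv3 options).
PING_CMD="${PING_CMD:-ping -c 2 -W 2}"
# 'public' is a placeholder community; use the one configured in LibreNMS.
SNMP_CMD="${SNMP_CMD:-snmpget -v2c -c public -t 2 -r 1}"

# Returns 1 if ping fails, 2 if SNMP fails, 0 if both answer.
# $1 = hostname or IP of the monitored device
diagnose() {
  host="$1"
  if ! $PING_CMD "$host" >/dev/null 2>&1; then
    echo "ping failed: check routing and ICMP firewall rules"
    return 1
  fi
  if ! $SNMP_CMD "$host" .1.3.6.1.2.1.1.3.0 >/dev/null 2>&1; then
    echo "snmp failed: check agent, credentials, ACLs and UDP 161"
    return 2
  fi
  echo "ping and snmp ok: continue with poller debug output"
}
```

Calling diagnose TARGET_DEVICE_IP narrows the problem to the network layer, the SNMP layer, or LibreNMS itself before any debug output has to be read.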
Graphing Issues (No Data, Broken Graphs)
- Symptom: Graphs show "No Data," are empty, display as broken images, or stop updating.
- Cause:
- Polling for the device/metric is failing (see above).
- RRD files are not being created or updated (permission issues, rrdcached problems).
- Graph definition errors (for custom graphs).
- Incorrect RRD file paths or names.
- Time synchronization issues between the poller and the web server (can affect graph rendering if RRDs seem to be in the future).
- Troubleshooting:
- Verify Polling: First, ensure the device and the specific metrics are being polled successfully (use poller debug). If no data is collected, RRDs won't be updated.
- Check RRD Files:
- Locate the RRD file for the problematic graph. Paths are typically /opt/librenms/rrd/HOSTNAME/METRIC.rrd (e.g., /opt/librenms/rrd/my-server/cpu-system.rrd).
- Check file existence, ownership (librenms:librenms), and last modification time (ls -l). If not updated recently, polling/RRD writing is the issue. With rrdcached, writes are batched, so the timestamp can lag by the configured flush interval.
- Use rrdtool info METRIC.rrd to inspect its structure and last update time.
- Use rrdtool fetch METRIC.rrd AVERAGE -s -10m to see recent data points.
- rrdcached Issues:
- Ensure rrdcached is running: sudo systemctl status rrdcached.
- Check its configuration in /etc/default/rrdcached and LibreNMS config.php ($config['rrdcached']).
- Check rrdcached logs (if configured) or system logs for errors.
- Try restarting rrdcached.
- Permissions: The rrdcached user must be able to write to the /opt/librenms/rrd directory and its journal directory.
- RRDtool Version: In LibreNMS Global Settings > System > General, ensure the selected RRDtool version matches the installed version (rrdtool -v).
- Web Server Logs: Check Nginx/Apache error logs for errors related to graph generation (e.g., rrdtool command failures).
- Permissions for Graph Generation: The web server user (e.g., www-data) needs read access to RRD files to generate graphs. If rrdcached is used for reads too, then PHP needs to connect to rrdcached.
- Time Synchronization: Ensure NTP is configured and working on the LibreNMS server (and pollers if distributed). Significant time drift can cause RRDtool issues.
- Broken Graph Image Icon: If you see a broken image icon, right-click and "Open image in new tab" or "Inspect element" to see the URL that failed. This URL is often a call to graph.php or similar. Try accessing it directly. The output might show an RRDtool error message.
- ./validate.php: Run this to check for common configuration or path issues.
- Fix: Resolve polling issues, fix RRD file permissions, correct rrdcached setup, ensure RRDtool is working, or fix graph definitions.
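Checking modification times file by file does not scale, so the "last modification time" test above can be automated with find. The sketch below lists RRD files that have not been written for a given number of minutes; RRD_DIR, the 15-minute default (three 5-minute polling cycles), and the stale_rrds helper are assumptions. If rrdcached batches writes, the threshold should be longer than its flush interval.

```shell
#!/bin/sh
RRD_DIR="${RRD_DIR:-/opt/librenms/rrd}"

# List .rrd files under directory $1 not modified within the last $2 minutes (default 15).
stale_rrds() {
  find "$1" -type f -name '*.rrd' -mmin +"${2:-15}"
}

# Show a sample of stale files when the default RRD directory exists.
if [ -d "$RRD_DIR" ]; then
  stale_rrds "$RRD_DIR" 15 | head -n 20
fi
```

A handful of stale files is normal, for example for removed ports or sensors. Every file of one device being stale points at polling for that device, and every file on the server being stale points at cron, rrdcached, or permissions.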
Alerting Problems
- Symptom: Alerts are not being triggered when conditions are met; notifications (email, Slack, etc.) are not being sent; too many false positive alerts.
- Cause:
- Alert rule misconfiguration (incorrect conditions, thresholds, device association).
- Polling for the relevant metric is failing.
- Alert transports (email, Slack) are not configured correctly or failing.
- alerts.php (or the alerting component of the scheduler) is not running or has errors.
- Delay settings in alert rules preventing immediate firing.
- Timezone issues affecting scheduled checks or delays.
- Troubleshooting:
- Verify Metric Data: First, confirm that the data point the alert rule is based on is being polled correctly and is visible in graphs with the expected problematic value. If the data isn't there, the alert can't trigger.
- Review Alert Rule Configuration:
- Go to Alerts > Alert Rules, edit the problematic rule.
- Double-check conditions, values, device/group associations.
- Check the "Delay" setting – an alert only fires after the condition has been true for this duration.
- Check if the rule is enabled.
- Check Alert Transports:
- Go to Alerts > Transports.
- Use the "Test Transport" button for the configured transport.
- For email: ensure your server can send mail (check mail logs like /var/log/mail.log or Postfix/Exim logs). Verify SMTP settings in config.php if using an external relay.
- For Slack/Telegram etc.: verify API tokens, webhook URLs, and channel IDs.
- Alert Test Rule: Create a very simple test alert rule that is easy to trigger (e.g., based on a sensor you can manipulate or a device you can temporarily make "down" by stopping its SNMP agent).
- Check Alert Log and Event Log:
- Alerts > Alert Log: Shows triggered alerts and their history.
- Logs > Event Log: Search for events related to the device and alert rule. It might show attempts to send notifications or errors.
- Alerter Process:
- The alerting logic runs as part of the scheduled tasks. Ensure your cron jobs are running correctly.
- Check `/opt/librenms/logs/librenms.log` for any errors related to alerting or the scheduler.
- Test Alert Rule from CLI (Advanced): You can use `./scripts/test-alert.php` to test specific alert rules against devices. This script will show you the data it evaluated and why the rule did or did not trigger.
- False Positives: If getting too many alerts, review rule thresholds, increase delay times, or refine conditions to be more specific. Use "Alert if condition is true for X minutes" to avoid alerts for transient spikes.
- Fix: Correct alert rule logic, fix transport configurations, ensure polling and cron jobs are working.
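When checking the application log for alerting problems, it can help to filter out everything that is not both alert-related and error-like. The following is a minimal sketch, not a LibreNMS tool: the function name `alert_log_errors` is hypothetical, and the keywords it matches are assumptions chosen for illustration, not a documented log format.

```bash
#!/bin/sh
# Sketch: show recent log lines that mention alerting AND look like errors.
# The keyword lists are assumptions for illustration only.

# alert_log_errors LOGFILE [MAX_LINES]
alert_log_errors() {
    logfile=$1
    max_lines=${2:-20}
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # First keep lines mentioning alerts or transports, then keep only
    # those that also contain an error-like keyword; show the newest ones.
    grep -iE 'alert|transport' "$logfile" \
        | grep -iE 'error|fail|exception' \
        | tail -n "$max_lines"
}
```

A typical call would be `alert_log_errors /opt/librenms/logs/librenms.log 50`. An empty result does not prove alerting is healthy; it only means no line matched these particular keywords.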
Performance Bottlenecks
- Symptom: Web UI is slow; pollers take longer than the polling interval (e.g., > 5 minutes); high CPU/memory/IO load on the LibreNMS server.
- Cause: Insufficient server resources (CPU, RAM, disk I/O); unoptimized database; too many devices/pollers for a single instance; inefficient custom scripts or modules.
- Troubleshooting:
- Monitor Server Resources: Use `top`, `htop`, `vmstat`, `iostat`, `iotop` on the LibreNMS server to identify CPU, memory, or I/O bottlenecks.
- Poller Performance:
- Check Health > Poller Performance in the UI.
- Identify which devices or modules are taking the longest to poll.
- If total poller time exceeds the interval (e.g., 300 seconds), you have a problem.
- Disable unused poller modules globally (Global Settings > Polling > Global Modules) or per-device.
- Database Performance:
- Use `mysqltuner.pl` for recommendations.
- Enable and analyze the slow query log.
- Optimize MariaDB/MySQL configuration (`innodb_buffer_pool_size`, etc.) as detailed in the Performance Tuning section.
- Ensure you have enough RAM for the InnoDB buffer pool.
- Use `rrdcached`: Ensure it's running and configured correctly. Slow RRD writes can cripple poller performance.
- Web Server Performance:
- Tune PHP-FPM `pm.max_children` and other pool settings.
- Optimize Nginx/Apache configuration (worker processes, keepalives).
- Check web server and PHP-FPM logs.
- Number of Devices: If monitoring a very large number of devices, consider:
- Distributed Pollers: To scale out polling load.
- More powerful hardware for the central server and database.
- Network Latency: High latency to many monitored devices can slow down pollers.
- LibreNMS Updates: Ensure you are running a recent version of LibreNMS, as performance improvements are often made.
- Custom Code: If you have custom poller modules or scripts, profile them to ensure they are efficient.
- Fix: Allocate more resources, tune database/webserver/PHP-FPM, implement distributed pollers, disable unnecessary polling.
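The rule of thumb above (total poller time must stay below the polling interval) is simple arithmetic. As a small worked sketch, the hypothetical helper `poller_budget` below takes a poll duration in whole seconds, which you would read off the poller performance page yourself, and reports how much of the interval it uses. The 300-second default mirrors the 5-minute example in the text; the helper is an illustration, not part of LibreNMS.

```bash
#!/bin/sh
# Sketch: compare a poller run time against the polling interval.

# poller_budget POLL_SECONDS [INTERVAL_SECONDS]
# Prints how much of the interval a poller run consumes.
# Returns 1 when the run is longer than the interval.
poller_budget() {
    poll_seconds=$1
    interval=${2:-300}   # 300 s = the 5-minute interval used as the example
    percent=$(( poll_seconds * 100 / interval ))   # integer percentage
    if [ "$poll_seconds" -gt "$interval" ]; then
        echo "OVERRUN: ${poll_seconds}s of ${interval}s (${percent}%)"
        return 1
    fi
    echo "OK: ${poll_seconds}s of ${interval}s (${percent}%)"
}
```

For example, `poller_budget 120` prints `OK: 120s of 300s (40%)`, while `poller_budget 360` prints `OVERRUN: 360s of 300s (120%)` and returns a non-zero status, so it can be used in a script condition.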
General Debugging Tools and Logs
- `./validate.php`: This script checks many common configuration issues, file permissions, database schema, and dependencies. Always start here.
- LibreNMS Log File: `/opt/librenms/logs/librenms.log`
- This is the main application log. Contains detailed information from pollers, discovery, alerting, API, etc.
- Increase log level in `.env` (e.g., `APP_LOG_LEVEL=debug`) for more verbose output during troubleshooting (remember to set it back to `info` or `warning` in production).
- Poller and Discovery Debug Output (CLI):
- `./poller.php -h <device> -d -r -f`
- `./discovery.php -h <device> -d -m <module>`
- These provide real-time output of what these scripts are doing.
- Device Event Log (UI): Logs > Event Log (filter by device). Shows device status changes, polling errors, alert triggers for that device.
- Global Event Log (UI): Logs > Event Log. Shows system-wide events.
- Web Server Logs:
- Nginx: `/var/log/nginx/access.log` and `/var/log/nginx/error.log`.
- Apache: `/var/log/apache2/access.log` and `/var/log/apache2/error.log`.
- PHP-FPM Logs:
- Often in `/var/log/phpYOUR_VERSION-fpm.log`. Configured in PHP-FPM pool settings.
- Database Logs:
- MariaDB/MySQL error log (path varies, e.g., `/var/log/mysql/error.log` or `/var/lib/mysql/HOSTNAME.err`).
- Slow query log (if enabled).
- Cron Logs: System log (e.g., `/var/log/syslog` or `journalctl -u cron`) often shows cron job execution.
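Because the relevant logs are spread over several files, a quick first pass is to look at the tail of each one side by side. The sketch below assumes nothing beyond standard `tail`; the function name `tail_logs` is hypothetical, and the paths you pass in depend on your distribution, as noted in the list above.

```bash
#!/bin/sh
# Sketch: print the last few lines of several log files in one go.

# tail_logs LINES FILE...
# Prints the last LINES lines of every readable file, with a header per file.
# Files that are missing or unreadable are reported instead of aborting.
tail_logs() {
    lines=$1
    shift
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "==> $f <=="
            tail -n "$lines" "$f"
        else
            echo "==> $f (missing or unreadable) <=="
        fi
    done
}
```

One possible call, using example paths from this section: `tail_logs 20 /opt/librenms/logs/librenms.log /var/log/nginx/error.log /var/log/mysql/error.log`. Some of these files are only readable by root, so an "unreadable" message may simply mean the command needs `sudo`.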
Workshop Troubleshooting Common Scenarios
Objective: To practice diagnosing and potentially fixing common LibreNMS issues using the tools and techniques discussed. This workshop is more thought-based and investigative.
Prerequisites:
- A running LibreNMS instance.
- Ability to access the LibreNMS server command line.
- Familiarity with viewing logs and running basic commands.
Scenario 1: A Device Stops Graphing
- Symptom: Graphs for "Server-X" have flatlined for the past hour. Last polled time is old.
- Your Troubleshooting Steps (List them):
- Example: Check LibreNMS UI: Device status for Server-X? Any errors on its overview page?
- Example: Check Event Log in UI for Server-X around the time it stopped graphing. Any SNMP errors, timeouts?
- Example: SSH to LibreNMS server. Try `ping Server-X_IP`.
- Example: Try `snmpwalk -v2c -c COMMUNITY Server-X_IP system` (using correct credentials).
- Example: If snmpwalk fails, investigate Server-X: snmpd service running? Firewall? SNMP config correct?
- Example: If snmpwalk works, run poller debug: `./poller.php -h Server-X -d -r -f`. Look for errors.
- Example: Check `/opt/librenms/logs/librenms.log` for errors related to Server-X or its polling.
- Example: Check RRD file for Server-X: `/opt/librenms/rrd/Server-X/some_metric.rrd`. Last updated time? Permissions?
- Example: Check `rrdcached` status and logs if it's in use.
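The first steps of this scenario follow a fixed order (reachability, then SNMP, then poller debug), so they can be sketched as a small script that stops at the first failing check. The function name `diagnose_device` and its messages are hypothetical; it only wraps the `ping` and `snmpwalk` commands already shown in the list and assumes SNMP v2c with a community string.

```bash
#!/bin/sh
# Sketch: run the first Scenario 1 checks in order, stop at the first failure.

# diagnose_device HOST COMMUNITY
diagnose_device() {
    host=$1
    community=$2
    # Step 1: basic reachability with two echo requests.
    if ! ping -c 2 "$host" >/dev/null 2>&1; then
        echo "FAIL ping: $host is unreachable (check network path and firewall)"
        return 1
    fi
    # Step 2: SNMP reachability, same query as in the list above.
    if ! snmpwalk -v2c -c "$community" "$host" system >/dev/null 2>&1; then
        echo "FAIL snmp: no SNMP response (check snmpd, firewall, community)"
        return 2
    fi
    # Both checks passed; the poller debug run is the next manual step.
    echo "OK: ping and SNMP work; next run ./poller.php -h $host -d -r -f"
}
```

Called as `diagnose_device Server-X_IP COMMUNITY`, it narrows the problem to the network, the SNMP agent, or LibreNMS itself. Keep in mind that some networks block ICMP, so a failed ping does not always mean the device is down.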
Scenario 2: Email Alerts Not Being Received
- Symptom: You configured an alert rule for "Device Down" and an email transport. You simulated a device going down, the alert triggered in LibreNMS UI, but no email arrived.
- Your Troubleshooting Steps (List them):
- Example: In LibreNMS UI, go to Alerts > Transports. Click "Test Transport" for the email transport. Does the test succeed or fail?
- Example: Check your email spam/junk folder.
- Example: If test transport fails or no test email: check mail server configuration on LibreNMS host. Is Postfix/Exim running?
- Example: Check mail logs on LibreNMS server (e.g., `/var/log/mail.log` or `journalctl -u postfix`). Any errors related to sending mail to your address?
- Example: If using external SMTP relay in `config.php`, double-check all SMTP settings (host, port, user, password, security TLS/SSL).
- Example: Check `/opt/librenms/logs/librenms.log` for errors when the actual alert tried to send a notification.
- Example: Verify the alert rule is indeed associated with the correct (and tested) email transport.
- Example: Ensure the "Default from email address" and "Default contact email address" in Global Settings > Alerting > General are sensible.
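For the mail log step, the interesting detail is usually the delivery status recorded for your address. The sketch below assumes a Postfix-style log, where delivery lines contain `to=<address>` and `status=sent`, `status=deferred`, or `status=bounced`; other mail servers log differently. The function name `mail_status` is hypothetical.

```bash
#!/bin/sh
# Sketch: extract delivery status values for one recipient from a
# Postfix-style mail log (lines containing to=<addr> ... status=xxx).

# mail_status RECIPIENT LOGFILE
# Prints one status word per matching log line, oldest first.
mail_status() {
    recipient=$1
    logfile=$2
    # -F treats the address as a fixed string, so dots are not wildcards.
    grep -F "to=<${recipient}>" "$logfile" \
        | sed -n 's/.*status=\([a-z]*\).*/\1/p'
}
```

For example, `mail_status you@example.com /var/log/mail.log | sort | uniq -c` gives a quick count per status. A run of `deferred` or `bounced` entries points at the mail path rather than at LibreNMS, while no output at all suggests the message never reached the local mail server.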
Scenario 3: Web UI is Very Slow
- Symptom: Loading dashboards, device pages, or graphs in the LibreNMS web UI takes a very long time.
- Your Troubleshooting Steps (List them):
- Example: On LibreNMS server, run `top` or `htop`. Is CPU maxed out? Is memory full (swapping heavily)? High disk I/O wait (`%wa`)?
- Example: Identify processes consuming most resources. Is it `mysqld`, `php-fpm`, `nginx/apache`, or `rrdcached`?
- Example: If `mysqld` is high: Check MariaDB/MySQL error log. Enable slow query log. Run `mysqltuner.pl`.
- Example: If `php-fpm` is high: Check PHP-FPM error log. Are there enough child processes (`pm.max_children`)? Is a specific PHP script consuming resources?
- Example: Check web server logs for errors or long response times in access logs.
- Example: Check poller performance (Health > Poller Performance). Is the poller run taking too long, potentially impacting DB or server load?
- Example: How many devices are being monitored? Is the server undersized for the load?
- Example: Run `./validate.php`. Any warnings or errors?
- Example: Check browser developer tools (Network tab) to see which requests are slow. Are they API calls, graph images, static assets?
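Services such as `php-fpm` run many worker processes, so a per-process view in `top` can understate their combined load. As a rough sketch, the hypothetical function `cpu_by_command` below sums CPU usage per command name from `ps` output. It assumes command names without spaces and is only an illustration of the "which service is busy" step above.

```bash
#!/bin/sh
# Sketch: total %CPU per command name, highest first.

# cpu_by_command
# Reads "COMMAND %CPU" pairs on stdin, one per line, and prints the summed
# %CPU for each command. Assumes command names contain no spaces.
cpu_by_command() {
    LC_ALL=C awk '{ total[$1] += $NF }
        END { for (name in total) printf "%s %.1f\n", name, total[name] }' \
        | LC_ALL=C sort -k2,2nr
}
```

A possible invocation is `ps -e -o comm= -o pcpu= | cpu_by_command | head -n 5`. Note that `ps` reports CPU averaged over each process's lifetime, so this complements `top` rather than replacing it.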
Deliverables/Reflection:
- For each scenario, a plausible list of troubleshooting steps in logical order.
- Increased understanding of where to look for clues when LibreNMS misbehaves.
Troubleshooting is often a process of elimination. By systematically checking logs, configurations, and using debugging tools, you can usually pinpoint the root cause of most LibreNMS issues. The LibreNMS community forums and documentation are also excellent resources when you get stuck.
Conclusion
Throughout this comprehensive guide, we've journeyed from the foundational concepts of LibreNMS, through basic and intermediate setup and management, to advanced customization, optimization, and troubleshooting. You've learned how to install LibreNMS, add devices using SNMP, configure alerts, navigate its interface, and extend its capabilities. We've also delved into crucial aspects like performance tuning with distributed pollers and database optimization, securing your installation, and systematically resolving common issues.
LibreNMS is a powerful and flexible open-source network monitoring system. By self-hosting it, you gain complete control over your monitoring data and environment, offering invaluable insights into the health and performance of your network and servers. The skills you've developed here – covering Linux server administration, network protocols, database management, and specific LibreNMS configurations – are highly transferable and valuable in any IT environment.
Monitoring is not a "set it and forget it" task. It requires ongoing attention, refinement of alert rules, adaptation to new devices and services, and regular maintenance of the LibreNMS platform itself. The workshops provided practical, hands-on experience, which is key to solidifying your understanding.
As you continue to use and explore LibreNMS, remember that it has a vibrant community. Don't hesitate to consult the official documentation, participate in forums, and explore the wealth of shared knowledge. The world of network monitoring is ever-evolving, and LibreNMS, with its active development, is well-positioned to adapt and grow.
You are now well-equipped to deploy, manage, and leverage LibreNMS effectively. Use this knowledge to build robust monitoring solutions, proactively identify and resolve IT issues, and contribute to the stability and efficiency of the networks and systems you manage.
Further Learning Resources
To continue your journey and deepen your expertise with LibreNMS, here are some valuable resources:
- Official LibreNMS Documentation:
- URL: https://docs.librenms.org/
- This is the primary source of truth for installation, configuration, and feature explanations. It's regularly updated.
- LibreNMS Community Forum:
- URL: https://community.librenms.org/
- An excellent place to ask questions, share solutions, and learn from other users' experiences. You can find discussions on specific devices, custom configurations, and troubleshooting.
- LibreNMS GitHub Repository:
- URL: https://github.com/librenms/librenms
- Explore the source code, report bugs, submit feature requests, or even contribute to the project. The "Issues" and "Pull Requests" sections are insightful.
- LibreNMS Discord/IRC:
- Check the community page for links to real-time chat channels (Discord is commonly used). Useful for quick questions and discussions.
- Understanding SNMP:
- Search for "SNMP tutorial" or "Understanding MIBs and OIDs." A deeper knowledge of SNMP will greatly enhance your ability to customize LibreNMS.
- RFCs for SNMP (e.g., RFC 3411-3418 for SNMPv3) provide the authoritative specifications.
- RRDtool Documentation:
- URL: https://oss.oetiker.ch/rrdtool/doc/index.en.html
- Understanding how RRDtool works is key to interpreting graphs and managing data storage.
- MySQL/MariaDB Tuning Guides:
- Official MariaDB and MySQL documentation.
- Blogs from Percona and other database experts often have excellent articles on performance tuning.
- Web Server Documentation (Nginx/Apache):
- Official Nginx and Apache documentation for web server optimization and PHP integration.
By continuously learning and experimenting, you can become a LibreNMS power user and a valuable asset in managing complex IT infrastructures.