Author Nejat Hakan
eMail nejat.hakan@outlook.de
PayPal Me https://paypal.me/nejathakan


Network/Server Monitoring LibreNMS

LibreNMS is a powerful, open-source, and feature-rich network monitoring system that automatically discovers, polls, and graphs data from a wide range of network hardware and operating systems. This guide provides a comprehensive walkthrough from basic setup to advanced customization, designed for university students and aspiring IT professionals who want to master self-hosted network monitoring. Each section includes theoretical knowledge followed by practical, hands-on workshops.

Introduction to LibreNMS

Welcome to the world of LibreNMS! Before we dive into the technical intricacies of installation and configuration, it's essential to understand what LibreNMS is, what it can do for you, and why it has become a popular choice for network monitoring in organizations of all sizes. This introductory section will lay the groundwork for your journey with LibreNMS.

What is LibreNMS?

LibreNMS is a fully featured, open-source network monitoring system (NMS) that provides a wealth of information about your network infrastructure and servers. It is a community-driven fork of Observium, and its development is active and robust. At its core, LibreNMS uses the Simple Network Management Protocol (SNMP) to poll devices, but it also supports other methods for data collection. It can monitor a vast array of device types, including routers, switches, firewalls, servers (Linux, Windows, etc.), printers, and even specialized hardware like UPS systems and environmental sensors.

The primary goal of LibreNMS is to provide a centralized platform for IT administrators and network engineers to gain visibility into the health, performance, and availability of their IT assets. It achieves this by collecting metrics, storing them in a time-series database (typically RRDtool), and presenting them through a user-friendly web interface with graphs, dashboards, and alerting capabilities.

Key characteristics of LibreNMS include:

  • Auto-discovery: LibreNMS can automatically discover devices on your network based on various protocols like SNMP, CDP (Cisco Discovery Protocol), LLDP (Link Layer Discovery Protocol), OSPF, and BGP.
  • Extensibility: It supports a wide range of operating systems and network hardware out-of-the-box and can be extended to support new devices and applications through custom MIBs (Management Information Bases) and scripts.
  • Alerting System: A flexible alerting system allows you to define rules and receive notifications through various channels (email, Slack, Telegram, etc.) when specific conditions are met (e.g., a device is down, CPU usage is high).
  • Distributed Polling: For larger networks, LibreNMS supports distributed pollers to scale monitoring capabilities across multiple locations or network segments.
  • API Access: A comprehensive API allows for integration with other systems and automation of tasks.

Key Features and Benefits

LibreNMS offers an impressive suite of features that make it a compelling choice for network monitoring:

  • Automatic Discovery: As mentioned, it can automatically discover your network devices, reducing manual configuration effort.
  • Wide Device Support: Supports a vast range of network hardware and operating systems from numerous vendors. This includes common metrics like CPU, memory, storage, network interface traffic, temperature, and voltage.
  • Customizable Dashboard: Allows users to create personalized dashboards displaying the most relevant information at a glance.
  • Flexible Alerting: Highly configurable alerting system with support for various notification methods and customizable alert templates. You can set up alerts based on thresholds, device status, specific OIDs, and more.
  • Graphing: Generates detailed historical graphs for collected metrics, enabling trend analysis and capacity planning. It primarily uses RRDtool for this purpose.
  • Billing System: Includes a traffic accounting system that can be used to monitor and bill bandwidth usage for specific ports or customers.
  • Integration: Integrates with other tools and services like Oxidized (for network device configuration backup), Smokeping (for latency monitoring), and various authentication backends (LDAP, RADIUS, Active Directory).
  • Open Source and Community Driven: Being open source means it's free to use, modify, and distribute. The active community provides support, contributes new features, and ensures the software stays up-to-date.
  • User-Friendly Web Interface: Provides an intuitive web UI for managing devices, viewing data, and configuring the system.

Benefits of using LibreNMS:

  • Improved Network Visibility: Gain a clear, near-real-time picture of what's happening on your network (data is as fresh as the last polling cycle).
  • Proactive Problem Detection: Identify and address issues before they impact users or services.
  • Reduced Downtime: Faster detection and diagnosis of problems lead to quicker resolution and minimized downtime.
  • Capacity Planning: Historical data and trend analysis help in planning for future growth and resource allocation.
  • Cost-Effective: Being open source, it eliminates licensing fees associated with commercial NMS solutions.
  • Customization: Tailor the monitoring setup to your specific needs and environment.

Architecture Overview

Understanding the basic architecture of LibreNMS helps in troubleshooting and extending its capabilities. A typical LibreNMS setup consists of several key components:

  1. LibreNMS Core Application: This is the PHP-based application that provides the web interface, device discovery logic, polling engine, and alerting system.
  2. Web Server: A web server like Nginx or Apache is required to serve the LibreNMS web interface. It processes HTTP requests and passes PHP requests to the PHP interpreter.
  3. PHP Interpreter: LibreNMS is written in PHP, so a PHP interpreter (PHP-FPM or mod_php) is essential to execute the application code.
  4. Database Server: LibreNMS uses a MySQL or MariaDB database to store device information, configuration settings, event logs, and some performance data (though most time-series data is in RRD files).
  5. SNMP Tools: Utilities like snmpget, snmpwalk (part of net-snmp or snmp) are used by the poller to query SNMP-enabled devices.
  6. RRDtool: This is a crucial component used to store and display time-series data (e.g., CPU usage, network traffic) in Round Robin Database (RRD) files. RRDtool is responsible for generating the graphs you see in LibreNMS.
  7. Poller (poller.php): A PHP script, typically run via cron, that queries devices for data at regular intervals (usually every 5 minutes).
  8. Discovery (discovery.php): Another PHP script, run via cron, that automatically discovers new devices and updates information about existing devices (e.g., new interfaces, changed IP addresses).
  9. Scheduler (cronic and laravel-scheduler): LibreNMS uses a scheduler to manage the execution of various background tasks, including polling, discovery, alerting, and housekeeping.

Data Flow:

  • The discovery process finds devices and their capabilities.
  • The poller queries these devices for performance metrics via SNMP (or other methods).
  • Collected data is stored:
    • Device metadata and some states go into the SQL database.
    • Time-series performance data is stored in RRD files.
  • The web server and PHP application present this data to the user through the web interface.
  • The alerting system checks defined rules against the collected data and sends notifications if necessary.

In larger environments, Distributed Pollers can be set up. These are separate instances of the poller running on different machines, which then report data back to the central LibreNMS server. This helps distribute the polling load and allows monitoring of devices in remote or isolated network segments.

Why Self-Host LibreNMS?

While SaaS (Software as a Service) monitoring solutions exist, self-hosting LibreNMS offers several distinct advantages, particularly for those who want greater control, customization, and often, cost savings in the long run:

  1. Complete Control: You have full control over the server, data, and configuration. This means you can customize every aspect of the monitoring system to fit your exact needs, integrate it deeply with your existing infrastructure, and manage security according to your own policies.
  2. Data Privacy and Security: Your monitoring data, which can be sensitive, stays within your own infrastructure. This is crucial for organizations with strict data sovereignty or compliance requirements.
  3. No Vendor Lock-in: As an open-source solution, you are not tied to a specific vendor's roadmap, pricing changes, or terms of service. You have the freedom to modify the source code if needed.
  4. Cost-Effectiveness: While there's an initial setup effort and ongoing maintenance, self-hosting can be significantly cheaper than commercial SaaS solutions, especially as the number of monitored devices grows. There are no per-device or per-sensor licensing fees.
  5. Learning Opportunity: Self-hosting provides an invaluable learning experience in server administration, network management, database management, and the intricacies of monitoring protocols. This is particularly beneficial for students and IT professionals looking to deepen their skills.
  6. Customization and Extensibility: LibreNMS is highly extensible. You can write your own pollers, discovery modules, alert templates, and integrate with other internal tools. This level of customization is often not possible with SaaS offerings.
  7. Offline Access: If your monitoring needs are for an internal network without reliable internet access or if you prefer an air-gapped monitoring solution, self-hosting is the way to go.

However, self-hosting also comes with responsibilities:

  • Setup and Configuration: You are responsible for installing and configuring all components.
  • Maintenance: This includes OS updates, LibreNMS updates, database backups, and general system upkeep.
  • Troubleshooting: You'll need to diagnose and fix any issues that arise.

For many, the benefits of control, customization, and cost savings outweigh these responsibilities, making self-hosted LibreNMS an attractive option.

Workshop Introduction to LibreNMS

Objective: To familiarize yourself with the LibreNMS project, its documentation, and community resources. This workshop does not involve any installation yet but focuses on exploration.

Prerequisites:

  • A computer with internet access.
  • A web browser.

Tasks:

  1. Explore the Official LibreNMS Website:

    • Navigate to the official LibreNMS website: https://www.librenms.org/
    • Read the "About" section to understand the project's philosophy.
    • Browse the "Features" section. Identify three features that you find most compelling and think about how they could be useful in a real-world scenario (e.g., monitoring a university campus network or a small business IT infrastructure).
    • Look at the "Screenshots" or "Demo" section to get a visual feel for the interface.
  2. Dive into the Documentation:

    • Find the official LibreNMS documentation: https://docs.librenms.org/
    • Locate the "Installation" section. Briefly look at the different installation methods available (e.g., Ubuntu, CentOS, Docker). Don't worry about understanding all the details yet.
    • Find the "Supported Devices" page. Are there any devices or operating systems listed that you are familiar with or have access to? This will be helpful for later workshops.
    • Skim through the "Alerting" section to get an idea of how alerts are configured.
  3. Check out Community Resources:

    • LibreNMS has a strong community. Find links to their community forum (e.g., on community.librenms.org) or Discord/IRC channels.
    • Visit the forum and browse a few recent topics. Notice the types of questions being asked and the support provided by the community. This will be a valuable resource if you encounter issues later.
    • Explore the LibreNMS GitHub repository: https://github.com/librenms/librenms
      • Look at the "Issues" tab. This can give you an idea of ongoing development, bug reports, and feature requests.
      • Check the "Pull Requests" tab to see contributions being made to the project.
      • Note the activity level (e.g., last commit date, number of contributors) to gauge the health of the project.
  4. Consider a Use Case:

    • Think about a hypothetical (or real) environment you might want to monitor. This could be:
      • Your home network (router, PCs, a Raspberry Pi).
      • A small lab environment with a few virtual machines.
      • A department's IT resources at a university.
    • List 3-5 key things you would want to monitor in this environment (e.g., server uptime, router bandwidth, disk space on a file server). How might LibreNMS help you achieve this?

Deliverables/Reflection (for your own notes):

  • A list of three compelling LibreNMS features and their potential applications.
  • Notes on the general structure of the LibreNMS documentation and where to find key information.
  • An impression of the LibreNMS community and its activity.
  • A brief description of a potential use case for LibreNMS relevant to you.

This exploratory workshop will give you a solid conceptual foundation before you start with the hands-on installation and configuration in the subsequent sections. Understanding the "what" and "why" will make the "how" much more meaningful.

Basic LibreNMS Setup and Configuration

This section guides you through the essential first steps of getting a LibreNMS instance up and running. We will cover system requirements, different installation approaches, the initial web-based configuration, and finally, adding your very first device to monitor. By the end of this section, you will have a functional LibreNMS server polling data from at least one host.

1. System Requirements and Prerequisites

Before you begin installing LibreNMS, it's crucial to ensure your environment meets the necessary hardware and software requirements. Proper planning at this stage can save you a lot of trouble later on.

Hardware Recommendations

LibreNMS can run on a variety of hardware, from a Raspberry Pi (for very small setups) to powerful dedicated servers. The required resources depend heavily on the number of devices and sensors you plan to monitor.

  • CPU:
    • Small (1-50 devices): 1-2 CPU cores (e.g., modern Raspberry Pi 4, small VM).
    • Medium (50-200 devices): 2-4 CPU cores. A reasonably modern dual-core or quad-core processor.
    • Large (200+ devices): 4+ CPU cores. For poller performance, a few fast cores generally serve better than many slow ones. For very large deployments (1000+ devices), consider 8+ cores and investigate distributed pollers.
  • Memory (RAM):
    • Small: At least 2GB RAM is recommended. While it might run on less, performance will suffer, especially with the web UI and database.
    • Medium: 4GB - 8GB RAM.
    • Large: 8GB+ RAM. More RAM helps with database caching and PHP processes.
  • Storage:
    • Type: SSDs (Solid State Drives) are highly recommended for the operating system, LibreNMS application, database, and especially for RRD files. RRD files involve frequent small writes, and SSDs significantly improve I/O performance, impacting graph generation and poller speed.
    • Capacity: This depends on the number of devices, ports, sensors, and the data retention period for RRD files.
      • OS and LibreNMS application: 20-30GB is usually sufficient.
      • RRD files: This is the largest consumer. A rough estimate:
        • A device with many ports and sensors can generate 50-200MB of RRD data per year.
        • For 100 devices, averaging 100MB each, you'd need 10GB per year of RRD storage.
        • Start with at least 50-100GB for RRDs and monitor disk usage. It's easier to expand storage later than to run out.
      • Database: The SQL database stores metadata, not the bulk time-series data. It typically grows much slower than RRD storage. 10-20GB is often ample for a long time for medium setups.
  • Network: A stable network connection with sufficient bandwidth is required, especially if monitoring many devices or remote devices. A 1 Gbps network interface is standard.
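The RRD sizing arithmetic above can be turned into a quick back-of-the-envelope calculation. The following is only a minimal sketch: the function name estimate_rrd_mb and its parameters are hypothetical, and the per-device figure is the rough estimate from the list, not a measured value.

```shell
#!/usr/bin/env bash
# Rough RRD storage estimate in MB: devices x MB per device per year x years.
# estimate_rrd_mb is a hypothetical helper, not part of LibreNMS.
estimate_rrd_mb() {
    local devices=$1 mb_per_device_year=$2 years=$3
    echo $(( devices * mb_per_device_year * years ))
}

# 100 devices at roughly 100 MB each, kept for one year
total_mb=$(estimate_rrd_mb 100 100 1)
echo "Estimated RRD storage: ${total_mb} MB (about $(( total_mb / 1000 )) GB)"
```

Treat the result as a lower bound for planning and keep watching real disk usage once polling has started.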

Virtualization:
LibreNMS runs very well in a virtual machine (VM). This offers benefits like easy snapshots, resource scalability, and hardware independence. Ensure your hypervisor provides good I/O performance to the VM, especially for storage.

Software Dependencies

LibreNMS has several software dependencies. The exact versions might change, so always consult the official LibreNMS documentation for the most current requirements.

  • Operating System:
    • Primarily developed and tested on Linux. Recommended distributions:
      • Ubuntu Server (LTS releases, e.g., 20.04, 22.04): Often preferred due to good community support and up-to-date packages.
      • Debian (Stable releases): Similar to Ubuntu, very stable.
      • CentOS Stream / RHEL / Rocky Linux / AlmaLinux: Also well-supported.
    • It's crucial to use a 64-bit OS.
    • A minimal server installation is generally preferred to avoid unnecessary software and potential conflicts.
  • Web Server:
    • Nginx: Highly recommended for performance and efficiency.
    • Apache: Also supported.
  • PHP:
    • Specific PHP versions are required (e.g., PHP 7.4, 8.1, 8.2 – check docs for current).
    • Numerous PHP extensions are needed: gd, mysql, snmp, xml, mbstring, tokenizer, json, zip, curl, bcmath, intl, ldap (optional), opcache (recommended for performance).
  • Database Server:
    • MariaDB (version 10.5+ recommended) or MySQL (version 5.7/8.0+ recommended). MariaDB is often favored in the LibreNMS community. Ensure innodb_file_per_table is enabled.
  • SNMP Utilities:
    • net-snmp package (provides snmpget, snmpwalk, etc.) is essential for SNMP polling.
    • snmpd (SNMP daemon) if you want to monitor the LibreNMS server itself via SNMP.
  • Other Tools:
    • git: For downloading and updating LibreNMS.
    • composer: PHP dependency manager, used to install PHP libraries.
    • rrdtool: For storing and graphing time-series data.
    • fping or ping: For ICMP reachability checks. fping is generally preferred for its ability to ping multiple hosts efficiently.
    • python3-memcached (and memcached server): Optional but highly recommended for caching, which improves performance.
    • cron: For scheduling regular tasks like polling and discovery.
    • whois: For some discovery features.
    • mtr-tiny: For traceroute functionality.
    • imagemagick: For some image manipulations.
    • Nmap: For host discovery and port scanning.

It's critical to install the correct versions of these dependencies, especially PHP and its extensions, as mismatches can lead to installation failures or runtime errors. The official LibreNMS installation guides usually provide precise commands to install all necessary dependencies for supported operating systems.
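Once the packages are installed, a short check like the one below can confirm that the expected command-line tools are actually on the PATH. This is a sketch under assumptions: missing_tools is a hypothetical helper, and the command names are the ones these packages usually provide on Ubuntu (for example, mtr-tiny provides mtr).

```shell
#!/usr/bin/env bash
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Command names assumed to be provided by the packages listed above.
missing=$(missing_tools git composer rrdtool fping snmpget snmpwalk mtr nmap whois)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined onto one line.
    echo "Missing tools:" $missing
else
    echo "All listed tools were found."
fi
```

The check only proves that a command exists, not that its version is suitable, so the version requirements in the official documentation still apply.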

Network Considerations

  • Connectivity: The LibreNMS server must have network connectivity to all devices it intends to monitor. This means:
    • IP reachability (can it ping the devices?).
    • SNMP traffic (UDP port 161 by default) must be allowed from the LibreNMS server to the monitored devices. Firewalls on the devices themselves, or network firewalls between LibreNMS and the devices, must be configured accordingly.
  • DNS: Reliable DNS resolution is important. The LibreNMS server should be able to resolve hostnames of monitored devices if you plan to add them by hostname. Conversely, devices might need to resolve the LibreNMS server's hostname for certain features (like syslog forwarding).
  • NTP (Network Time Protocol): Ensure the LibreNMS server and all monitored devices have their time synchronized using NTP. Time discrepancies can cause issues with data correlation, graph display, and log analysis.
  • IP Addressing: The LibreNMS server itself should have a static IP address or a DHCP reservation.
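The connectivity points above can be checked from the LibreNMS server before a device is added. The sketch below is illustrative only: preflight and run are hypothetical helpers, the host and community string are placeholders, and by default the script just prints the commands (DRY_RUN=1) so you can review them first. The OID 1.3.6.1.2.1.1.3.0 is sysUpTime.0.

```shell
#!/usr/bin/env bash
# Preflight reachability checks for one device before adding it to LibreNMS.
# With DRY_RUN=1 (the default here) the commands are only printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# preflight is a hypothetical helper: DNS lookup, ICMP ping, then one SNMP GET.
preflight() {
    local host=$1 community=$2
    run getent hosts "$host"                                    # DNS resolution
    run ping -c 3 "$host"                                       # ICMP reachability
    run snmpget -v2c -c "$community" "$host" 1.3.6.1.2.1.1.3.0  # sysUpTime.0 over UDP 161
}

# 192.0.2.10 and "public" are placeholder values.
preflight 192.0.2.10 public
```

Running it with DRY_RUN=0 executes the three checks in order; a failing snmpget with a working ping usually points at a firewall rule or a wrong community string.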

Workshop System Preparation

Objective:
To prepare a virtual machine or physical server environment that meets the basic software prerequisites for a LibreNMS installation on Ubuntu Server 22.04 LTS. This workshop focuses on setting up the OS and initial packages, not LibreNMS itself.

Prerequisites:

  • Ability to create a new Virtual Machine (using VirtualBox, VMware, Hyper-V, KVM, etc.) or access to a spare physical machine.
  • An Ubuntu Server 22.04 LTS 64-bit ISO image.
  • Internet access for the server.

Tasks:

  1. Install Ubuntu Server 22.04 LTS:

    • Create a new VM with the following minimum specifications (you can increase these if you have more resources):
      • CPU: 2 cores
      • RAM: 4 GB
      • Storage: 50 GB (SSD if possible)
      • Network: Bridged or NAT with port forwarding (ensure it can access the internet and you can access it from your host machine).
    • Boot from the Ubuntu Server 22.04 LTS ISO.
    • Follow the installation prompts:
      • Choose your language.
      • Select keyboard layout.
      • Choose "Ubuntu Server" (not "minimized").
      • Configure networking (DHCP is fine for now, but note the IP address. For a real server, a static IP is recommended).
      • Configure proxy if needed.
      • Use default mirror.
      • For partitioning, "Use an entire disk" and "Set up this disk as an LVM group" is a good default for flexibility. Accept the defaults or customize if you are familiar with partitioning.
      • Confirm destructive action.
      • Set up your profile: Your name, server's name (e.g., librenms-server), username, and a strong password.
      • Important: Choose to "Install OpenSSH server" for remote access.
      • Do not install any of the "Featured Server Snaps" at this stage (like Docker, Nextcloud, etc.). We will install components manually.
    • Wait for the installation to complete, then reboot and remove the installation media.
  2. Initial Server Configuration:

    • Log in to your new Ubuntu server, either directly or via SSH (using the IP address obtained during installation and the username/password you set).
      ssh your_username@server_ip_address
      
    • Update the package list and upgrade existing packages:
      sudo apt update
      sudo apt upgrade -y
      
    • Set the system timezone (replace Your/Timezone with your actual timezone, e.g., Europe/Berlin or America/New_York). You can list timezones with timedatectl list-timezones.
      sudo timedatectl set-timezone Your/Timezone
      timedatectl # Verify the change
      
    • Install basic utilities that are often useful:
      sudo apt install -y curl wget vim git software-properties-common apt-transport-https ca-certificates
      
      • curl, wget: For downloading files.
      • vim: A text editor (or use nano if you prefer).
      • git: For version control, needed for LibreNMS.
      • software-properties-common, apt-transport-https, ca-certificates: For managing software repositories, especially PPAs or third-party repos.
  3. Consider User Management (Optional but Recommended):

    • The user you created during installation has sudo privileges. For LibreNMS, you'll typically run commands as this user or create a dedicated librenms user later. For now, using your sudo-enabled user is fine.

Verification:

  • You should be able to log in via SSH.
  • The system should be up-to-date (sudo apt update && apt list --upgradable should show few or no pending upgrades).
  • The timezone should be correctly set.
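The "up-to-date" check can be reduced to a single number. The sketch below counts pending upgrades by parsing the output of apt list --upgradable; count_upgradable is a hypothetical helper, and the match on the text "upgradable from" assumes an English locale.

```shell
#!/usr/bin/env bash
# Count pending upgrades from the output of `apt list --upgradable` (read on stdin).
# count_upgradable is a hypothetical helper name; assumes English apt output.
count_upgradable() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c 'upgradable from' || true
}

# Usage on the server; apt's "unstable CLI" warning goes to stderr and is discarded.
if command -v apt >/dev/null 2>&1; then
    pending=$(apt list --upgradable 2>/dev/null | count_upgradable)
    echo "Packages with pending upgrades: ${pending}"
fi
```

A result of 0 right after sudo apt upgrade is what you would expect on a freshly prepared server.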

This prepared server is now a clean slate, ready for the specific dependencies LibreNMS requires, which we will cover in the installation workshop. Having a standardized, updated base OS is crucial for a smooth installation process.

2. Installation Methods

LibreNMS offers a few ways to get it installed on your prepared server. The most common and well-documented method is a manual installation, which gives you the most control and understanding of the components. Docker is another popular option for those comfortable with containerization.

Manual Installation Steps (e.g., on Ubuntu/Debian or CentOS/RHEL)

Manual installation involves installing each component (web server, PHP, database, LibreNMS code) step-by-step. While this might seem more involved, it provides a deeper understanding of how LibreNMS works and integrates with the system. The official LibreNMS documentation provides detailed, distribution-specific guides. Here's a conceptual overview of the typical steps involved for an Ubuntu/Debian based system. Always refer to the official documentation for the latest and most precise commands.

General Phases of Manual Installation:

  1. Install Dependencies:

    • Web Server (Nginx or Apache): Install the chosen web server. Nginx is generally recommended.
      # Example for Nginx on Ubuntu
      sudo apt install nginx
      
    • Database Server (MariaDB or MySQL): Install and secure the database server.
      # Example for MariaDB on Ubuntu
      sudo apt install mariadb-server mariadb-client
      sudo mysql_secure_installation # Secure the installation
      
    • PHP and Extensions: Install the required PHP version (e.g., php8.1-fpm) and all necessary PHP extensions listed in the LibreNMS requirements (php-cli, php-mysql, php-snmp, php-xml, php-gd, php-mbstring, php-curl, php-zip, etc.).
      # Example for PHP 8.1 and extensions on Ubuntu
      sudo add-apt-repository ppa:ondrej/php # Optional third-party PPA for newer PHP versions; Ubuntu 22.04 already ships PHP 8.1
      sudo apt update
      sudo apt install php8.1-fpm php8.1-cli php8.1-mysql php8.1-snmp php8.1-xml php8.1-gd php8.1-mbstring php8.1-curl php8.1-zip php8.1-bcmath php8.1-intl php8.1-gmp # and others as per docs
      
    • Other Tools: Install snmp, rrdtool, fping, git, composer, imagemagick, mtr-tiny, nmap, etc.
      sudo apt install snmp rrdtool fping git composer imagemagick mtr-tiny nmap memcached python3-memcached whois
      sudo systemctl enable --now memcached # If using memcached (the memcached server package is installed above)
      
  2. Create LibreNMS User and Database:

    • Create a dedicated system user for LibreNMS (e.g., librenms).
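      # -M: do not create /opt/librenms yet; git clone in step 3 needs the target directory to be absent or empty
      sudo useradd librenms -d /opt/librenms -M -r -s /bin/bash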
      
    • Log into MariaDB/MySQL and create a database and user for LibreNMS. Grant appropriate privileges.
      -- Example SQL commands
      CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
      CREATE USER 'librenms'@'localhost' IDENTIFIED BY 'your_strong_password';
      GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
      FLUSH PRIVILEGES;
      EXIT;
      
  3. Download LibreNMS Code:

    • Clone the LibreNMS repository from GitHub into a directory like /opt/librenms.
      sudo git clone https://github.com/librenms/librenms.git /opt/librenms
      
  4. Set Permissions and Ownership:

    • Set the correct ownership and permissions for the LibreNMS directories and files, as specified in the documentation. This typically involves giving the librenms user and the web server user (e.g., www-data) appropriate access.
      # Example, consult official docs for exact commands
      sudo chown -R librenms:librenms /opt/librenms
      sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
      sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
      # Specific permissions for web server user might also be needed
      
  5. Install PHP Dependencies:

    • Run composer install within the LibreNMS directory to download required PHP libraries. This needs to be done as the librenms user or with appropriate permissions.
      cd /opt/librenms
      sudo -u librenms ./scripts/composer_wrapper.php install --no-dev # Run as the librenms user, not as root
      
  6. Configure Web Server:

    • Configure Nginx or Apache to serve the LibreNMS web interface. This involves creating a virtual host configuration file that points to the LibreNMS public directory (/opt/librenms/html) and configures PHP processing.
    • The LibreNMS documentation provides sample configuration files for Nginx and Apache.
    • Enable the new site configuration and restart the web server.
      # Example: copy provided nginx config, enable it, test, and reload nginx
      sudo cp /opt/librenms/misc/librenms.nonroot.nginx /etc/nginx/sites-available/librenms
      # Edit /etc/nginx/sites-available/librenms to set server_name and PHP socket path
      sudo ln -s /etc/nginx/sites-available/librenms /etc/nginx/sites-enabled/
      sudo rm /etc/nginx/sites-enabled/default # Remove default site if it conflicts
      sudo nginx -t # Test configuration
      sudo systemctl restart nginx
      
  7. Configure PHP-FPM (if using Nginx):

    • Ensure PHP-FPM is configured correctly, particularly the user it runs as and the listen socket path, which must match the Nginx configuration. Adjust settings in /etc/php/8.1/fpm/pool.d/www.conf (or a dedicated LibreNMS pool).
    • Common changes include setting user and group to librenms, and configuring the listen directive.
  8. Configure snmpd (Optional, for monitoring LibreNMS server itself):

    • If you want LibreNMS to monitor the server it's running on, configure the local SNMP daemon (snmpd). Copy the provided snmpd.conf from LibreNMS, customize it (especially the community string), and restart snmpd.
  9. Set up Cron Jobs:

    • LibreNMS relies on cron jobs for polling, discovery, and other background tasks. Copy the cron file shipped with LibreNMS (e.g., /opt/librenms/dist/librenms.cron in current releases; the location has changed between versions, so check your checkout) to /etc/cron.d/librenms.
      sudo cp /opt/librenms/dist/librenms.cron /etc/cron.d/librenms
      
    • This cron job typically runs every minute and uses LibreNMS's internal scheduler to manage tasks.
  10. Perform Web-Based Installation:

    • Access the LibreNMS web interface in your browser (e.g., http://your_server_ip_or_hostname). This should redirect you to the installer (install.php).
    • Follow the on-screen instructions, which will check prerequisites, configure database settings (you'll enter the database name, user, and password created earlier), and create an admin user.
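Step 7 notes that the PHP-FPM listen socket must match the one Nginx forwards to, and a mismatch commonly shows up as a "502 Bad Gateway" page. The sketch below compares the two values. The helper names are hypothetical, the file paths are the example paths used in this guide, and the parsing assumes the Nginx site uses a fastcgi_pass unix:...; directive.

```shell
#!/usr/bin/env bash
# Compare the PHP-FPM listen socket with the socket Nginx passes requests to.
# fpm_listen, nginx_fastcgi_socket and sockets_match are hypothetical helpers.

# Print the value of the first uncommented "listen = ..." directive in a pool file.
fpm_listen() {
    sed -n 's/^[[:space:]]*listen[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Print the unix socket path from the first "fastcgi_pass unix:...;" line.
nginx_fastcgi_socket() {
    sed -n 's/^[[:space:]]*fastcgi_pass[[:space:]]*unix:\([^;]*\);.*/\1/p' "$1" | head -n 1
}

# Succeed only if both files name the same, non-empty socket path.
sockets_match() {
    local fpm nginx
    fpm=$(fpm_listen "$1")
    nginx=$(nginx_fastcgi_socket "$2")
    [ -n "$fpm" ] && [ "$fpm" = "$nginx" ]
}

# Paths follow the examples in this guide; adjust them to your setup.
pool=/etc/php/8.1/fpm/pool.d/www.conf
site=/etc/nginx/sites-available/librenms
if [ -r "$pool" ] && [ -r "$site" ]; then
    if sockets_match "$pool" "$site"; then
        echo "OK: Nginx and PHP-FPM use the same socket"
    else
        echo "Mismatch: PHP-FPM listens on '$(fpm_listen "$pool")', Nginx uses '$(nginx_fastcgi_socket "$site")'"
    fi
fi
```

If you use a dedicated LibreNMS pool file instead of www.conf, point the pool variable at that file.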

This manual process is detailed but gives you full insight. Always follow the official LibreNMS installation guide for your specific OS, as commands and paths can vary.
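For the database step, the placeholder your_strong_password should be replaced with a real secret. One possible way to generate it and fill in the SQL from step 2 is sketched below. gen_password and librenms_sql are hypothetical helper names, and the approach (filtering bytes from /dev/urandom) is just one option, intended for lengths of a few dozen characters.

```shell
#!/usr/bin/env bash
# Generate a random alphanumeric password of the requested length (default 32).
# gen_password and librenms_sql are hypothetical helper names.
gen_password() {
    local length=${1:-32} pool
    # 1024 random bytes leave roughly 250 alphanumeric characters after filtering.
    pool=$(head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')
    printf '%s' "${pool:0:length}"
}

# Print the SQL from step 2 with the given password filled in.
librenms_sql() {
    cat <<SQL
CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'librenms'@'localhost' IDENTIFIED BY '$1';
GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
FLUSH PRIVILEGES;
SQL
}

db_pass=$(gen_password 32)
librenms_sql "$db_pass"
# To apply on the server (assumes root can log in to MariaDB via sudo):
#   librenms_sql "$db_pass" | sudo mysql
```

Note that the script prints the password as part of the SQL, so store it somewhere safe; you will need it again in the web installer.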

Docker Installation (Brief Overview and Pros/Cons)

LibreNMS also provides official Docker images, which can simplify deployment, especially if you are already using Docker.

How it works:
The Docker setup typically involves using docker-compose to orchestrate multiple containers:

  • A container for the LibreNMS application itself (with PHP and web server).
  • A container for the MariaDB/MySQL database.
  • Optionally, containers for memcached, RRDcached, or distributed pollers.

Persistent data (database files, RRD files, configuration) is usually stored in Docker volumes to survive container restarts.

Pros of Docker Installation:

  • Simplified Deployment: Fewer manual steps to install dependencies on the host system. Dependencies are managed within the containers.
  • Isolation: LibreNMS and its components run in isolated environments, reducing conflicts with other applications on the host.
  • Portability: Easier to move the LibreNMS setup between different Docker hosts.
  • Reproducibility: Dockerfiles and docker-compose.yml define the environment, making setups more consistent.
  • Easier Upgrades (Potentially): Upgrading can sometimes be as simple as pulling a new image version and restarting containers (though database migrations and data integrity still need care).

Cons of Docker Installation:

  • Learning Curve: Requires familiarity with Docker and docker-compose concepts.
  • Abstraction: Can make troubleshooting more complex if issues occur within a container, as you're dealing with an extra layer of abstraction.
  • Resource Overhead: Docker itself introduces some resource overhead, though usually minimal.
  • Networking Complexity: Docker networking can be tricky, especially when integrating with existing network infrastructure or setting up distributed pollers that need to communicate with devices outside the Docker network.
  • Customization: While possible, customizing the Dockerized setup (e.g., adding specific PHP extensions not in the official image) might require building your own custom images.

The official LibreNMS Docker repository is at https://github.com/librenms/docker. It provides docker-compose.yml examples and instructions.

Which method to choose?

  • Manual Installation: Recommended if you want maximum control, a deep understanding of the system, or if you are not yet comfortable with Docker. It's also the most thoroughly documented method in the main LibreNMS docs.
  • Docker Installation: A good choice if you are experienced with Docker, want rapid deployment, or prefer containerized applications.

For this guide, we will focus on the manual installation method in the workshop to provide a foundational understanding.

Workshop Installing LibreNMS (Manual Method on Ubuntu 22.04)

Objective: To perform a manual installation of LibreNMS on the Ubuntu Server 22.04 LTS prepared in the previous workshop. We will use Nginx, MariaDB, and PHP 8.1.

Prerequisites:

  • The Ubuntu Server 22.04 LTS system prepared in the "Workshop System Preparation."
  • Root or sudo access to this server.
  • Internet connectivity on the server.
  • You should have noted the IP address of your server.

Important Note: These instructions are based on common practices and LibreNMS documentation at the time of writing. Always cross-reference with the latest official LibreNMS installation guide for Ubuntu/Debian, as package names, commands, or recommended versions might change: https://docs.librenms.org/Installation/Install-LibreNMS/

Steps:

  1. Install Essential Packages (Web Server, Database, PHP, Utilities):

    • Log in to your Ubuntu server via SSH.
    • Optionally add the ondrej/php PPA for newer PHP builds. The official Ubuntu 22.04 repositories already provide PHP 8.1, so this can be skipped if you only need that version:
      sudo apt update
      sudo apt install -y software-properties-common
      sudo add-apt-repository ppa:ondrej/php -y
      sudo apt update
      
    • Install Nginx, MariaDB, PHP 8.1 and its extensions, and other required tools. This is a comprehensive list; some might already be installed from the previous workshop.
      sudo apt install -y acl curl composer fping git graphviz imagemagick mariadb-client mariadb-server mtr-tiny nginx-full nmap python3-pymysql python3-dotenv python3-redis python3-setuptools python3-pip python3-memcache rrdtool snmp snmpd whois unzip \
      php8.1-cli php8.1-fpm php8.1-curl php8.1-gd php8.1-gmp php8.1-intl php8.1-mbstring php8.1-mysql php8.1-snmp php8.1-xml php8.1-zip php8.1-memcached php8.1-bcmath
      
      • acl: For file access control lists.
      • composer: PHP dependency manager.
      • fping: For quick ICMP checks.
      • graphviz: For some graph rendering.
      • imagemagick: Image processing.
      • python3-pymysql, python3-dotenv, python3-redis, python3-setuptools, python3-pip, python3-memcache: Python components used by LibreNMS scripts or for integrations.
  2. Configure MariaDB:

    • Start and enable MariaDB:
      sudo systemctl enable --now mariadb
      
    • Run the secure installation script (set a root password, remove anonymous users, disallow remote root login, remove test database):
      sudo mysql_secure_installation
      
      (Answer 'Y' to most questions. Set a strong root password when prompted.)
    • Log in to MariaDB as root:
      sudo mysql -u root -p
      # Enter the root password you just set
      
    • Create the LibreNMS database and user. Replace StrongPasswordHere with a very strong, unique password.
      CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
      CREATE USER 'librenms'@'localhost' IDENTIFIED BY 'StrongPasswordHere';
      GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
      FLUSH PRIVILEGES;
      EXIT;
      
    • Modify MariaDB configuration for LibreNMS compatibility. Edit /etc/mysql/mariadb.conf.d/50-server.cnf (or a similar file like my.cnf):
      sudo vim /etc/mysql/mariadb.conf.d/50-server.cnf
      
      Under the [mysqld] section, add or modify these lines:
      innodb_file_per_table=1
      lower_case_table_names=0 # As per recent LibreNMS recommendations for consistency
      sql_mode="" # Clear sql_mode to avoid strict mode issues
      
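      • innodb_file_per_table=1: Stores each InnoDB table in its own .ibd file, which makes tables easier to manage and lets disk space be reclaimed when a table is dropped or rebuilt.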
      • lower_case_table_names=0: LibreNMS generally expects case-sensitive table names.
      • sql_mode="": Some older LibreNMS versions or specific OS defaults for SQL mode could cause issues; clearing it is often safer.
    • Restart MariaDB:
      sudo systemctl restart mariadb
      
  3. Create LibreNMS User and Download Code:

    • Create the librenms system user:
      sudo useradd librenms -d /opt/librenms -M -r -s /bin/bash
      # -M: do not create home directory, we will use /opt/librenms
      # -r: create a system user
      
    • Clone LibreNMS into /opt/librenms:
      sudo git clone https://github.com/librenms/librenms.git /opt/librenms
      
  4. Set Permissions:

    • This is a critical step. Follow the official documentation carefully.
      sudo chown -R librenms:librenms /opt/librenms
      sudo chmod 771 /opt/librenms # Owner and group get full access; others may only traverse the directory
      sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
      sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
      
      • These commands ensure the librenms user and its group can manage files, and also set default ACLs so new files/directories inherit correct permissions. The web server user will need to be added to the librenms group.
  5. Configure PHP and Install PHP Dependencies:

    • Configure PHP Timezone. Find your PHP-FPM config (e.g., /etc/php/8.1/fpm/php.ini) and PHP-CLI config (e.g., /etc/php/8.1/cli/php.ini):
      sudo vim /etc/php/8.1/fpm/php.ini
      sudo vim /etc/php/8.1/cli/php.ini
      
      Find the date.timezone setting, uncomment it, and set it to your timezone (e.g., date.timezone = Europe/Berlin).
    • Restart PHP-FPM:
      sudo systemctl restart php8.1-fpm
      
    • Install PHP dependencies using Composer (run as the librenms user or ensure correct permissions afterwards):
      cd /opt/librenms
      # Run composer_wrapper.php as the librenms user:
      sudo su - librenms -c './scripts/composer_wrapper.php install --no-dev'
      # If the above fails due to sudo/path issues, you might need to run it as root and then fix ownership:
      # sudo ./scripts/composer_wrapper.php install --no-dev
      # sudo chown -R librenms:librenms /opt/librenms/vendor /opt/librenms/composer.json /opt/librenms/composer.lock
      
      This step can take a few minutes as it downloads many libraries.
  6. Configure Web Server (Nginx):

    • Add the web server user (www-data for Nginx/Apache on Debian/Ubuntu) to the librenms group:
      sudo usermod -a -G librenms www-data
      
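    • Configure the Nginx virtual host for LibreNMS. If the sample file referenced below is not present in your checkout, create the server block by hand from the one in the official installation guide: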
      sudo cp /opt/librenms/misc/librenms.nonroot.nginx /etc/nginx/sites-available/librenms
      # Edit the file to set your server_name and check PHP socket:
      sudo vim /etc/nginx/sites-available/librenms
      
      • Change server_name librenms.example.com; to server_name your_server_ip_or_hostname; (e.g., server_name 192.168.1.100;).
      • Ensure the fastcgi_pass directive points to the correct PHP-FPM socket, usually unix:/run/php/php8.1-fpm.sock.
    • Enable the new site and remove the default:
      sudo ln -s /etc/nginx/sites-available/librenms /etc/nginx/sites-enabled/librenms
      sudo rm /etc/nginx/sites-enabled/default # If it exists and conflicts
      
    • Test Nginx configuration and restart Nginx:
      sudo nginx -t
      sudo systemctl restart nginx
      
  7. Configure snmpd (for monitoring the LibreNMS server itself):

    • Copy the example snmpd.conf:
      sudo cp /opt/librenms/snmpd.conf.example /etc/snmp/snmpd.conf
      sudo vim /etc/snmp/snmpd.conf
      
    • Edit the file and change RANDOMSTRINGGOESHERE to a secure, unique community string (e.g., librenmscommunity).
    • Download the distro script which helps LibreNMS identify the OS:
      sudo curl -o /usr/bin/distro https://raw.githubusercontent.com/librenms/librenms-agent/master/snmp/distro
      sudo chmod +x /usr/bin/distro
      
    • Enable snmpd at boot and restart it so the new configuration is loaded:
      sudo systemctl enable snmpd
      sudo systemctl restart snmpd
      
  8. Set up Cron Job:

    • Copy the LibreNMS cron job file (in current releases it lives under dist/):
      sudo cp /opt/librenms/dist/librenms.cron /etc/cron.d/librenms
      
    • Ensure the cron daemon is running:
      sudo systemctl status cron # Should be active (running)
      
  9. Perform Web-Based Installation:

    • Open your web browser and navigate to http://your_server_ip_or_hostname. You should be redirected to install.php.
    • Follow the on-screen instructions:
      • Stage 0 (Checks): All checks should be green (or yellow for optional items). If there are red items, you must fix them before proceeding.
      • Stage 1 (DB Connection):
        • DB User: librenms
        • DB Password: StrongPasswordHere (the one you set)
        • DB Name: librenms
      • Click "Next Stage". If successful, it will say "DB Schema: OK".
      • Stage 2 (Create Admin User):
        • Create your admin username, password, and email.
      • Click "Finish install".
    • You may see a message "The poller has not run recently." This is normal at first. The cron job will pick it up.
    • You might be prompted to run ./scripts/database-schema.sh if the schema isn't fully up-to-date. If so, run as librenms user:
      sudo su - librenms -c '/opt/librenms/scripts/database-schema.sh'
      
    • The final step might involve running daily.sh once manually to ensure everything is set up:
      sudo su - librenms -c '/opt/librenms/daily.sh'
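
The timezone edit from step 5 can also be scripted instead of being made by hand in an editor. The following is a minimal sketch, assuming a sed that accepts -i.bak and -E (GNU and BSD sed both do); the function name set_php_timezone is hypothetical, and the demonstration runs against a scratch file rather than the real php.ini files.

```shell
# Sketch: set date.timezone in a php.ini-style file.
# set_php_timezone is an illustrative name, not a LibreNMS or PHP tool.
set_php_timezone() {
    # $1 = path to php.ini, $2 = timezone such as Europe/Berlin
    # Matches the commented default (";date.timezone =") as well as an
    # existing value, and keeps a backup copy with a .bak suffix.
    sed -i.bak -E "s|^;?[[:space:]]*date\.timezone[[:space:]]*=.*|date.timezone = $2|" "$1"
}

# Demonstration on a scratch file instead of /etc/php/8.1/fpm/php.ini
demo_ini=$(mktemp)
printf '[Date]\n;date.timezone =\n' > "$demo_ini"
set_php_timezone "$demo_ini" "Europe/Berlin"
grep '^date.timezone' "$demo_ini"
```

On the workshop server, the same sed expression would be run with sudo once against /etc/php/8.1/fpm/php.ini and once against /etc/php/8.1/cli/php.ini, followed by the PHP-FPM restart from step 5. If a file contains no date.timezone line at all, the command changes nothing, so check the result with grep afterwards.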
      

Verification:

  • You should be able to log in to the LibreNMS web interface with the admin credentials you created.
  • Navigate to Gear Icon > Validate Config. It should show that your config is OK. If not, address the reported issues.
  • Wait 5-10 minutes for the cron jobs to run. The message about pollers not running should disappear.
  • The LibreNMS server itself (localhost) might be automatically added as a device if snmpd is correctly configured and discovery runs.
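
Alongside the Validate Config page, the MariaDB settings from step 2 can be spot-checked from the command line. This is a rough sketch only: has_setting is a hypothetical helper, and the demonstration uses a scratch file that mirrors the lines added to 50-server.cnf.

```shell
# Sketch: confirm that a MariaDB option file contains an expected key=value pair.
# has_setting is an illustrative helper, not part of LibreNMS or MariaDB.
has_setting() {
    # $1 = option file, $2 = key, $3 = expected value
    # Allows optional spaces around "=" and a trailing "# comment".
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*$3([[:space:]]*(#.*)?)?\$" "$1"
}

# Demonstration against a scratch file mirroring the lines from step 2
demo_cnf=$(mktemp)
cat > "$demo_cnf" <<'EOF'
[mysqld]
innodb_file_per_table=1
lower_case_table_names=0 # As per recent LibreNMS recommendations
EOF

for pair in innodb_file_per_table=1 lower_case_table_names=0; do
    key=${pair%%=*}
    value=${pair#*=}
    if has_setting "$demo_cnf" "$key" "$value"; then
        echo "OK      $key=$value"
    else
        echo "MISSING $key=$value"
    fi
done
```

On the server, pass /etc/mysql/mariadb.conf.d/50-server.cnf as the first argument. The check is deliberately simple: it does not verify that the line sits under [mysqld], and it does not account for MariaDB accepting dashes in place of underscores in option names.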

Congratulations! You have manually installed LibreNMS. This is a significant achievement and provides a solid foundation for network monitoring. The process is involved, but it ensures you understand each component's role.

3. Initial Configuration

Once LibreNMS is installed and you can access the web interface, there are several initial configuration steps to take to make it fully operational and tailored to your environment. This usually involves going through settings in the web UI.

Web-based Setup Wizard

As seen in the installation workshop, the very first interaction after the base files are in place is often the web-based setup wizard (install.php). This wizard guides you through:

  1. Pre-flight Checks: Verifies that all required PHP extensions are present, file permissions are correct, and other environmental prerequisites are met. It's crucial to resolve any errors reported at this stage.
  2. Database Configuration: Prompts for the database connection details (hostname, username, password, database name) that you created during the manual setup. It then attempts to connect to the database and set up the necessary schema.
  3. Admin User Creation: Allows you to create the first administrative user account for LibreNMS, which you'll use to log in and manage the system.

If you completed the manual installation workshop, you've already gone through this. If for some reason the web installer didn't run or needs to be re-run (e.g., after a fresh git clone before database setup), accessing http://your-librenms-host/install.php would typically trigger it.

Database Configuration

While the web wizard handles the initial connection, sometimes you need to review or modify the database configuration. The primary database settings are stored in the .env file in the root of your LibreNMS installation (e.g., /opt/librenms/.env).

This file is critical and contains sensitive information like your database password. A typical .env file would have entries like:

APP_KEY=base64:someRandomString...
DB_HOST=localhost
DB_DATABASE=librenms
DB_USERNAME=librenms
DB_PASSWORD=your_strong_password
# ... other settings

  • DB_HOST: Usually localhost if the database is on the same server.
  • DB_DATABASE: The name of the LibreNMS database (e.g., librenms).
  • DB_USERNAME: The database user for LibreNMS.
  • DB_PASSWORD: The password for the database user.

If you ever need to change your database password or migrate the database to a different host, you would update this .env file and then potentially clear cached configurations:

# As librenms user or with appropriate permissions in /opt/librenms
php artisan config:clear
php artisan cache:clear
Important: Always back up your .env file.
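
The backup-then-edit routine can be wrapped in a small helper. The sketch below is one possible approach, not a LibreNMS tool: set_env_value is a hypothetical function name, and the demonstration works on a scratch file with made-up values rather than on /opt/librenms/.env.

```shell
# Sketch: back up a .env file, then replace the value of one key.
# set_env_value is an illustrative helper, not a LibreNMS command.
set_env_value() {
    # $1 = path to .env, $2 = key, $3 = new value
    cp -p "$1" "$1.bak" || return 1
    # The value is passed through the environment so that awk does not
    # interpret backslashes or other special characters in passwords.
    NEW_VALUE="$3" awk -F= -v key="$2" '
        $1 == key { print key "=" ENVIRON["NEW_VALUE"]; next }
        { print }' "$1.bak" > "$1"
}

# Demonstration on a scratch file with placeholder values
demo_env=$(mktemp)
printf 'DB_HOST=localhost\nDB_DATABASE=librenms\nDB_USERNAME=librenms\nDB_PASSWORD=old_password\n' > "$demo_env"
set_env_value "$demo_env" DB_PASSWORD 'N3w|Pass/word'
cat "$demo_env"
```

After changing a real .env file this way, run the two cache-clearing commands shown above. Values that contain spaces or a # character generally need to be quoted inside .env, which this sketch does not do for you.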

Creating the First Admin User

This is typically done during the web-based setup wizard. If you need to create additional admin users or manage users, you can do so from within the LibreNMS web interface after logging in:

  • Navigate to Gear Icon (Top Right) > Users.
  • Here you can:
    • Add User:
      Create new user accounts. You can assign them different access levels (e.g., Normal User, Global Read, Administrator).
    • Manage Existing Users:
      Edit user details, change passwords, or delete users.

User levels define what a user can see and do within LibreNMS:

  • Normal User:
    Can only see devices they are permitted to see (via device group permissions or if the device is marked as public). Can acknowledge alerts for permitted devices.
  • Global Read:
    Can see all devices and most information but cannot make changes.
  • Administrator:
    Full access to all settings and devices.

For initial setup, the admin user created during the wizard is sufficient.

Basic System Settings

After the initial installation, you should review and configure several global settings. Access these via Gear Icon (Top Right) > Global Settings.

Key areas to review:

  1. System Configuration (System > General):

    • Base URL: Ensure this is correctly set to the URL you use to access LibreNMS (e.g., http://librenms.example.com). This is important for links in emails and integrations.
    • fping / fping6 Location: Verify paths to fping and fping6 executables. These are usually auto-detected.
    • RRDtool Version: Select the RRDtool version installed on your system.
    • Enable Syslog: If you plan to send syslog messages from your devices to LibreNMS for centralized logging. This requires additional configuration on both LibreNMS and your devices.
  2. Polling Settings (System > Poller):

    • Poller Modules: Enable or disable specific poller modules (e.g., OS updates, BGP, OSPF). Only enable modules relevant to your devices to save polling time and resources.
    • Discovery Modules: Similar to poller modules, enable or disable discovery modules.
    • SNMP Settings: Configure default SNMP version, port, and timeout. You can override these per device.
  3. Alerting Settings (Alerting > General):

    • Default "from" email address: Set the email address from which alert notifications will be sent.
    • Default contact email address: An email address to receive test alerts or general admin notifications.
  4. Authentication (System > Authentication):

    • Configure authentication methods. By default, it uses local database users. You can integrate with LDAP, RADIUS, or Active Directory for centralized user management.
  5. Interface Settings (System > Web UI):

    • Customize aspects of the web interface, such as default dashboard pages, themes, and visual elements.
  6. Distributed Poller Settings (if applicable):

    • If you plan to use distributed pollers, this is where you configure poller groups and related settings. We will cover this in more detail in the advanced section.

Take your time to go through these settings. Many defaults are sensible, but customizing them to your environment from the start is good practice.

Workshop First Steps with LibreNMS

Objective: To log in to your newly installed LibreNMS, explore basic global settings, and validate the installation.

Prerequisites:

  • A successfully installed LibreNMS instance from the previous workshop.
  • Web browser access to the LibreNMS UI.
  • Admin credentials created during the web installation.

Tasks:

  1. Log In and Initial Exploration:

    • Open your web browser and navigate to your LibreNMS URL (e.g., http://your_server_ip_or_hostname).
    • Log in with the admin username and password you created.
    • You will land on the main dashboard. It will likely be empty or show only "localhost" if snmpd was configured on the LibreNMS server and discovery has run.
  2. Validate Configuration:

    • In the top right corner, click the Gear Icon.
    • From the dropdown menu, select Validate Config.
    • Review the output. Ideally, it should resemble the following (your version numbers will differ):
      ====================================
      Component | Version
      --------- | -------
      LibreNMS  | x.x.x (e.g., 23.11.0)
      DB Schema | xxx
      PHP       | 8.1.x
      Python    | 3.x.x
      MySQL     | 10.6.x-MariaDB
      RRDTool   | 1.7.x
      SNMP      | NET-SNMP 5.9.x
      ====================================
      
      [OK]    Composer Version: 2.x.x
      [OK]    Dependencies up-to-date.
      [OK]    Database connection successful
      [OK]    Database schema correct
      [INFO]  Detected Python Wrapper Version 31
      
    • If there are any [FAIL] or [WARN] messages, they will usually provide information or links to documentation on how to resolve them. Common issues might relate to file permissions, PHP settings, or cron jobs not running correctly. Address these before proceeding further. For instance, if it complains about daily.sh not running, you might need to run it manually once:
      sudo su - librenms -c "/opt/librenms/daily.sh"
      
  3. Review Global Settings:

    • Click the Gear Icon > Global Settings.
    • System > General:
      • Verify Base URL for Web UI. If it's http://localhost and you access it via IP or hostname, change it to the correct URL (e.g., http://192.168.1.100). This is important for links in alerts.
      • Note the Installation ID. This is unique to your install.
    • System > Poller > SNMP Settings:
      • Observe the default SNMP settings: version (e.g., v2c), port (161), default community strings (often public). We will use these when adding devices.
    • System > Web UI > General:
      • Explore options like Default front page. You might change this later once you have favorite dashboards.
    • Alerting > General:
      • Set Default "from" email address to something like librenms@yourdomain.com (even if you haven't configured mail sending yet, it's good practice).
      • Set Default contact email address to your own email address.
    • Click Save Settings at the bottom of any page where you make changes.
  4. Check System Health and Logs:

    • Navigate to Health (Heart Icon in top menu) > Poller Performance. This shows statistics about your poller runs. Initially, it might not have much data.
    • Navigate to Logs > Event Log. This log shows significant events in LibreNMS, such as devices being added, going down, or alerts being triggered.
    • Navigate to Logs > LibreNMS Log. This provides more detailed logging from the LibreNMS application itself, useful for troubleshooting.
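
The same validation can be run from the shell with the validate.php script in /opt/librenms, which is handy when the web UI itself is unreachable. The helper below is a sketch under assumptions: summarise_validate is a hypothetical name, it assumes the command-line output uses the same [OK], [WARN], and [FAIL] tags shown in task 2, and the sample text contains placeholder messages rather than real LibreNMS output.

```shell
# Sketch: summarise validation output by status tag.
# summarise_validate is an illustrative helper; it only counts the
# [OK]/[WARN]/[FAIL] tags at the start of each line.
summarise_validate() {
    # Reads validation text on stdin and prints one count per tag.
    awk '/^[[:space:]]*\[OK\]/   { ok++ }
         /^[[:space:]]*\[WARN\]/ { warn++ }
         /^[[:space:]]*\[FAIL\]/ { fail++ }
         END { printf "ok=%d warn=%d fail=%d\n", ok, warn, fail }'
}

# Placeholder lines for demonstration only (not real LibreNMS messages)
sample='[OK]    first example check
[OK]    second example check
[WARN]  example warning text
[FAIL]  example failure text
[INFO]  example informational text'

printf '%s\n' "$sample" | summarise_validate
```

On the server this could be fed from the real script, for example: sudo su - librenms -c './validate.php' | summarise_validate. A non-zero fail count means there is something to fix before adding devices.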

Deliverables/Reflection:

  • Confirmation that your LibreNMS configuration is valid.
  • Familiarity with the Global Settings menu and key options.
  • Understanding where to find basic health information and logs.

This workshop ensures your LibreNMS installation is sound and configured with some basic, essential parameters. You are now ready to start adding devices and seeing LibreNMS in action.

4. Adding Your First Device

The core function of LibreNMS is to monitor devices. This subchapter will cover the fundamentals of SNMP, how to configure it on common operating systems, and then how to add a device to LibreNMS for monitoring.

Understanding SNMP (v1, v2c, v3)

Simple Network Management Protocol (SNMP) is the primary protocol LibreNMS uses to communicate with and gather data from network devices and servers. Understanding its basics is crucial.

  • What is SNMP? SNMP is an application-layer protocol, standardized by the IETF (originally in RFC 1157), for exchanging management information between network devices. It is part of the TCP/IP protocol suite. SNMP enables network administrators to manage network performance, find and solve network problems, and plan for network growth.

  • Key Components of SNMP:

    • Managed Devices: These are network elements like routers, switches, servers, printers, or any device that runs SNMP agent software.
    • SNMP Agent: Software that runs on managed devices. It collects and stores management information locally and responds to SNMP queries from an NMS.
    • Network Management System (NMS): Software that runs on a management station (like your LibreNMS server). The NMS queries agents, receives responses, and presents data to users.
    • Management Information Base (MIB): A hierarchical database of information that describes the manageable aspects of a device. Each piece of information (e.g., CPU load, interface traffic counter) is represented by an Object Identifier (OID). MIBs define these OIDs and their meaning, data type, and access permissions (read-only or read-write).
      • Standard MIBs (e.g., MIB-II) are common across most devices.
      • Vendor-specific (enterprise) MIBs provide information unique to a particular vendor's hardware or software. LibreNMS includes support for many common MIBs.
    • Object Identifier (OID): A unique, numeric identifier used to specify a managed object in the MIB tree. For example, .1.3.6.1.2.1.1.1.0 is the OID for sysDescr.0 (system description).
  • SNMP Operations:

    • GET: The NMS retrieves the value of one or more OIDs from an agent.
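    • GETNEXT: The NMS retrieves the value of the OID following the one specified. This is used to "walk" a MIB tree. SNMPv2c and later add GETBULK, which returns many successive OIDs in a single request and makes walks more efficient.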
    • SET: The NMS changes the value of an OID on an agent (if it's read-write and permitted). LibreNMS primarily uses read operations.
    • TRAP: Asynchronous notifications sent from an agent to the NMS to indicate a significant event (e.g., an interface going down, a device rebooting). LibreNMS can receive and process traps.
  • SNMP Versions:

    • SNMPv1: The original version. Simple but lacks strong security. Uses "community strings" (plain text passwords) for authentication. Prone to security risks.
      • Community String: A shared password between the NMS and the agent.
        • Read-Only (RO) Community: Allows the NMS to read data.
        • Read-Write (RW) Community: Allows the NMS to read and modify data (use with extreme caution).
    • SNMPv2c (Community-based SNMPv2): The most widely used version. It offers improvements over v1, such as enhanced error handling and new data types (e.g., 64-bit counters, important for high-speed interfaces). However, it still uses community strings for security, making it vulnerable if not properly managed (e.g., using weak or default community strings like "public" or "private").
    • SNMPv3: The most secure version. It provides:
      • Authentication: Verifies the identity of the sender (NMS or agent) using usernames and authentication protocols (MD5 or SHA).
      • Encryption (Privacy): Encrypts SNMP messages to prevent eavesdropping, using protocols like DES, 3DES, or AES.
      • Message Integrity: Ensures messages haven't been tampered with during transit.
      SNMPv3 uses a User-based Security Model (USM) and a View-based Access Control Model (VACM). Configuration is more complex than v1/v2c but offers significantly better security, so it is highly recommended for production environments.

LibreNMS supports all three versions. For initial learning and internal trusted networks, SNMPv2c is often used due to its simplicity. However, moving to SNMPv3 is a best practice for security.
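
To see why the 64-bit counters mentioned above matter, consider how quickly a 32-bit octet counter (such as ifInOctets) wraps back to zero: it holds at most 2^32 bytes. The arithmetic below is a worked sketch; the link speeds are illustrative, sustained full line rate is assumed as the worst case, and wrap_seconds is a hypothetical helper name.

```shell
# Seconds until a 32-bit octet counter wraps at full line rate:
# 2^32 bytes * 8 bits per byte / link speed in bits per second.
wrap_seconds() {
    # $1 = link speed in bits per second (integer division, rounds down)
    echo $(( 4294967296 * 8 / $1 ))
}

echo "100 Mbit/s: $(wrap_seconds 100000000) s"
echo "1 Gbit/s:   $(wrap_seconds 1000000000) s"
echo "10 Gbit/s:  $(wrap_seconds 10000000000) s"
```

At 1 Gbit/s the counter can wrap in roughly 34 seconds, which is far shorter than a typical five-minute polling interval, so the NMS cannot tell how many times it wrapped between polls. The 64-bit counters available from SNMPv2c onward (for example ifHCInOctets) avoid this ambiguity.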

Configuring SNMP on a Linux Host

To monitor a Linux server with LibreNMS, you need to install and configure an SNMP agent on it. The most common SNMP agent for Linux is net-snmp (package name might be snmpd or net-snmp).

Steps for Ubuntu/Debian:

  1. Install SNMP Agent:

    sudo apt update
    sudo apt install snmpd snmp # snmp package provides snmpwalk utility for testing
    

  2. Configure snmpd:

    • The main configuration file is typically /etc/snmp/snmpd.conf. It's recommended to back up the original before editing.
      sudo cp /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.original
      sudo vim /etc/snmp/snmpd.conf
      
    • Agent Address: By default, snmpd might only listen on 127.0.0.1 (localhost). You need it to listen on an IP address reachable by your LibreNMS server (e.g., all interfaces or a specific interface IP).
      • Find the line starting with agentaddress or similar. If it's agentaddress 127.0.0.1,[::1], change it to listen on all IPv4 and IPv6 interfaces, or a specific IP:
        # Listen on all IPv4 and IPv6 interfaces, UDP port 161
        agentaddress udp:161,udp6:161
        
        # Or, listen on a specific IP address (e.g., 192.168.1.50)
        # agentaddress udp:192.168.1.50:161
        
    • Configure Community String (for SNMPv2c):
      • The rocommunity directive defines a read-only community string.
      • Syntax: rocommunity <community_string> [source] [OID]
      • For basic setup, you can define a community string and restrict it to your LibreNMS server's IP address for better security.
        # Replace 'YourCommunityString' with a strong, unique string
        # Replace 'librenms_server_ip' with the actual IP of your LibreNMS server
        rocommunity YourCommunityString librenms_server_ip
        
        # Example:
        # rocommunity MySecretSNMPv2c 192.168.1.100
        
        # If you want to allow any source (less secure, for testing only),
        # limit the community to the system MIB group via the systemonly view:
        # rocommunity YourCommunityString default -V systemonly
        # Without a source or view, the full MIB tree is readable from anywhere
        # (NOT RECOMMENDED for production):
        # rocommunity YourCommunityString
        
        It's good practice to use a non-default community string (i.e., not "public").
    • System Information: You can set system location and contact information, which LibreNMS can pick up.
      syslocation Server Room A, Rack 2, Unit 5
      syscontact admin@example.com
      
    • Extend snmpd (Optional, for more data): LibreNMS provides an snmpd.conf example that includes extend scripts for additional monitoring capabilities (like OS updates, disk I/O, etc.). You can download the distro script for OS detection:
      sudo curl -o /usr/bin/distro https://raw.githubusercontent.com/librenms/librenms-agent/master/snmp/distro
      sudo chmod +x /usr/bin/distro
      
      And then add lines like this to your snmpd.conf:
      # This allows LibreNMS to detect the OS distribution
      extend .1.3.6.1.4.1.2021.7890.1 distro /usr/bin/distro
      
      LibreNMS documentation often has more extend examples or specific agent setup scripts.
  3. Restart snmpd Service:

    sudo systemctl restart snmpd
    sudo systemctl enable snmpd # To start on boot
    

  4. Firewall Configuration:

    • If you are using a firewall (like ufw on Ubuntu), you need to allow incoming SNMP traffic (UDP port 161) from your LibreNMS server.
      # Example for ufw, allowing from a specific IP
      sudo ufw allow from librenms_server_ip to any port 161 proto udp
      # Example: sudo ufw allow from 192.168.1.100 to any port 161 proto udp
      sudo ufw reload
      
  5. Test SNMP Locally and Remotely:

    • Locally (on the Linux host being configured):
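      # Replace YourCommunityString with what you configured
      snmpwalk -v2c -c YourCommunityString localhost system
      # You should see output related to the system MIB group (description, uptime, etc.)
      # Note: if rocommunity is restricted to the LibreNMS server's IP, this local query
      # is refused and times out; in that case rely on the remote test below.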
      
    • Remotely (from the LibreNMS server):
      # Replace YourCommunityString and linux_host_ip
      snmpwalk -v2c -c YourCommunityString linux_host_ip system
      # If this works, LibreNMS should be able to poll the device.
      # If it times out, check firewall rules, snmpd listening address, and community string.
      

Configuring SNMPv3 is more involved, requiring user creation, authentication, and privacy protocols. We will touch upon this in the advanced security section. For now, SNMPv2c is sufficient for learning.
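
The SNMPv2c directives from the steps above can be collected into one minimal configuration file. The sketch below prints such a file from three values; render_snmpd_conf is a hypothetical helper, the contact address is the placeholder used earlier, and the output is meant to be reviewed before it replaces /etc/snmp/snmpd.conf.

```shell
# Sketch: print a minimal SNMPv2c snmpd.conf from three values.
# render_snmpd_conf is an illustrative helper; the directives are the
# ones discussed in the steps above.
render_snmpd_conf() {
    # $1 = community string, $2 = LibreNMS server IP, $3 = location text
    cat <<EOF
agentaddress udp:161,udp6:161
rocommunity $1 $2
syslocation $3
syscontact admin@example.com
extend .1.3.6.1.4.1.2021.7890.1 distro /usr/bin/distro
EOF
}

render_snmpd_conf MySecretSNMPv2c 192.168.1.100 "Server Room A, Rack 2, Unit 5"
```

Redirect the output to a scratch file, compare it with your backed-up original, and only then copy it into place and restart snmpd. The extend line assumes the distro script has been installed at /usr/bin/distro as described in step 2.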

Configuring SNMP on a Windows Host

Windows Servers also have an SNMP service that can be enabled.

Steps for Windows Server (e.g., 2016, 2019, 2022):

  1. Install SNMP Service Feature:

    • Open Server Manager.
    • Click Manage > Add Roles and Features.
    • Proceed to the Features section.
    • Select SNMP Service. You might also want SNMP WMI Provider if it's listed as an option.
    • Click Add Features if prompted for management tools.
    • Complete the installation.
  2. Configure SNMP Service:

    • Open Services ( services.msc ).
    • Find the SNMP Service, right-click, and select Properties.
    • Agent Tab:
      • Fill in Contact and Location if desired.
      • Under Service, check all options (Physical, Applications, Datalink and subnetwork, Internet, End-to-end). This determines what MIB data is exposed.
    • Traps Tab (Optional):
      • If you want the Windows server to send SNMP traps to LibreNMS, you can configure trap destinations here. Enter a Community name and add your LibreNMS server's IP address to Trap destinations.
    • Security Tab: This is the most important tab for LibreNMS polling.
      • Accepted community names: Click Add...
        • Choose the Community rights (e.g., READ ONLY).
        • Enter your desired Community Name (e.g., YourWindowsCommunityString). This must match what you configure in LibreNMS.
        • Click Add.
      • Accept SNMP packets from these hosts:
        • Select this option for better security.
        • Click Add... and enter the IP address of your LibreNMS server.
        • Click Add.
        • (Alternatively, "Accept SNMP packets from any host" is less secure but can be used for initial testing).
    • Click Apply and OK.
  3. Restart SNMP Service:

    • In the Services console, right-click SNMP Service and select Restart.
    • Ensure the service Startup Type is set to Automatic.
  4. Windows Firewall:

    • Open Windows Defender Firewall with Advanced Security.
    • Go to Inbound Rules.
    • Find the rules for SNMP Service (UDP-In). There might be predefined rules.
    • Ensure the rule is Enabled.
    • Double-click the rule and go to the Scope tab.
    • Under Remote IP address, you can restrict access to the IP address of your LibreNMS server for better security. Select "These IP addresses", click "Add...", and enter the LibreNMS server's IP.
    • If no predefined rule exists, create a new Inbound Rule:
      • Rule Type: Port
      • Protocol and Ports: UDP, Specific local ports: 161
      • Action: Allow the connection
      • Profile: Select appropriate profiles (Domain, Private, Public - usually Domain and Private).
      • Name: SNMP Allow (UDP 161 for LibreNMS)
      • Optionally, scope it to the LibreNMS server IP.
  5. Test SNMP Remotely:

    • From your LibreNMS server's command line:
      # Replace YourWindowsCommunityString and windows_host_ip
      snmpwalk -v2c -c YourWindowsCommunityString windows_host_ip system
      # If successful, you'll get SNMP output. If not, re-check Windows Firewall, SNMP service configuration (community string, accepted hosts), and network connectivity.
      

Adding a Device via LibreNMS Web UI

Once SNMP is configured on the target device and reachable from the LibreNMS server, you can add it in the LibreNMS web interface.

  1. Log in to LibreNMS.
  2. Navigate to Devices > Add Device.
  3. Fill in the device details:

    • Hostname or IP Address: Enter the IP address or resolvable hostname of the device you want to add (e.g., 192.168.1.50 or linux-server1.yourdomain.local).
    • SNMP Version: Select the SNMP version configured on the device (e.g., v2c or v3).
    • Port: Default is 161. Change if your device uses a non-standard SNMP port.
    • Transport: Default is UDP.
    • SNMP Community / Auth Details:
      • If using SNMPv1 or v2c: Enter the SNMP Community string you configured on the device.
      • If using SNMPv3: You'll need to provide:
        • Auth Level (authPriv for authentication and encryption, authNoPriv for authentication only, or noAuthNoPriv for neither)
        • Auth Username
        • Auth Protocol (MD5 or SHA)
        • Auth Password
        • Privacy Protocol (DES, AES) - if using authPriv
        • Privacy Password - if using authPriv
    • Poller Group: (Advanced) For now, leave as default (group 0).
    • Force add even if ICMP / SNMP check fails: Usually leave unchecked. If checked, LibreNMS will add the device even if it can't immediately ping it or get SNMP data, which can be useful if you know the device will come online later or has ICMP blocked.
    • Attempt to use a pre-set SysDescr / OS: (Advanced) Usually leave as default.
  4. Click Add Device.
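
Before submitting the form for an SNMPv3 device, it can help to confirm that the same parameters work from the LibreNMS server's shell. The sketch below maps the form fields to the corresponding net-snmp options (-l, -u, -a, -A, -x, -X). The function name snmpv3_args, the user "monitor", and the passphrases are made up for illustration, and the function assumes none of the values contain spaces.

```bash
# Print net-snmp command-line options for a given SNMPv3 security level.
# Usage: snmpv3_args LEVEL USER [AUTH_PROTO AUTH_PASS [PRIV_PROTO PRIV_PASS]]
# Assumption: values contain no spaces (quote them yourself otherwise).
snmpv3_args() {
    level=$1
    user=$2
    case "$level" in
        noAuthNoPriv)
            printf '%s\n' "-v3 -l noAuthNoPriv -u $user" ;;
        authNoPriv)
            printf '%s\n' "-v3 -l authNoPriv -u $user -a $3 -A $4" ;;
        authPriv)
            printf '%s\n' "-v3 -l authPriv -u $user -a $3 -A $4 -x $5 -X $6" ;;
        *)
            echo "unknown auth level: $level" >&2
            return 1 ;;
    esac
}

# Example with hypothetical credentials: print the command you would run.
echo "snmpwalk $(snmpv3_args authPriv monitor SHA AuthPass123 AES PrivPass456) target_device_ip system"
```

If the printed snmpwalk command returns data when you run it, the same values should work in the Add Device form.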

Verifying Device Addition and Data Polling

After clicking "Add Device":

  • LibreNMS will attempt to contact the device using ICMP (ping) and then SNMP.
  • If successful, the device will be added, and you'll be taken to its overview page. You should see a message like "Device added successfully."
  • Initially, many graphs will be empty or show "No data." LibreNMS polls devices at regular intervals (typically every 5 minutes via the cron job). You need to wait for a few polling cycles (5-15 minutes) for data to start appearing.
  • Check Device Overview: The device's overview page will start populating with information like System Uptime, OS, Hardware, CPU, Memory, and Storage details as data is collected.
  • Check Graphs: Click on the "Graphs" tab for the device. You should see graphs for CPU, memory, network interfaces, etc., begin to show data points after a few polling cycles.
  • Check Event Log: Go to Logs > Event Log. You should see entries related to the new device being discovered, and pollers collecting data. Search for the hostname or IP of the device you added.
    • Look for messages like "Device snmp reachable" or "Polled in X.XX seconds".
    • If you see errors like "SNMP error" or "Timeout: No Response from..." then there's a problem with SNMP communication (wrong community string, firewall blocking, SNMP agent not running or misconfigured on the device).
  • Device Status: On the device overview page, the status should eventually show as "Up."

If the device doesn't appear or shows errors:

  • Double-check the SNMP configuration on the target device (community string, listening address, allowed hosts/IPs).
  • Verify firewall rules on the target device and any network firewalls between LibreNMS and the device.
  • Use snmpwalk from the LibreNMS server's command line to test SNMP connectivity to the target device, using the same IP, version, and community string/credentials you entered in LibreNMS. This is the most direct way to troubleshoot SNMP issues.
    # From LibreNMS server
    snmpwalk -v2c -c YourCommunityString target_device_ip .
    # The trailing dot (.) tells snmpwalk to try and walk the entire MIB tree.
    # Or, for a specific MIB like 'system':
    # snmpwalk -v2c -c YourCommunityString target_device_ip system
    
  • Check the LibreNMS logs (Logs > Event Log and Logs > LibreNMS Log) for more detailed error messages.
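
The troubleshooting hints above can be turned into a small helper that looks at what snmpwalk printed and suggests where to look first. This is only a sketch: diagnose_snmp is a hypothetical helper name, and it recognises just a few common net-snmp messages.

```bash
# Map common net-snmp failure output to the most likely cause.
# Takes the captured output of snmpwalk as its first argument.
diagnose_snmp() {
    case "$1" in
        *"Timeout: No Response"*)
            echo "no response: check community string, firewall (UDP 161), and the agent's allowed hosts" ;;
        *"Unknown host"*)
            echo "name resolution failed: check the hostname or use the IP address" ;;
        *"No Such Object"*|*"No Such Instance"*)
            echo "agent answered, but the requested OID is not available" ;;
        "")
            echo "no output captured" ;;
        *)
            echo "no known error pattern found" ;;
    esac
}

# Example with a canned message. In practice, capture the real output first:
#   out=$(snmpwalk -v2c -c YourCommunityString target_device_ip system 2>&1)
#   diagnose_snmp "$out"
diagnose_snmp "Timeout: No Response from 192.168.1.50"
```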

Workshop Monitoring Your First Linux Server

Objective: To configure SNMP on a Linux server (this could be the LibreNMS server itself or another Linux VM/host) and add it to LibreNMS for monitoring.

Prerequisites:

  • Your LibreNMS installation is up and running.
  • Access to a Linux server to be monitored. For simplicity, we can use the LibreNMS server itself, as we configured snmpd on it during the installation workshop. Make a note of the community string you set for the LibreNMS server's own snmpd. If you haven't configured snmpd on the LibreNMS server yet, follow the steps in "Configuring SNMP on a Linux Host" above for localhost.
  • Alternatively, set up another Linux VM (e.g., Ubuntu Server) and configure snmpd on it as described above, ensuring it's reachable from your LibreNMS server and firewalls are adjusted.

Let's assume you will monitor the LibreNMS server itself (localhost).

Tasks:

  1. Verify snmpd on the LibreNMS Server (localhost):

    • During the installation workshop, we configured snmpd on the LibreNMS server. The example file /opt/librenms/snmpd.conf.example ships with the placeholder community string RANDOMSTRINGGOESHERE, which must be replaced. Let's assume you changed this to librenmscommunity (or use whatever you set).
    • Log in to your LibreNMS server via SSH.
    • Test snmpd locally:
      snmpwalk -v2c -c librenmscommunity localhost system
      
      You should see output. If not, review /etc/snmp/snmpd.conf on the LibreNMS server, ensure the community string is correct, agentaddress is listening (e.g., udp:161), and restart snmpd (sudo systemctl restart snmpd).
  2. Add "localhost" (the LibreNMS server) to LibreNMS Monitoring:

    • Open the LibreNMS web interface.
    • Navigate to Devices > Add Device.
    • Hostname: localhost (LibreNMS will resolve this to 127.0.0.1 for polling).
    • SNMP Version: v2c.
    • SNMP Community: librenmscommunity (or the community string you configured for snmpd on the LibreNMS server).
    • Leave other fields at their default values unless you know you need to change them.
    • Click Add Device.
  3. Observe Device Discovery and Polling:

    • You should be redirected to the device page for localhost. Initially, it might say "Device added, awaiting first poll."
    • Wait a few minutes (up to 5-10 minutes for the first poll and discovery to complete).
    • Refresh the page. You should start seeing information populate:
      • OS: (e.g., Linux)
      • Hardware: (e.g., KVM, VMware Virtual Platform, or physical hardware info)
      • System Uptime
      • CPU, Memory, Storage graphs should begin to appear.
    • Navigate to the Graphs tab for localhost. You should see various graphs. If they say "No data", wait a bit longer. Polling happens every 5 minutes by default.
    • Check the Event Log (Logs > Event Log, filter by localhost if needed). You should see entries indicating successful polling.
  4. (Optional) Add Another Linux Server:

    • If you have another Linux VM or physical server:
      • Install and configure snmpd on it as described in "Configuring SNMP on a Linux Host."
        • Use a unique community string (e.g., anotherlinuxsecret).
        • Ensure the agentaddress allows connections from your LibreNMS server IP.
        • Configure the firewall on that Linux server to allow UDP 161 from the LibreNMS server IP.
      • Test with snmpwalk from the LibreNMS server's command line to this new Linux server's IP:
        snmpwalk -v2c -c anotherlinuxsecret other_linux_server_ip system
        
      • If snmpwalk works, add it in LibreNMS (Devices > Add Device) using its IP address and the community string anotherlinuxsecret.
      • Observe it being polled.

Verification:

  • The localhost device (and any other Linux server you added) should show an "Up" status in LibreNMS.
  • You should see data populating in the overview and graphs sections for the device(s).
  • The Event Log should show successful polling events for the device(s).

Congratulations! You've successfully added your first device(s) to LibreNMS and are now collecting monitoring data. This is the foundational step for all further network monitoring activities. Explore the different tabs (Overview, Health, Graphs, Ports, etc.) for the device you added to see the wealth of information LibreNMS can collect.

Intermediate LibreNMS Management

With LibreNMS installed and your first devices added, it's time to delve deeper into its capabilities. This section covers navigating the interface effectively, managing your devices in more detail, setting up crucial alerts, and understanding how LibreNMS visualizes data. These skills will allow you to transform LibreNMS from a simple data collector into a proactive monitoring tool.

5. Exploring the LibreNMS Interface

The LibreNMS web interface is packed with information. Learning to navigate it efficiently is key to leveraging its power. The UI is generally intuitive, but knowing where to find specific details can save a lot of time.

Dashboard Overview

When you log in, you typically land on a dashboard. LibreNMS allows for multiple customizable dashboards.

  • Default Dashboard: The initial dashboard usually provides a high-level overview:
    • Device status counts (Up, Down, Ignored, Disabled).
    • Recent events from the Event Log.
    • System information about the LibreNMS server itself.
    • Graphs showing overall network traffic, top interfaces, etc. (if configured).
  • Creating Custom Dashboards:
    • Navigate to Overview > Dashboards > Manage Dashboards.
    • You can create new dashboards tailored to specific needs (e.g., a dashboard for critical servers, another for network core devices).
    • When viewing a dashboard, click the "Edit Dashboard" (pencil icon) to add, remove, or rearrange widgets.
    • Widgets: These are the building blocks of dashboards. LibreNMS offers a wide variety of widgets:
      • Device status summaries
      • Specific graphs from any device/port
      • Alert tables
      • Top X (e.g., top CPU users, top bandwidth consumers)
      • Maps (if locations are set)
      • Status indicators
      • And many more.
  • Switching Dashboards: Use the Overview > Dashboards menu to switch between your available dashboards.

Spend time experimenting with dashboards. A well-designed dashboard can give you an immediate snapshot of your network's health.

  • All Devices Page (Devices > All Devices):

    • This is your central list of all monitored devices.
    • It's a sortable and searchable table showing hostname, IP address, OS, uptime, status, and often some key performance indicators.
    • You can filter devices by various criteria (e.g., OS, device group, hardware type).
    • Clicking on a device hostname takes you to its individual Device Overview page.
  • Device Overview Page:

    • This page provides a comprehensive summary for a single device.
    • Header: Shows device name, IP, sysName, status, and icons for various actions (Edit, Delete, Rediscover, Graphs, Alerts, etc.).
    • Tabs: The information is organized into tabs:
      • Overview: General information, key metrics (CPU, Memory, Storage), recent events for this device.
      • Health: Detailed health sensors (temperature, fans, power supplies, etc., if supported by the device).
      • Graphs: A collection of all standard and custom graphs for this device. You can select specific metrics and time ranges.
      • Ports: Lists all network interfaces on the device. For each port, you can see its status, speed, traffic counters, and access detailed graphs.
      • VLANs, VRFs, IP Addresses: Network-specific information.
      • Routing: BGP, OSPF, EIGRP information if the device is a router and these protocols are polled.
      • Inventory: Hardware and software inventory details if supported (e.g., serial numbers, firmware versions).
      • Wireless: Information for wireless access points or controllers.
      • Load Balancer: Data for load balancing services.
      • Applications: Data from application-specific pollers (e.g., Apache, MySQL).
      • Logs: Event log, alert log, and syslog specific to this device.
      • Alerts: Current active alerts and alert history for this device.
      • Edit: Takes you to the device settings page where you can change SNMP community, polling settings, assign to groups, etc.
  • Ports Page (Networking > Ports):

    • Lists all monitored interfaces across all devices.
    • Searchable and filterable (e.g., show all down ports, show all 10Gbps ports).
    • Clicking a port name takes you to its detailed page with graphs, statistics, and settings.
  • Graphs:

    • Graphs are central to LibreNMS. They visualize time-series data stored in RRD files.
    • Accessible from device pages, port pages, or directly via Graphs in the main menu (for aggregated or specific graphs).
    • Time Range Selection: Most graphs allow you to select the time period (e.g., last hour, day, week, month, year, custom range).
    • Zoom and Pan: Interactive graphs often allow zooming into specific periods.
    • Data Aggregation: For longer time periods, RRDtool shows consolidated data (e.g., averages over longer intervals instead of raw 5-minute samples). Each RRD file has a fixed size, so older data is kept only at coarser resolution. This is why graphs look "smoother" over longer periods and short spikes can flatten out.
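
To see why consolidated graphs look smoother, here is a minimal sketch of averaging consecutive samples into coarser buckets, which is the idea behind RRDtool's AVERAGE consolidation. The consolidate function and the sample values are invented for illustration; the archive settings LibreNMS actually uses are not shown here.

```bash
# Average every N consecutive samples read from stdin (one number per line).
# A trailing partial bucket is averaged over the samples it actually contains.
consolidate() {
    awk -v n="$1" '
        { sum += $1; count++ }
        count == n { printf "%.2f\n", sum / count; sum = 0; count = 0 }
        END { if (count > 0) printf "%.2f\n", sum / count }
    '
}

# Six 5-minute samples consolidated into two 15-minute averages (N=3).
printf '%s\n' 10 20 30 100 0 50 | consolidate 3
```

The second bucket contains a spike of 100, but its average is only 50, which is how short peaks get flattened in long-range graphs.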

Logs and Event Management

LibreNMS maintains several logs that are crucial for understanding system behavior and troubleshooting.

  • Event Log (Logs > Event Log):

    • This is a primary log for significant occurrences.
    • Records events like:
      • Device up/down status changes.
      • Port up/down status changes.
      • Alerts being triggered or cleared.
      • Device discovery and polling successes/failures.
      • Configuration changes.
    • Filterable by device, event type, severity, and time range.
    • Regularly reviewing the Event Log is good practice.
  • Alert Log (Logs > Alert Log):

    • Specifically lists triggered alerts and their history.
    • Shows when an alert was raised, acknowledged, and cleared.
  • Syslog (Logs > Syslog):

    • If you configure devices to send syslog messages to LibreNMS, they will appear here.
    • LibreNMS can parse these logs and even trigger alerts based on syslog content.
    • Requires enabling the syslog receiver in LibreNMS global settings and configuring devices.
  • LibreNMS Log (/opt/librenms/logs/librenms.log on the server):

    • This is the application log file. It contains detailed debugging information, errors, and operational messages from the LibreNMS backend processes (poller, discovery, web UI).
    • Accessible via the command line on the server or sometimes through a "LibreNMS Log" viewer in the UI (Logs > LibreNMS Log if enabled and accessible by the web user). This log is invaluable for deep troubleshooting.

Understanding Polling and Discovery

These are two fundamental background processes in LibreNMS.

  • Polling (poller.php):

    • Responsible for actively collecting data from already known devices.
    • Runs at regular intervals (default: every 5 minutes) via a cron job.
    • For each device, it queries configured SNMP OIDs and other data sources.
    • Updates RRD files with new data points.
    • Checks device and port statuses.
    • Evaluates alert rules based on the newly collected data.
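    • You can see poller performance statistics on the Poller Performance page (in recent versions it is reached from the gear icon under Pollers > Performance; the menu location varies between versions). This helps identify slow-polling devices or overloaded pollers.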
    • A single poller run should ideally complete within the polling interval (e.g., under 5 minutes). If it takes longer, you might need to optimize, add more pollers (distributed polling), or reduce the number of polled metrics.
  • Discovery (discovery.php):

    • Responsible for finding new devices and updating information about existing ones.
    • Auto-discovery: Can scan network ranges (defined in global settings or via config.php) for new SNMP-enabled devices.
    • Neighbor Discovery: Uses protocols like CDP, LLDP, OSPF, BGP to find connected devices based on information from already monitored devices.
    • Updates: For existing devices, discovery checks for new interfaces, changes in system description, new hardware sensors, etc.
    • Runs less frequently than the poller (default: every 6 hours, but this can be configured).
    • You can manually trigger discovery for a specific device from its "Edit" page.
    • Newly discovered devices are typically added automatically if auto-discovery is enabled and they respond to configured SNMP communities.

Understanding the distinction and timing of these processes is key to knowing when to expect new data or device updates.
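
As a quick illustration of the "poller run must fit inside the polling interval" rule, the sketch below compares a measured poll duration with the interval. poll_headroom is a hypothetical helper; it works on whole seconds and defaults to the 300-second interval mentioned above.

```bash
# Report how much of the polling interval a poller run consumed.
# Usage: poll_headroom DURATION_SECONDS [INTERVAL_SECONDS]   (interval defaults to 300)
poll_headroom() {
    duration=$1
    interval=${2:-300}
    percent=$(( duration * 100 / interval ))   # integer percentage, rounded down
    if [ "$duration" -ge "$interval" ]; then
        echo "OVERRUN: ${duration}s of ${interval}s (${percent}%)"
    else
        echo "ok: ${duration}s of ${interval}s (${percent}%)"
    fi
}

poll_headroom 45     # a run that fits comfortably
poll_headroom 320    # a run that exceeds the 5-minute interval
```

You would feed it the poll duration reported by LibreNMS for a device or for the whole poller run.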

Workshop Navigating and Understanding LibreNMS UI

Objective: To become comfortable navigating the LibreNMS web interface, finding key information about devices, and understanding where to look for logs and system health.

Prerequisites:

  • A running LibreNMS instance with at least one device monitored (e.g., localhost from the previous workshop).
  • Admin access to LibreNMS.

Tasks:

  1. Explore the Dashboard:

    • Log in to LibreNMS. Observe the default dashboard.
    • Identify the "Device Summary" widget. What does it tell you?
    • Locate the "Event Log" widget (if present). What kind of events do you see?
    • Try to customize the dashboard:
      • Click the Pencil Icon (Edit Dashboard) at the top right of the dashboard content area.
      • Click Add Widget. Browse the available widgets.
      • Add a "Device CPU Usage" graph widget. You might need to select your localhost device.
      • Try adding a "Clock" widget.
      • Drag and drop widgets to rearrange them.
      • Click Stop Editing (Floppy Disk/Save icon or similar).
  2. Navigate to Device Details:

    • Go to Devices > All Devices.
    • Click on your localhost device (or any other device you have added).
    • On the Device Overview page for localhost:
      • Identify the OS, Hardware, Uptime.
      • Look at the CPU, Memory, and Storage sections.
    • Click the Graphs tab.
      • Find the "CPU Usage" graph. Change the time range (e.g., "Last 6 hours", "Last 24 hours").
      • Find a network interface graph (e.g., lo or your main ethernet interface). Observe the traffic.
    • Click the Ports tab.
      • Identify the network interfaces listed. Click on one (e.g., eth0 or ens18 or lo).
      • Review the port details and its specific graphs.
    • Explore other tabs like Health (if your device has sensors like temperature) and Inventory.
  3. Examine Logs:

    • Go to Logs > Event Log.
      • Look for events related to localhost being polled or discovered.
      • Try to filter the log by "Device" and select localhost.
      • Note the timestamps and messages.
    • Go to Logs > Alert Log. This will likely be empty unless you've configured alerts and one has triggered.
    • (If you configured syslog forwarding to LibreNMS and your device is sending syslogs) Go to Logs > Syslog and see if any messages appear.
  4. Check Poller and Discovery Information:

    • Go to the Poller Performance page (in recent versions: gear icon > Pollers > Performance; the menu location varies between versions).
      • Observe the "Overall" poller statistics. Note the "Last Polled" and "Poll Duration."
      • Look for your localhost device in the list of polled devices. How long did it take to poll?
    • Go to Global Settings (Gear Icon) > System > Poller > Discovery settings.
      • Note the "Discovery interval" (default is usually 21600 seconds = 6 hours).
      • This tells you how often LibreNMS checks for new devices or updates to existing ones.
    • Trigger a manual rediscovery for localhost:
      • Go to localhost device page.
      • Click the Cog icon (Edit) on the device header (or find "Edit" in a menu).
      • Under Device Settings, look for a Rediscover device button or link. Its location varies between versions; some versions also offer a "Capture" section that runs discovery in debug mode and shows the output.
      • Alternatively, from Devices > All Devices, you can select the device and choose "Rediscover" from an action menu.
      • After triggering, check the Event Log for discovery messages related to localhost.
  5. Use the Search Function:

    • At the top of the LibreNMS interface, there's a search bar.
    • Try searching for localhost. It should lead you to the device page.
    • If you had many devices, you could search by IP, hostname, or even parts of sysDescr.
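
The manual rediscovery from Task 4 can also be started from the server's command line with discovery.php and its -h option. The wrapper below is a sketch: rediscover, DRY_RUN, and LIBRENMS_DIR are illustrative names, and by default it only prints the command so you can check it before running anything.

```bash
# Run (or, by default, only print) a discovery pass for one device.
# LIBRENMS_DIR and DRY_RUN are illustrative variables, not LibreNMS settings.
LIBRENMS_DIR=${LIBRENMS_DIR:-/opt/librenms}
DRY_RUN=${DRY_RUN:-1}

rediscover() {
    host=${1:-}
    if [ -z "$host" ]; then
        echo "usage: rediscover HOSTNAME" >&2
        return 1
    fi
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: show the command instead of executing it.
        echo "sudo -u librenms $LIBRENMS_DIR/discovery.php -h $host"
    else
        sudo -u librenms "$LIBRENMS_DIR/discovery.php" -h "$host"
    fi
}

rediscover localhost
```

Set DRY_RUN=0 to actually run discovery, then check the Event Log as in Task 4.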

Deliverables/Reflection:

  • A custom dashboard with at least two new widgets.
  • Ability to locate CPU and network traffic graphs for a specific device and change their time range.
  • Understanding of where to find the Event Log and basic poller performance information.
  • Successful manual rediscovery of a device.

This workshop aims to build your confidence in using the LibreNMS UI. The more familiar you are with it, the quicker you can diagnose issues and extract valuable insights.

6. Device Management

As your monitored environment grows, effective device management becomes crucial. This involves organizing devices, customizing their polling behavior, managing credentials securely, and efficiently adding devices in bulk.

Device Groups

Device groups allow you to categorize devices based on various criteria, such as location, function, OS type, or customer. This is useful for:

  • Organization: Keeping your device list tidy and manageable.
  • Permissions: Granting users access to specific groups of devices.
  • Targeted Alerting: Creating alert rules that apply only to devices within a particular group.
  • Reporting: Generating reports for specific device sets.
  • Dashboard Filtering: Creating dashboards that show data only from certain groups.

Creating and Managing Device Groups:

  1. Navigate to Devices > Groups > Add Group.
  2. Name: Give the group a descriptive name (e.g., "Critical Servers," "Core Routers," "London Office").
  3. Description: Optional, provide more details about the group.
  4. Type (Dynamic Groups - Optional but Powerful):
    • Static Groups: You manually assign devices to these groups.
    • Dynamic Groups: Devices are automatically assigned to these groups based on rules you define. This is very powerful.
      • Click "Add rule".
      • Select a device attribute (e.g., sysName, os, location, hardware).
      • Choose a condition (e.g., contains, equals, matches regex).
      • Enter a value.
      • Example: Create a dynamic group "Linux Servers" with a rule: os equals linux. All devices identified with the OS "linux" will automatically be added to this group.
      • Example: Create a dynamic group "Cisco Routers" with rules: hardware contains cisco AND sysDescr contains IOS.
  5. Click Add Group.
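
To make the rule logic concrete, here is a small sketch of how an "attribute, condition, value" rule could be evaluated. It illustrates the semantics only and is not how LibreNMS implements dynamic groups; rule_matches and the sample device values are hypothetical, and the case-insensitive comparison is a choice made for this example.

```bash
# Evaluate one rule: rule_matches ATTRIBUTE_VALUE CONDITION WANTED
# Returns 0 on match, 1 on no match, 2 on an unknown condition.
rule_matches() {
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    wanted=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')
    case "$2" in
        equals)   [ "$value" = "$wanted" ] ;;
        contains) case "$value" in *"$wanted"*) return 0 ;; *) return 1 ;; esac ;;
        regex)    printf '%s\n' "$1" | grep -Eqi -- "$3" ;;
        *)        echo "unknown condition: $2" >&2; return 2 ;;
    esac
}

# "Cisco Routers": hardware contains cisco AND sysDescr contains IOS.
hardware="Cisco ISR4331"
sysdescr="Cisco IOS Software, ISR Software"
if rule_matches "$hardware" contains cisco && rule_matches "$sysdescr" contains IOS; then
    echo "device belongs to: Cisco Routers"
else
    echo "device does not match"
fi
```

Chaining the two checks with && mirrors the AND between the two rules in the "Cisco Routers" example.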

Assigning Devices to Static Groups:

  • Go to the device's Edit page.
  • Find the "Groups" section.
  • Select the desired group(s) from the list and save.

Viewing Devices by Group:

  • Go to Devices > Groups. Click on a group name to see its members.
  • The "All Devices" page can often be filtered by group.

Customizing Device Polling Settings

While global polling settings apply by default, you can override many of them on a per-device basis.

  1. Navigate to the device's Edit page.
  2. Go to the SNMP tab (or similar, depending on LibreNMS version structure):
    • Override SysName: Use a custom name for the device in LibreNMS instead of the SNMP sysName.
    • SNMP Version, Port, Timeout, Retries: You can change these if this specific device uses different SNMP parameters than the global defaults.
    • SNMP Community/Auth Details: Update credentials if they change for this device.
  3. Go to the Modules or Polling tab:
    • Enable/Disable specific poller modules: For example, if a server doesn't have BGP configured, you can disable the BGP poller module for that device to save polling time. Conversely, you might enable a specific application poller module only for relevant devices.
    • Polling Interval: The main 5-minute polling interval is system-wide (set through cron and the LibreNMS configuration) and is generally not changed per device. What you control here is which poller modules run for this device.
    • Discovery and Poller Debugging: Options to run discovery or poller for this device immediately and see the output.

Important considerations:

  • Polling Load: Disabling unnecessary modules for devices reduces the load on the poller and the device itself.
  • Consistency: While per-device customization is powerful, strive for consistency where possible to simplify management. Use device groups and dynamic rules to manage module assignments if patterns exist.

Managing Device Credentials

Securely managing SNMP community strings and SNMPv3 credentials is vital.

  • SNMPv2c Community Strings:
    • Uniqueness: Use unique community strings for different sets of devices or security zones. Avoid using "public" or "private" in production.
    • Access Control: On the devices themselves, configure SNMP ACLs to restrict access to only the LibreNMS poller IP(s).
    • LibreNMS Configuration: When adding/editing a device, you enter the community string. These are stored in the LibreNMS database.
  • SNMPv3 Credentials:
    • These include username, authentication protocol/password, and privacy protocol/password.
    • Offer significantly better security than v2c.
    • When adding/editing a device, select SNMPv3 and fill in all required fields.
  • Global SNMP Credentials:
    • In Global Settings > System > Poller > SNMP Settings, you can define a list of default SNMPv2c community strings and SNMPv3 credentials.
    • When LibreNMS discovers a new device, it will try these default credentials in order. This can simplify adding devices if you use a few standard credential sets.
    • However, for maximum security, it's often better to specify credentials explicitly when adding a device or use per-device settings.

Best Practices:

  • Prefer SNMPv3 where supported.
  • Use strong, unique community strings or SNMPv3 passphrases.
  • Regularly review and rotate credentials.
  • Limit SNMP access on devices to only the LibreNMS poller IP(s).
  • Ensure your LibreNMS server itself is well-secured, as it stores these credentials.
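
One simple way to get a strong, unique community string or passphrase is to generate it randomly instead of inventing one. The sketch below reads random bytes from /dev/urandom and prints them as hex; random_secret is a hypothetical helper name and the default length is an arbitrary choice.

```bash
# Generate a random hex string from the kernel's random source.
# Usage: random_secret [BYTES]   (default 16 bytes, i.e. 32 hex characters)
random_secret() {
    bytes=${1:-16}
    # od prints the bytes as hex pairs; tr strips the spaces and newlines.
    head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

random_secret
```

Hex output avoids characters that need quoting in snmpd.conf or on the command line.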

Bulk Device Import

Adding devices one by one via the UI is fine for a few devices, but for many devices, it's inefficient. LibreNMS offers ways to add devices in bulk:

  1. Using the addhost.php CLI script:

    • Located in /opt/librenms/addhost.php.
    • Allows you to add a device from the command line.
    • You can script this to add multiple devices from a list (e.g., a CSV file).
    • Example usage:
      # Run as librenms user or sudo
      cd /opt/librenms
      ./addhost.php <hostname_or_ip> [community] [v1|v2c|v3] [port] [udp|tcp]
      # Example for SNMPv2c
      sudo -u librenms ./addhost.php 192.168.1.200 MyCommunityv2c v2c
      
    • To bulk add, you could write a simple shell script:
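      #!/bin/bash
      # devices.csv format: hostname,community,version
      # e.g.:
      # server1.example.com,comm1,v2c
      # switch2.example.com,comm2,v2c
      
      INPUT_FILE="devices.csv"
      LIBRENMS_DIR="/opt/librenms"
      
      if [ ! -f "$INPUT_FILE" ]; then
          echo "Input file $INPUT_FILE not found!" >&2
          exit 1
      fi
      
      # '|| [ -n "$host" ]' also processes a final line that lacks a trailing newline.
      while IFS=, read -r host comm ver || [ -n "$host" ]; do
          # Skip blank lines and comment lines.
          case "$host" in ''|'#'*) continue ;; esac
          ver=${ver%$'\r'}   # tolerate CSV files saved with Windows line endings
          # The community string is deliberately not echoed to the terminal.
          echo "Adding device: $host (SNMP $ver)"
          # </dev/null keeps the command from consuming the rest of the CSV on stdin.
          sudo -u librenms "$LIBRENMS_DIR/addhost.php" "$host" "$comm" "$ver" </dev/null \
              || echo "Failed to add $host" >&2
      done < "$INPUT_FILE"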
      
  2. Auto-Discovery with Network Scanning:

    • Configure network ranges for discovery in Global Settings > System > Poller > Discovery settings > Networks to auto-discover.
    • Provide a list of SNMP community strings to try in Global Settings > System > Poller > SNMP Settings.
    • LibreNMS will scan these networks during its discovery cycle and attempt to add any responsive SNMP devices using the provided communities.
    • This is good for initial population but can be less controlled.
  3. Using the API:

    • LibreNMS has a powerful API that can be used to add devices programmatically. This is suitable for integration with automation tools or CMDBs.
    • Requires generating an API token and using tools like curl or scripting languages (Python, PowerShell) to make API calls. (More on API in Advanced section).

When importing in bulk, ensure SNMP is already configured on the target devices and they are reachable from the LibreNMS server.
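
Before a bulk import, it is worth checking the device list itself. The following sketch validates rows in the hostname,community,version layout used for the CSV example above; validate_devices_csv is a hypothetical helper and the sample rows are invented.

```bash
# Validate "hostname,community,version" rows read from stdin.
# Prints one line per problem and returns non-zero if any row is invalid.
validate_devices_csv() {
    awk -F, '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
        NF != 3 { printf "line %d: expected 3 fields, got %d\n", NR, NF; bad = 1; next }
        $1 == "" { printf "line %d: empty hostname\n", NR; bad = 1 }
        $3 !~ /^(v1|v2c|v3)$/ { printf "line %d: unsupported version \"%s\"\n", NR, $3; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Sample input with two deliberately broken rows.
printf '%s\n' \
    '# hostname,community,version' \
    'server1.example.com,comm1,v2c' \
    'switch2.example.com,comm2,v4' \
    'router3.example.com,comm3' | validate_devices_csv \
    || echo "fix the rows above before importing"
```

Against a real file you would run validate_devices_csv < devices.csv and only start the import if it exits with status 0.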

Workshop Organizing Devices with Groups

Objective: To create both static and dynamic device groups and assign devices to them. This workshop assumes you have at least two devices added to LibreNMS (e.g., localhost and one other Linux VM). If you only have one machine, you can try adding it a second time under a different hostname purely for grouping practice, though LibreNMS might reject or de-duplicate the second entry. For better effect, use devices with slightly different characteristics (e.g., one identified as 'Linux' and another whose OS field you manually edit for this exercise).

Prerequisites:

  • LibreNMS running with at least two devices. If you only have localhost, consider adding it again using 127.0.0.1 as the hostname so you have two entries to play with for grouping.
    • Device 1: localhost (OS: Linux)
    • Device 2: (Optional) Another Linux VM, or if not available, add 127.0.0.1 as a new device. You can edit this device's OS manually via Device Edit > Misc > Override OS to something like "Generic Device" or "TestOS" for the sake of creating different dynamic group conditions.

Tasks:

  1. Create a Static Device Group:

    • Navigate to Devices > Groups.
    • Click Add Group.
    • Name: Lab Servers
    • Description: Servers used in the test lab environment.
    • Leave "Rules for dynamic group" empty.
    • Click Add Group.
  2. Manually Assign a Device to the Static Group:

    • Go to Devices > All Devices.
    • Click on your localhost device.
    • In the device header, click the Cog icon (Edit).
    • Scroll down to the Groups section (or a similar section for group assignment).
    • You should see "Lab Servers" in the list of available groups. Select it (it might be a multi-select box or checkboxes).
    • Click Save Changes (or "Update Device").
    • Verify: Go back to Devices > Groups, click on "Lab Servers". localhost should be listed as a member.
  3. Create a Dynamic Device Group for "Linux Devices":

    • Navigate to Devices > Groups.
    • Click Add Group.
    • Name: All Linux Machines
    • Description: Automatically groups all devices running Linux OS.
    • In the "Rules for dynamic group" section:
      • Click Add rule.
      • First dropdown: Device Attribute -> Select os.
      • Second dropdown: Condition -> Select equals.
      • Text field: Value -> Type linux (lowercase, as LibreNMS usually standardizes OS names).
    • Click Add Group.
    • Verification:
      • Go back to Devices > Groups. Click on "All Linux Machines".
      • Any device LibreNMS has identified with the OS "linux" (like your localhost) should automatically appear in this group. If it does not show up right away, wait for the next poller or discovery run, which updates group memberships, or trigger a rediscovery of the device.
  4. Create another Dynamic Device Group (Example: Based on Hostname):

    • Navigate to Devices > Groups.
    • Click Add Group.
    • Name: Localhost Devices
    • Description: Devices with 'local' in their hostname.
    • In the "Rules for dynamic group" section:
      • Click Add rule.
      • Device Attribute -> Select sysName (or hostname if available, sysName is generally more reliable from SNMP).
      • Condition -> Select contains.
      • Value -> Type local.
    • Click Add Group.
    • Verification: Your localhost device should appear in this group. If you added 127.0.0.1 and its sysName is also 'localhost', it should appear too.
  5. Explore Device Group Usage:

    • Go to Devices > All Devices.
    • Look for a filter option for "Group." Try filtering by "Lab Servers" and then by "All Linux Machines."
    • Consider how these groups could be used in dashboard widgets (many widgets allow filtering by device group) or in alert rules.

Deliverables/Reflection:

  • At least one static device group created with a device manually assigned.
  • At least one dynamic device group created (e.g., "All Linux Machines") that automatically populates based on device attributes.
  • Understanding of how to view devices within a group and filter by group.

Device groups are a powerful organizational tool. Using dynamic groups effectively can save a lot of manual effort as your LibreNMS deployment grows.
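
To make the difference between the two dynamic rules concrete, here is a minimal Python sketch of how an "equals" rule and a "contains" rule select devices. It is only a model for reasoning about rules, not how LibreNMS implements groups internally; the Device class, the matches and group_members helpers, and the sample devices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    hostname: str
    sysName: str
    os: str


def matches(device, attribute, condition, value):
    """Evaluate one rule of the form (attribute, condition, value)."""
    actual = getattr(device, attribute)
    if condition == "equals":
        return actual == value
    if condition == "contains":
        return value in actual
    raise ValueError(f"unsupported condition: {condition}")


def group_members(devices, rule):
    """Return the hostnames of all devices that satisfy the rule."""
    return [d.hostname for d in devices if matches(d, *rule)]


# Hypothetical inventory, for illustration only.
devices = [
    Device("localhost", "localhost", "linux"),
    Device("core-sw1", "core-sw1.example.net", "ios"),
    Device("web01", "web01.example.net", "linux"),
]

linux_group = group_members(devices, ("os", "equals", "linux"))
local_group = group_members(devices, ("sysName", "contains", "local"))
print(linux_group)  # ['localhost', 'web01']
print(local_group)  # ['localhost']
```

Because membership is computed from attributes, a newly added Linux server would join the first group without any manual assignment, which is the main advantage over static groups.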

7. Alerting and Notifications

Monitoring data is useful, but its true power comes from proactive alerting when things go wrong. LibreNMS has a flexible and powerful alerting system that can notify you of issues before they escalate or affect users.

Understanding Alert Rules

Alert rules are sets of conditions that, when met, trigger an alert. LibreNMS evaluates these rules against the data collected by pollers.

  • Key Components of an Alert Rule:

    • Name: A descriptive name for the alert rule (e.g., "Server Down," "High CPU Usage Critical," "Interface Errors").
    • Severity: (Optional) Can categorize alerts (e.g., Critical, Warning, Info).
    • Device/Entity Association: Rules can be associated with:
      • All devices.
      • Specific device groups.
      • Specific devices.
      • Specific entities on devices (e.g., a particular port, sensor, application).
    • Conditions: The core logic. A rule triggers if "all" conditions are met OR if "any" condition is met (configurable).
      • Each condition checks a metric (e.g., devices.status, processors.processor_usage, mempools.mempool_perc, ports.ifInErrors_rate).
      • Against a comparison operator (e.g., equals, greater than, less than, matches regex, not equal to).
      • And a value.
    • Delay: How long the condition must be true before an alert is triggered (e.g., trigger if a device is down for 10 minutes, not just a brief blip). This helps reduce alert noise from transient issues.
    • Interval: How often to re-notify if the alert condition persists (e.g., remind every 1 hour).
    • Alert Transports: Where to send the notification (e.g., email, Slack, Telegram).
  • Types of Alert Checks: LibreNMS comes with a rich set of built-in alert check types covering various aspects:

    • Device Status: macros.device_down (true when devices.status is 0), and reboot detection based on devices.uptime.
    • Sensor Data: sensors.sensor_value (for temperature, humidity, voltage, etc.).
    • Resource Usage: CPU (processors.processor_usage), memory (mempools.mempool_perc), storage (storage.storage_perc).
    • Port/Interface Metrics: ports.ifOperStatus (port down), ports.ifInErrors_rate (input errors), ports.ifOutErrors_rate (output errors), ports.ifInOctets_rate and ports.ifOutOctets_rate (traffic rate).
    • BGP Sessions: bgpPeers.bgpPeerAdminStatus.
    • Applications: Specific metrics for monitored applications.
    • Syslog: syslog.pri or syslog.msg (trigger alerts based on syslog messages).
    • And many more. You can see available metrics when building a rule.
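
The interplay of conditions, the all/any switch, and the delay can be sketched in a few lines of Python. This is a simplified model under stated assumptions: LibreNMS evaluates rules against its database rather than in this way, and the function names, the operator labels, and the sample history below are hypothetical.

```python
import operator

# Comparison operators a condition may use (labels are illustrative).
OPERATORS = {
    "equals": operator.eq,
    "not_equal": operator.ne,
    "greater": operator.gt,
    "less": operator.lt,
}


def rule_matches(sample, conditions, match="all"):
    """Check one polled sample against a list of (metric, op, value) conditions."""
    results = [OPERATORS[op](sample[metric], value) for metric, op, value in conditions]
    return all(results) if match == "all" else any(results)


def first_alert_time(history, conditions, match="all", delay=0):
    """Return the timestamp at which the rule has matched continuously for `delay` seconds."""
    streak_start = None
    for timestamp, sample in history:
        if rule_matches(sample, conditions, match):
            if streak_start is None:
                streak_start = timestamp
            if timestamp - streak_start >= delay:
                return timestamp
        else:
            streak_start = None  # a single good sample resets the streak
    return None


# Hypothetical CPU readings, one per 5-minute poll (timestamps in seconds).
history = [
    (0, {"processors.processor_usage": 95, "devices.status": 1}),
    (300, {"processors.processor_usage": 50, "devices.status": 1}),
    (600, {"processors.processor_usage": 92, "devices.status": 1}),
    (900, {"processors.processor_usage": 97, "devices.status": 1}),
    (1200, {"processors.processor_usage": 99, "devices.status": 1}),
]
conditions = [
    ("processors.processor_usage", "greater", 90),
    ("devices.status", "equals", 1),
]

immediate = first_alert_time(history, conditions, "all", delay=0)
delayed = first_alert_time(history, conditions, "all", delay=600)
print(immediate)  # 0
print(delayed)    # 1200
```

With no delay the brief spike at the first poll already fires the alert; with a 10-minute delay only the sustained load from 600 s onward does, which is exactly the noise reduction the Delay setting is meant to provide.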

Creating Basic Alert Rules

Let's walk through creating a common alert rule, for example, a "Device Down" alert.

  1. Navigate to Alerts > Alert Rules.
  2. Click Create alert rule.
  3. You'll be presented with options:
    • "From an existing template": LibreNMS provides several useful pre-built templates. This is often the easiest way to start.
    • "From scratch based on a device metric": If you want to build a custom rule not covered by templates.

Example: Creating a "Device Down" Alert from Scratch (or by adapting a template):

  • Rule Name: Device Unreachable (Ping/SNMP)
  • Severity (Optional): Critical
  • Query builder / Advanced (Manual query): The query builder is more user-friendly.
  • Device Association:
    • "Associate with:" -> All Devices (or select a specific Device Group like "Critical Servers").
  • Conditions:
    • Match all rules (AND) or Match any rule (OR). For device down, AND is typical if you check multiple things.
    • Rule 1:
      • Metric: devices.status (This indicates overall device reachability; it is numeric, with 1 meaning up and 0 meaning down)
      • Condition: equals
      • Value: 0
    • (Alternatively, many use macros.device_down which is a pre-defined condition that is true if the device is considered down by LibreNMS.)
      • Metric: macros.device_down
      • Condition: equals
      • Value: 1 (or true)
  • Alert if condition is true for (Delay): 10 minutes (This means the device must be down for 10 consecutive minutes before the alert fires. Adjust as needed. 0 means alert immediately).
  • Send recovery notification: Check this if you want a notification when the device comes back up.
  • Max alerts / Reminders:
    • Max alerts to send: (e.g., 5, then stop sending for this occurrence until it recovers).
    • Remind after X alerts: (e.g., remind every 1 alert, meaning send every interval).
    • Interval between reminders: 1 hour (If the device is still down, send another notification every hour).
  • Transports: Select the notification transport (e.g., "default_email" or a specific transport you've configured).
  • Enable rule: Ensure this is checked.
  • Click Save Rule.

LibreNMS comes with many default alert rule templates that you can enable and customize. Look under Alerts > Alert Rules > Create alert rule > From an existing template. Examples:

  • Device down
  • Port down
  • High CPU/Memory/Storage
  • High Temperature
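
To see how the delay, the reminder interval, and the maximum alert count combine, the following Python sketch lists the times at which notifications would go out for one outage. It is a simplified model, assuming notifications are sent exactly on schedule (in practice they are sent when alert processing next runs); the function name and parameters are hypothetical.

```python
def notification_times(fault_start, delay, interval, max_alerts, recovered_at=None):
    """List the send times (seconds) for one continuous fault.

    fault_start  -- when the condition first became true
    delay        -- how long the condition must hold before the first alert
    interval     -- time between reminders while the fault persists
    max_alerts   -- stop after this many notifications
    recovered_at -- when the fault cleared, or None if it is still active
    """
    times = []
    t = fault_start + delay
    while len(times) < max_alerts and (recovered_at is None or t < recovered_at):
        times.append(t)
        t += interval
    return times


# Values from the example rule: 10-minute delay, hourly reminders, at most 5 alerts.
ongoing = notification_times(0, delay=600, interval=3600, max_alerts=5)
recovered = notification_times(0, delay=600, interval=3600, max_alerts=5, recovered_at=9000)
print(ongoing)    # [600, 4200, 7800, 11400, 15000]
print(recovered)  # [600, 4200, 7800]
```

A fault that clears before the delay has elapsed produces an empty list, which is why short blips never reach your inbox.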

Configuring Alert Transports

Alert transports define how you receive notifications. LibreNMS supports numerous transports.

  1. Navigate to Alerts > Transports.
  2. Click Create alert transport.
  3. Transport Name: A descriptive name (e.g., "Admins Email Group," "Network Team Slack").
  4. Transport Type: Select the desired method from the dropdown:
    • Mail: Standard email.
      • Default contact: If checked, uses the default email from global settings.
      • Email address(es): Enter one or more email addresses, comma-separated.
    • API: Send a POST/GET request to a custom API endpoint.
    • Boxcar, Discord, Flock, HipChat, IRC, Matrix, Microsoft Teams, PagerDuty, Pushover, Rocket.Chat, Slack, Syslog, Telegram, Twilio, Zammad, Opsgenie, VictorOps, Webhook, etc.
  5. Configuration Fields: Each transport type will have specific fields:
    • Mail: Usually just the email address(es). Ensure your server is configured to send mail (e.g., Postfix installed and configured, or using an SMTP relay in config.php).
    • Slack: Webhook URL, Channel, Icon Emoji, Bot Name.
    • Telegram: Bot API Token, Chat ID.
    • Follow the instructions provided for each transport type, which often involve getting API keys or webhook URLs from the respective services.
  6. Default Transport: You can mark one transport as the "default." If an alert rule doesn't specify a transport, it will use the default.
  7. Click Save Transport.

Testing Transports:

  • After creating a transport, there's usually a Test Transport button. Use it!
  • For email, ensure your LibreNMS server can send emails. This might involve installing and configuring a local MTA like Postfix or Exim, or pointing LibreNMS at an external SMTP server in config.php.
    • Example config.php for SMTP relay:
      // Mail configuration
      $config['email_backend']                        = 'smtp'; // mail, sendmail, smtp
      $config['email_from']                           = 'librenms@yourdomain.com';
      $config['email_smtp_host']                      = 'smtp.example.com';
      $config['email_smtp_port']                      = 587;    // 25, 465, 587
      $config['email_smtp_timeout']                   = 10;
      $config['email_smtp_secure']                    = 'tls';  // '', ssl, tls
      $config['email_smtp_auth']                      = true;
      $config['email_smtp_username']                  = 'smtp_user@example.com';
      $config['email_smtp_password']                  = 'smtp_password';
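
If the test email never arrives, it can help to check the SMTP relay independently of LibreNMS. The following Python sketch uses only the standard library (smtplib and email.message) to send one message over STARTTLS with the same kind of settings as above. The function names are hypothetical, and the host, port, addresses, and credentials are placeholders you would replace with your own.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Create a small plain-text message for the relay test."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "SMTP relay test for LibreNMS alerts"
    msg.set_content("If you can read this, the relay accepted the credentials.")
    return msg


def send_via_starttls(msg, host, port, username, password, timeout=10):
    """Connect, upgrade to TLS, authenticate, and send the message."""
    with smtplib.SMTP(host, port, timeout=timeout) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)


if __name__ == "__main__":
    message = build_test_message("librenms@yourdomain.com", "you@example.com")
    print(message)
    # Uncomment after filling in real values (placeholders shown):
    # send_via_starttls(message, "smtp.example.com", 587,
    #                   "smtp_user@example.com", "smtp_password")
```

If this script fails with an authentication or connection error, the problem lies with the relay or the credentials rather than with LibreNMS itself.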
      

Alert Templates and Customization

Alert notifications can be customized using templates. These templates define the format and content of the message sent by transports.

  1. Navigate to Alerts > Templates.
  2. Click Create template.
  3. Name: A name for your template (e.g., "Detailed Device Down Email").
  4. Template Type: Recovery (for recovery messages) or Issue (for alert messages).
  5. Content: This is where you design the message body using HTML (for email) or plain text, along with special variables that LibreNMS will replace with actual data.
    • LibreNMS uses the Blade templating engine (or a similar syntax).
    • Common Variables:
      • {{ $alert->title }}: The title of the alert.
      • {{ $alert->hostname }}: The hostname of the device triggering the alert.
      • {{ $alert->state }}: The numeric alert state (0 = recovered, 1 = alerting, 2 = acknowledged). Test it with @if to switch between "ALERT" and "RECOVERY" wording.
      • {{ $alert->name }}: Name of the alert rule.
      • {{ $alert->rule }}: The rule definition, i.e., the conditions that were evaluated.
      • {{ $alert->severity }}: Severity of the alert.
      • {{ $alert->timestamp }}: When the alert was triggered.
      • {{ $alert->message }}: A pre-formatted message from the alert check.
      • {{ $alert->location }}: Device location.
      • $alert->faults: An array of fault data (e.g., for port down, this would list the affected ports). Each fault is itself an array, so read its fields with array syntax when you loop through it:
        @foreach ($alert->faults as $key => $value)
        Port: {{ $value['ifName'] }} ({{ $value['ifDescr'] }}) is {{ $value['ifOperStatus'] }}
        @endforeach
        
    • You can see a list of available variables when editing/creating a template, often in a sidebar or helper section.
  6. Assign to Alert Rules:
    • After creating a template, you can assign it to specific alert rules. When editing an alert rule, there's usually a section to select the "Alert Template" and "Recovery Template."
    • If no template is specified for a rule, a default system template is used.

Customizing templates allows you to provide more context, include links back to the device in LibreNMS, or format messages according to your organization's standards.

Workshop Setting Up Email Alerts for Down Devices

Objective: To configure an email transport and an alert rule to notify you via email when a monitored device goes down.

Prerequisites:

  • LibreNMS installed and monitoring at least one device (e.g., localhost).
  • Your LibreNMS server must be able to send emails. This is the trickiest prerequisite. You have a few options:
    1. Install and configure a local MTA (Mail Transfer Agent) like Postfix or Exim. This is a more involved server administration task. Postfix can be configured to send directly or relay through an external SMTP server.
    2. Use an external SMTP relay service (e.g., Gmail, SendGrid, Amazon SES, your ISP's SMTP server). This requires configuring SMTP settings in LibreNMS's config.php file.
      For this workshop, we'll assume you can get one of these options working. If setting up mail is too complex right now, you can still go through the steps of creating the transport and rule, but the test and actual notifications won't be delivered.
      A simplified Postfix setup for local relay only (mail to external addresses may be rejected or flagged as spam without further configuration such as SPF, DKIM, DMARC, and reverse DNS):
      # On your LibreNMS server (Ubuntu)
      sudo apt update
      sudo apt install -y postfix mailutils
      # During Postfix installation, select "Internet Site".
      # System mail name: use your server's FQDN or `localhost` if it's just for local.
      # Edit /etc/postfix/main.cf if needed, e.g., mynetworks to allow relay from localhost.
      # sudo systemctl restart postfix
      
      To use an external SMTP server such as Gmail, authenticate with an app password (this requires 2-Step Verification on the Google account; the old "less secure app access" option is no longer available). Edit /opt/librenms/config.php and add:
      <?php
      // Existing config...
      
      // Mail configuration
      $config['email_backend']                        = 'smtp';
      $config['email_from']                           = 'librenms-alerts@yourdomain.com'; // Use a real or desired from address
      $config['email_smtp_host']                      = 'smtp.gmail.com';
      $config['email_smtp_port']                      = 587;
      $config['email_smtp_timeout']                   = 10;
      $config['email_smtp_secure']                    = 'tls'; // For Gmail, STARTTLS
      $config['email_smtp_auth']                      = true;
      $config['email_smtp_username']                  = 'your_gmail_address@gmail.com';
      $config['email_smtp_password']                  = 'your_gmail_app_password';
      
      // ... any other custom config
      
      Replace the placeholders with your actual credentials and settings. If the change does not take effect, restart PHP-FPM (adjust the version number to match your installation): sudo systemctl restart php8.1-fpm

Tasks:

  1. Configure an Email Alert Transport:

    • In LibreNMS, navigate to Alerts > Transports.
    • Click Create alert transport.
    • Transport Name: Admin Email Alerts
    • Transport Type: Mail
    • Default transport: (Optional) Check if this should be your default.
    • Email address(es): Enter your personal email address where you want to receive test/actual alerts.
    • Click Save Transport.
    • After saving, you should see your new transport in the list. Click the Test Transport button (often a paper airplane icon or similar).
      • This will attempt to send a test email to the address you configured. Check your inbox (and spam folder).
      • If you don't receive it, troubleshoot your mail server configuration on the LibreNMS host or your SMTP settings in config.php. Check /opt/librenms/logs/librenms.log for errors related to mail sending.
  2. Enable/Create a "Device Down" Alert Rule:

    • Navigate to Alerts > Alert Rules.
    • Look for a pre-existing rule template named "Device Down" or "Devices up/down". If it exists:
      • Click the Toggle switch to enable it.
      • Click the Edit icon (pencil) to review and customize it.
    • If no suitable template exists, click Create alert rule.
      • Select "From an existing template" and look for "Devices up/down" or similar. If found, select it. This will pre-fill many fields.
      • If not, create from scratch:
        • Rule Name: Critical Device Down
        • Severity: Critical
        • Device Association: All Devices (or a specific group if you prefer).
        • Conditions:
          • Use Metric: macros.device_down, Condition: equals, Value: 1.
        • Alert if condition is true for (Delay): 1 minute (for faster testing; in production, use 5-15 minutes).
        • Send recovery notification: Check this box.
        • Max alerts / Reminders: Max alerts to send: 3. Interval between reminders: 10 minutes.
        • Transports: Select Admin Email Alerts (the transport you just created).
        • Enable rule: Ensure this is checked.
        • Click Save Rule.
  3. Test the Alert Rule:

    • This requires making a monitored device "go down."
    • Choose a test device: Your localhost device is a good candidate if it's the only one.
    • Simulate "Down" Status:
      • The easiest way to make localhost appear down to SNMP is to stop the snmpd service on the LibreNMS server itself.
        # On your LibreNMS server
        sudo systemctl stop snmpd
        
    • Wait for Polling and Alert Trigger:
      • LibreNMS polls every 5 minutes by default. The alert rule adds a 1-minute delay, and alert processing itself runs periodically, so after stopping snmpd you might need to wait several minutes (up to about 5-7) for the alert to trigger.
      • Monitor Logs > Event Log. You should see localhost being marked as down (SNMP unreachable).
      • Monitor Alerts > Alerts. A new alert for Critical Device Down for localhost should appear.
      • Check your email. You should receive an alert notification.
  4. Test Recovery Notification:

    • Once you've received the "down" alert, bring the device "back up":
      # On your LibreNMS server
      sudo systemctl start snmpd
      
    • Wait for Polling and Recovery:
      • Again, wait for the next polling cycle (up to 5 minutes).
      • Monitor Logs > Event Log. localhost should be marked as up again.
      • Monitor Alerts > Alerts. The active alert for localhost should clear (or move to history).
      • Check your email. You should receive a recovery notification if you enabled it.
  5. Review and Cleanup:

    • If testing was successful, remember to:
      • Adjust the "Device Down" alert rule delay to a more reasonable production value (e.g., 10 minutes or 15 minutes).
      • Ensure snmpd is running and enabled on any devices you want to monitor.
      • Review the alert emails. Are they clear? Do they contain the information you need? If not, consider customizing the alert template later.
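
The waiting time in the test above can be estimated with simple arithmetic. The sketch below is a rough model: it assumes the failure is only noticed at the next poll, that the rule delay starts counting from that poll, and that alert processing runs once a minute (the 60-second value is an assumption; check how your installation schedules it). The function name is hypothetical.

```python
def alert_latency_bounds(poll_interval, rule_delay, alert_run_interval):
    """Return (best_case, worst_case) seconds from failure to notification.

    Best case: the device fails just before a poll and alert processing
    runs right as the delay expires.
    Worst case: the device fails just after a poll, so a full poll interval
    passes before detection, then the delay, then up to one full
    alert-processing interval.
    """
    best = rule_delay
    worst = poll_interval + rule_delay + alert_run_interval
    return best, worst


best, worst = alert_latency_bounds(poll_interval=300, rule_delay=60, alert_run_interval=60)
print(f"best case:  {best} s ({best / 60:.0f} min)")
print(f"worst case: {worst} s ({worst / 60:.0f} min)")
```

With the workshop values this gives roughly 1 to 7 minutes. Raising the rule delay to a production value such as 10 minutes shifts both bounds by the same amount.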

Deliverables/Reflection:

  • A configured email alert transport that successfully sends test emails.
  • An active "Device Down" alert rule associated with the email transport.
  • Confirmation (via email and LibreNMS UI) that alerts are triggered when a device goes down and recovery notifications are sent when it comes back up.
  • Understanding of the delay involved in alert triggering due to polling intervals and rule delays.

This workshop provides a critical capability: getting notified when issues arise. You can adapt these steps to create alerts for many other conditions (high CPU, low disk space, port errors, etc.).

8. Graphing and Data Visualization

LibreNMS excels at collecting vast amounts of time-series data. Its graphing capabilities, primarily powered by RRDtool, allow you to visualize this data, identify trends, troubleshoot issues, and plan for capacity.

Standard Graphs (CPU, Memory, Traffic, etc.)

Out-of-the-box, LibreNMS automatically generates a wide range of standard graphs for most devices it supports, provided the device exposes the necessary data via SNMP.

  • Accessing Graphs:

    • Device Level: Navigate to a device's page, then click the Graphs tab. This shows all available graphs for that specific device.
    • Port Level: Navigate to a device's Ports tab, click on an interface. The port details page will have its own set of graphs (traffic, errors, discards, etc.).
    • Global Graphs: The main Graphs menu item provides access to aggregated graphs or specific graph types across all devices (e.g., "Top CPU Usage," "Total Bandwidth - All Ports").
  • Common Standard Graphs:

    • System Uptime: Shows how long the device has been running.
    • CPU Usage: Percentage of CPU utilization, often per core or an aggregate.
    • Memory Usage: RAM and swap utilization (e.g., total, used, free, cached, buffered). Represented as percentages or absolute values.
    • Storage Usage: Disk space utilization for each filesystem/partition (e.g., total, used, free, percentage used).
    • Network Interface Traffic (ifOctets or ifHCOctets): Bits per second (bps) or Bytes per second (Bps) for input and output traffic on each monitored port. "HC" (High Capacity) counters are 64-bit. A 32-bit octet counter wraps after 2^32 bytes (about 4.3 GB), which takes under six minutes at 100 Mbps, so with 5-minute polling the HC counters are essential on anything faster than roughly 100 Mbps.
    • Network Interface Errors/Discards: Counts or rates of input/output errors and discards on interfaces. High error rates can indicate physical layer problems, duplex mismatches, or congestion.
    • Ping Response Time: Round-trip time for ICMP pings to the device.
    • SNMP Response Time: How long it takes to get an SNMP response from the device.
    • Health Sensors: Temperature, fan speed, voltage, humidity, etc., if the device has these sensors and LibreNMS supports polling them.
    • Specific Application Metrics: If application polling modules are enabled (e.g., Apache requests/sec, MySQL queries/sec).
  • Graph Features:

    • Time Range Selection: Crucial for analysis. You can view data from the last hour up to several years (depending on RRD configuration and data retention). Common presets: 6h, 24h, 2d, 1w, 1m, 3m, 6m, 1y, 2y. Custom ranges are also possible.
    • Interactive Elements: Some graphs allow hovering over data points to see exact values and timestamps. Zooming capabilities are often present.
    • Data Aggregation: RRDtool uses Round Robin Archives (RRAs) to store data at different resolutions. For recent data (e.g., last 24 hours), you might see 5-minute averages. For older data (e.g., last year), you might see hourly or daily averages. This is how RRDtool keeps file sizes fixed and predictable.
    • Graph Types: Most are line graphs. Stacked graphs are used for components of a total (e.g., used/free memory). Area graphs are also common.
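
The need for 64-bit "HC" counters follows directly from how quickly an octet counter fills up. The Python sketch below works through that arithmetic; the function names are hypothetical, and the figures assume a link running at full line rate and the default 5-minute polling interval.

```python
def wrap_seconds(counter_bits, link_bps):
    """Seconds until an octet counter of the given width wraps at full line rate."""
    return (2 ** counter_bits) * 8 / link_bps


def max_safe_bps(counter_bits, poll_interval_s):
    """Highest sustained rate at which the counter wraps at most once per poll interval."""
    return (2 ** counter_bits) * 8 / poll_interval_s


print(f"32-bit @ 100 Mbps wraps after {wrap_seconds(32, 100e6):.1f} s")
print(f"32-bit @ 1 Gbps wraps after   {wrap_seconds(32, 1e9):.1f} s")
print(f"32-bit limit for 300 s polls: {max_safe_bps(32, 300) / 1e6:.1f} Mbps")

years = wrap_seconds(64, 100e9) / (365.25 * 24 * 3600)
print(f"64-bit @ 100 Gbps wraps after about {years:.0f} years")
```

A single wrap between two polls can be corrected, but once the counter can wrap more than once per interval the measured rate becomes ambiguous. That is why 32-bit counters are unreliable on fast links, while 64-bit counters are effectively immune.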

Customizing Graphs

While standard graphs cover many needs, you can customize their appearance and sometimes what data they display.

  • Graph Settings (Per User or Global):
    • Some UI settings might allow users to choose default graph styles or color palettes, though deep customization of standard graph rendering often requires code changes or custom graph definitions.
  • Device-Specific Graph Overrides:
    • For certain graphs on a device, you might find options in the device's edit page or via config.php to tweak parameters (e.g., scaling, legends).
  • Creating Custom Graphs (Advanced):
    • If LibreNMS doesn't graph a specific OID you need, and it's not part of a standard MIB that LibreNMS auto-detects, you might need to:
      1. Ensure the OID is being polled: This might involve adding custom OIDs to device OS definitions or creating custom poller modules (covered in Advanced).
      2. Define a custom graph: This usually involves creating a YAML or PHP graph definition file that tells LibreNMS how to find the RRD data and how to render the graph (title, labels, colors, data sources). This is an advanced topic.
    • The community forums and documentation are good resources for examples of custom graph definitions.

Understanding RRDtool and Data Storage

RRDtool (Round Robin Database tool) is the backbone of LibreNMS's graphing. Understanding its basics helps interpret graphs correctly.

  • Round Robin Database (RRD):
    • RRD is a system to store and display time-series data. "Round robin" refers to its fixed-size nature. Old data is consolidated or overwritten to keep the database from growing indefinitely.
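    • Each monitored entity (e.g., a processor or a port) typically gets its own RRD file under /opt/librenms/rrd/<device_hostname>/. One file can hold several related data sources; a port's file, for example, stores its in/out octet, packet, and error counters together rather than using one file per counter.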
  • RRD File Structure: An RRD file contains:
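    • Data Sources (DS): Defines the type of data being stored (e.g., GAUGE for values stored as-is, like temperature or CPU %; COUNTER for continuously increasing values like traffic counters, which RRDtool converts to a per-second rate; DERIVE, which works like COUNTER but without wrap handling, so it can also represent decreasing values). LibreNMS sets the DS type for each metric it polls.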
    • Round Robin Archives (RRA): Defines how data is stored at different resolutions and for how long. An RRD file can have multiple RRAs.
      • Example RRA:
        • Store 5-minute averages for 1 day.
        • Store 30-minute averages for 1 week.
        • Store 2-hour averages for 1 month.
        • Store 1-day averages for 1 year.
      • When you request a graph for a specific time range, RRDtool selects the most appropriate RRA to fetch data from.
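  • Data Consolidation: Every incoming data point is fed to all RRAs at once. The highest-resolution RRA stores it as-is, while each coarser RRA combines a fixed number of consecutive points (e.g., by averaging) into one row. Each RRA has a fixed number of rows, so its oldest row is overwritten when a new one is written. This is why older data appears less granular.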
  • Graph Generation: When LibreNMS needs to display a graph, it calls rrdtool graph with parameters specifying the RRD file(s), time range, labels, colors, etc. RRDtool then generates an image (usually PNG).
  • rrdcached (RRD Caching Daemon):
    • Polling many devices generates frequent writes to RRD files. This can be I/O intensive, especially on slower disks.
    • rrdcached is a daemon that acts as a caching layer for RRD updates. Pollers write data to rrdcached, which then flushes these updates to disk in batches, improving performance and reducing disk I/O load.
    • LibreNMS highly recommends using rrdcached, and it's often set up during installation. Configuration is in /etc/default/rrdcached (or similar) and LibreNMS's config.php (e.g., $config['rrdcached'] = "unix:/var/run/rrdcached.sock";).

Implications of RRDtool:

  • Fixed Size: You know how much disk space RRDs will take over time.
  • No Raw Data for Old Points: You can't get the original 5-minute sample from a year ago if it's only stored as a daily average in the RRA.
  • "NaN" or Gaps: If a device is down or SNMP fails, no data is written to RRDs for that period. This appears as gaps (or "NaN" - Not a Number) in graphs.
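
The effect of consolidation on what you see in a graph is easy to demonstrate. The Python sketch below mimics what an RRA does with AVERAGE and MAX consolidation and computes how many rows an archive needs. It is a simplified model, not RRDtool's actual implementation; the function names and sample values are hypothetical.

```python
def consolidate(samples, steps, func="AVERAGE"):
    """Combine each run of `steps` consecutive samples into one archive row."""
    buckets = [samples[i:i + steps] for i in range(0, len(samples) - steps + 1, steps)]
    if func == "AVERAGE":
        return [sum(b) / len(b) for b in buckets]
    if func == "MAX":
        return [max(b) for b in buckets]
    raise ValueError(f"unsupported consolidation function: {func}")


def rra_rows(step_s, steps, retention_s):
    """Rows needed to keep `retention_s` seconds of data at step_s * steps resolution."""
    return retention_s // (step_s * steps)


# One hour of hypothetical 5-minute traffic samples in Mbps, with one short burst.
five_minute = [10, 10, 10, 10, 10, 100, 20, 20, 20, 20, 20, 20]

half_hour_avg = consolidate(five_minute, steps=6, func="AVERAGE")
half_hour_max = consolidate(five_minute, steps=6, func="MAX")
print(half_hour_avg)  # [25.0, 20.0]
print(half_hour_max)  # [100, 20]

print(rra_rows(300, 1, 24 * 3600))      # 288 rows: 5-minute data for 1 day
print(rra_rows(300, 6, 7 * 24 * 3600))  # 336 rows: 30-minute data for 1 week
```

The 100 Mbps burst shrinks to 25 Mbps in the averaged archive, which is why peaks on long-range graphs look lower than they really were unless a MAX archive is also kept. The row counts show why file size stays fixed: each archive holds a set number of rows regardless of how long the device has been monitored.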

Using Smokeping Integration (if applicable)

Smokeping is a separate open-source tool specialized in network latency measurement and visualization. LibreNMS can integrate with an existing Smokeping installation.

  • What Smokeping Does:
    • Sends out probe packets (ICMP echo, DNS queries, HTTP requests, etc.) to target hosts at regular intervals.
    • Measures round-trip time (RTT) and packet loss.
    • Generates detailed graphs showing latency distribution, median RTT, and loss over time. This is excellent for visualizing network stability and jitter.
  • LibreNMS Integration:
    • If you have Smokeping installed and configured to monitor hosts that are also in LibreNMS:
      • You can configure LibreNMS (in config.php) to point to your Smokeping installation.
      • LibreNMS will then display relevant Smokeping graphs directly within the LibreNMS device overview page for matched hosts.
      • This provides a convenient way to see detailed latency information alongside other device metrics.
    • Configuration example in config.php:
      $config['smokeping']['dir'] = '/var/lib/smokeping/'; // Path to Smokeping RRDs
      $config['smokeping']['imgdir'] = '/var/cache/smokeping/images/'; // Path to Smokeping images
      $config['smokeping']['url'] = 'http://your-smokeping-server/smokeping.cgi'; // URL to Smokeping web UI
      
  • Benefits: Adds a layer of latency-focused monitoring that complements LibreNMS's broader SNMP-based data collection.

Setting up Smokeping itself is outside the scope of this LibreNMS guide, but if you already use it or plan to, the integration is straightforward.

Workshop Analyzing Network Traffic Patterns

Objective: To use LibreNMS graphs to analyze network traffic patterns for a specific device interface, understand different time scales, and identify peak usage.

Prerequisites:

  • LibreNMS running with at least one device that has network interfaces generating some traffic (e.g., your localhost device, or better, a router/switch if you have one monitored).
  • The device should have been monitored for at least a few hours, preferably a day or more, to have some historical data.

Tasks:

  1. Identify a Target Interface:

    • Log in to LibreNMS.
    • Navigate to Devices > All Devices. Select a device.
    • Go to the Ports tab for that device.
    • Identify an active network interface that is likely to have some traffic (e.g., eth0, ens18 on a server, or a WAN/LAN port on a router). Avoid loopback (lo) unless it's your only option and has some traffic from local services.
    • Click on the name of the selected interface to go to its detail page.
  2. Examine Basic Traffic Graphs:

    • On the port detail page, you should see graphs for "Traffic" (bits per second or Bytes per second) and possibly "Packets" (packets per second).
    • Focus on the main "Traffic" graph (often labeled with ifHCOctets or ifOctets). It usually shows Inbound and Outbound traffic.
    • Observe the default time range (e.g., last 24 hours).
  3. Analyze Different Time Scales:

    • Short Term (e.g., Last 6 Hours):
      • Select "6 hour" from the time range options for the graph.
      • What do you observe? Are there any spikes? Is the traffic bursty or steady?
      • Hover your mouse over different points on the graph lines. What are the approximate Inbound and Outbound traffic rates at those points?
    • Medium Term (e.g., Last 24 Hours or Last 48 Hours):
      • Select "24 hour" or "2 day".
      • Can you identify daily patterns? For example, is traffic higher during business hours and lower at night? Are there regular peaks?
      • Note the Y-axis scale. Does it change as you change the time range? Why? (Because RRDtool might be using different RRAs with different consolidation, or the peak values differ).
    • Long Term (e.g., Last Week or Last Month):
      • Select "1 week" or "1 month" (if you have enough data).
      • Can you see weekly patterns? (e.g., higher traffic on weekdays vs. weekends).
      • Are there any overall trends, like gradually increasing traffic over the month?
      • Notice how the graph lines might appear "smoother" over longer periods. This is due to RRDtool's data aggregation.
  4. Identify Peak Usage:

    • Using the different time scales, try to identify the approximate peak Inbound and Outbound traffic rates for this interface.
    • When did these peaks occur?
    • What is the unit of the Y-axis (e.g., Mbps, Gbps, kbps)?
    • If this interface has a known speed (e.g., 1 Gbps), what percentage of its capacity was used during the peak? (e.g., if peak is 500 Mbps on a 1 Gbps link, that's 50% utilization).
  5. Examine Error/Discard Graphs (if data exists):

    • On the same port detail page, look for graphs related to "Errors" and "Discards" (e.g., ifInErrors, ifOutErrors, ifInDiscards, ifOutDiscards).
    • Is there any significant number of errors or discards?
    • Constant errors can indicate a physical layer issue (bad cable, faulty SFP), a duplex mismatch, or other problems.
  6. (Optional) Compare with another interface or device:

    • If you have other interfaces or devices monitored, repeat some of these steps for them. How do their traffic patterns differ?
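
The utilization question in task 4 comes down to two small calculations: turning a change in the octet counter into a bit rate, and dividing that rate by the link speed. The Python sketch below shows both, with hypothetical function names and made-up counter readings.

```python
def rate_bps(prev_octets, curr_octets, interval_s):
    """Average bit rate between two octet counter readings."""
    return (curr_octets - prev_octets) * 8 / interval_s


def utilization_percent(rate, link_speed_bps):
    """Share of the link capacity used, as a percentage."""
    if link_speed_bps <= 0:
        raise ValueError("link speed must be positive")
    return 100.0 * rate / link_speed_bps


# Hypothetical readings taken 300 seconds apart on a 1 Gbps interface.
peak = rate_bps(prev_octets=1_000_000_000, curr_octets=19_750_000_000, interval_s=300)
usage = utilization_percent(peak, link_speed_bps=1_000_000_000)
print(f"{peak / 1e6:.0f} Mbps, {usage:.0f}% of capacity")  # 500 Mbps, 50% of capacity
```

Keep in mind that each graph point is an average over the polling interval, so short bursts inside those five minutes can reach line rate even when the graph shows 50%.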

Deliverables/Reflection (for your own notes):

  • A brief description of the traffic pattern for the chosen interface over 24 hours (e.g., "Traffic peaks between 9 AM and 5 PM, with an average of X Mbps and a peak of Y Mbps. Low traffic overnight.").
  • The approximate peak Inbound and Outbound bandwidth usage observed and when it occurred.
  • An observation about how graph granularity changes with different time scales.
  • A note on whether any significant interface errors or discards were observed.

This workshop helps you practice interpreting the visual data LibreNMS provides, which is a fundamental skill for network monitoring and management. Understanding these graphs allows you to assess network performance, plan capacity, and spot potential problems.

Advanced LibreNMS Customization and Optimization

Once you have mastered the basics and intermediate features of LibreNMS, you can explore its more advanced capabilities. This section covers extending LibreNMS with custom definitions and pollers, optimizing its performance for larger environments, implementing robust security practices, and effectively troubleshooting common issues.

9. Extending LibreNMS

LibreNMS is highly extensible, allowing you to tailor it to monitor virtually any device or application that exposes data. This often involves understanding MIBs, OIDs, and sometimes writing small scripts or configuration snippets.

Custom Device OS Definitions

LibreNMS uses "OS Definitions" to determine how to discover, poll, and interpret data from different types of devices. While it supports a vast number of OS types out-of-the-box, you might encounter a device that isn't fully recognized or for which you want to poll additional, specific MIBs.

  • Structure of OS Definitions:

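    • OS definitions are YAML files. Depending on the LibreNMS version, they live under /opt/librenms/includes/definitions/ (with discovery definitions in includes/definitions/discovery/) or under /opt/librenms/resources/definitions/. The /opt/librenms/LibreNMS/OS/ directory holds optional PHP classes for OS-specific logic, not the YAML files.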
    • Each YAML file defines:
      • os: The short name for the OS (e.g., linux, cisco-ios, mycustomos).
      • text: A human-readable description (e.g., "Linux", "Cisco IOS", "My Custom Device").
      • type: General category (e.g., server, network, firewall, wireless).
      • icon: An icon to display in the UI.
      • discovery: One or more match blocks that identify this OS. LibreNMS evaluates them during discovery to match a device to an OS definition:
        • sysObjectID: One or more SNMP sysObjectID prefixes.
        • sysDescr: One or more strings to look for in the device's sysDescr (the sysDescr_regex key takes regular expressions instead).
      • mib_dir: (Optional) A directory containing custom MIB files specific to this OS. LibreNMS will attempt to load MIBs from here.
      • poller_modules / discovery_modules: Specify which poller and discovery modules are enabled or disabled by default for this OS.
      • graphs: A list of custom graph definitions to be made available for this OS.
      • bad_ifType, bad_ifName_regexp, good_ifAlias_regexp: Rules to filter out unwanted interfaces.
      • os_group: Assigns the OS to a group for easier management.
  • Adding Custom MIBs:

    • If your device uses vendor-specific MIBs not included with LibreNMS or net-snmp, you need to obtain the MIB files (usually from the vendor).
    • Place these MIB files in a directory (e.g., /opt/librenms/mibs/custom/).
    • Add this directory to the MIB search path in LibreNMS's config.php:
      $config['mib_dirs'][] = "/opt/librenms/mibs/custom/";
      
    • You may also need to configure net-snmp itself to be aware of these MIBs if you are using snmpwalk or other tools outside of LibreNMS that need to translate OIDs.
  • Creating or Modifying an OS Definition:

    1. Identify the device: Use snmpget -v2c -c YOUR_COMMUNITY YOUR_DEVICE_IP sysObjectID.0 sysDescr.0 to get the sysObjectID and sysDescr (snmpwalk accepts only one OID per call, so snmpget is the better fit for fetching both).
    2. Copy an existing YAML: Find an OS definition YAML file in the OS definitions directory that is similar to your device and copy it to create a new file (e.g., mycustomos.yaml).
    3. Edit the YAML:
      • Change os, text, sysObjectID, and sysDescr to match your device.
      • Adjust poller modules, discovery modules, and graphs as needed.
      • If you want to poll specific OIDs not covered by standard modules, you might need to look into creating custom graph definitions or even custom poller modules.
    4. Test: After saving, run discovery for the device (./discovery.php -h <hostname> -d -m os) and then polling (./poller.php -h <hostname> -d -r -f) to see if it's correctly identified and if new data/graphs appear.

This process can be iterative. The LibreNMS community forums are a good place to ask for help or find examples for specific devices.
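
To make the matching step more concrete, here is a minimal Python sketch of how sysObjectID prefixes and sysDescr patterns could be tested against the values a device returns. It is only an illustration of the idea, not LibreNMS's actual detection code. The definition dictionary, the matches_os helper, the enterprise prefix, and the rule that both conditions must hold are all assumptions made for this example.

```python
import re

# Hypothetical, simplified stand-in for the detection part of an OS definition.
MY_CUSTOM_OS = {
    "os": "mycustomos",
    "sysObjectID": [".1.3.6.1.4.1.99999.1."],      # made-up enterprise prefix
    "sysDescr_regex": [r"My Custom Device v\d+"],  # made-up description pattern
}


def matches_os(definition, sys_object_id, sys_descr):
    """Return True if the device values satisfy the definition.

    Assumed rule for this sketch: the sysObjectID must start with one of the
    listed prefixes (if any are listed) AND the sysDescr must match one of
    the listed regexes (if any are listed).
    """
    prefixes = definition.get("sysObjectID", [])
    patterns = definition.get("sysDescr_regex", [])
    oid_ok = not prefixes or any(sys_object_id.startswith(p) for p in prefixes)
    descr_ok = not patterns or any(re.search(p, sys_descr) for p in patterns)
    return oid_ok and descr_ok


# A device that reports the made-up prefix and description matches...
print(matches_os(MY_CUSTOM_OS, ".1.3.6.1.4.1.99999.1.5", "My Custom Device v2"))
# ...while a generic Linux host does not.
print(matches_os(MY_CUSTOM_OS, ".1.3.6.1.4.1.8072.3.2.10", "Linux host 5.15"))
```

Running the values from snmpget through a check like this before editing the YAML is a quick way to confirm that your prefix and pattern are neither too strict nor too loose.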

Application Monitoring (e.g., Apache, Nginx, MySQL)

LibreNMS can monitor specific applications running on your servers, not just OS-level metrics. This is typically achieved using "Application Pollers."

  • How it Works:

    • Application monitoring has two halves: an agent script (usually Bash or Python) that runs on the monitored host, and an application poller that runs on the LibreNMS server.
    • The poller reaches the agent script through SNMP extend, NRPE, or another custom agent mechanism deployed on the monitored host, and gathers the application-specific metrics the script prints.
    • The data is then stored in RRD files, similar to other SNMP data.
    • LibreNMS comes with pre-built application pollers for many common services like Apache, Nginx, MySQL, BIND, Memcached, NTP, Postfix, etc.
  • Enabling Application Monitoring:

    1. Agent Setup (on the monitored host):
      • For many applications, you need to install an agent script on the host running the application. These scripts are often provided by LibreNMS (in /opt/librenms/scripts/agent-local/ or via the librenms-agent repository).
      • Example: For MySQL, you might deploy a script that queries MySQL status variables.
      • These scripts need to be executable by the snmpd user or callable via another mechanism like NRPE.
      • Configure snmpd on the monitored host to expose the output of these scripts via an extend directive in snmpd.conf:
        # Example for a hypothetical mysql monitoring script
        extend mysql /opt/librenms-agent/check_mysql.sh
        
        This makes the script's output available via a specific OID that LibreNMS knows how to query.
    2. LibreNMS Configuration (on the LibreNMS server):
      • In the LibreNMS web UI, go to the Device Edit page for the server running the application.
      • Go to the Applications tab (or Modules tab, then find "Applications").
      • Enable the specific application poller (e.g., "MySQL," "Apache").
      • Some applications might require additional configuration (e.g., database credentials for MySQL, status URL for Apache/Nginx). These are usually configured directly in the agent script on the monitored host or sometimes passed via SNMP SETs if the agent supports it (less common).
  • Viewing Application Data:

    • Once configured and polled, application-specific metrics and graphs will appear under the Applications tab on the device's overview page in LibreNMS.
  • Creating Custom Application Pollers:

    • If LibreNMS doesn't have a poller for an application you need:
      1. Write a script that can collect the desired metrics from your application and output them in a simple key:value format (e.g., metric1:value1\nmetric2:value2).
      2. Deploy this script on the monitored host and configure snmpd (or another agent mechanism) to expose its output.
      3. In LibreNMS, you'll need to create a new application poller. In upstream LibreNMS this is a PHP file under includes/polling/applications/ that queries the OIDs from your extend setup and parses the data, plus graph definitions under includes/html/graphs/application/. This is a more involved development task.
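
The key:value format from step 1 is easy to consume on the polling side. As a rough sketch (in Python for readability; a real LibreNMS poller would do the equivalent in PHP), the parsing could look like this. The function name parse_agent_output and the sample metric names are made up for this example.

```python
def parse_agent_output(text):
    """Parse 'metric:value' lines into a dict of floats.

    Lines without a colon or with a non-numeric value are skipped, so one
    stray log line from the agent script does not break the whole poll.
    """
    metrics = {}
    for line in text.splitlines():
        name, sep, raw = line.partition(":")
        if not sep:
            continue  # no colon: not a metric line
        try:
            metrics[name.strip()] = float(raw.strip())
        except ValueError:
            continue  # value is not a number
    return metrics


sample = "metric1:12\nmetric2:3.5\nnot a metric line\n"
print(parse_agent_output(sample))
```

Keeping the agent output this simple means the same script can be tested by hand on the monitored host before any SNMP or LibreNMS configuration is involved.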

Writing Custom Pollers and Discovery Modules (Introduction)

For highly specialized devices or data sources not covered by existing mechanisms, you might need to write custom poller or discovery modules in PHP. This is an advanced topic requiring PHP programming skills and a good understanding of the LibreNMS architecture.

  • Discovery Modules (includes/discovery/):

    • These PHP scripts are responsible for identifying device characteristics, hardware components, sensors, etc.
    • They are executed during the discovery phase.
    • If you have a device with unique hardware components that LibreNMS doesn't recognize, you might write a discovery module to query specific MIBs and register these components (e.g., new types of sensors, power supplies, fans).
  • Poller Modules (includes/polling/):

    • These PHP scripts are responsible for collecting time-series data for specific metrics.
    • They are executed during the polling phase.
    • If you need to graph data from custom OIDs or a non-SNMP data source, you would write a poller module.
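    • The module would fetch the data, format it, and then hand it to LibreNMS's datastore layer (for example the data_update() helper or, in newer releases, app('Datastore')->put()) to store it in RRD files. Check existing modules in your version for the current call.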
    • You'd also need corresponding graph definitions to visualize this data.
  • General Process:

    1. Understand the Data: Know the OIDs or API endpoints to get the data.
    2. PHP Scripting: Write PHP code to fetch, parse, and process the data.
    3. Integration: Place your script in the appropriate LibreNMS directory and potentially update OS definitions or config.php to call your module.
    4. Graph Definitions: Create graph definitions (PHP files under includes/html/graphs/ in upstream LibreNMS) to display the data collected by your poller module.

Developing these modules requires careful study of existing LibreNMS modules and the internal API. The LibreNMS development documentation and community are vital resources.

Using the LibreNMS API

LibreNMS provides a comprehensive RESTful API that allows you to interact with it programmatically. This is useful for:

  • Automation (adding/deleting devices, managing users).
  • Integration with other systems (CMDBs, ticketing systems, custom dashboards).
  • Extracting data for custom reporting or analysis.

  • Enabling API Access:

    1. In the LibreNMS web UI, navigate to Gear Icon > API Settings > API Access.
    2. Click Create API access token.
    3. Give the token a description, assign it to a user (permissions will be based on this user), and set an expiration if desired.
    4. The generated token will be displayed once – copy it immediately and store it securely. You won't be able to see it again.
  • API Endpoints:

    • The API documentation is usually available directly from your LibreNMS instance at http://your-librenms-host/api/v0 (or a link from Gear Icon > API Settings).
    • It lists all available endpoints, expected parameters, and example responses.
    • Common endpoints include:
      • /devices (GET, POST, DELETE for managing devices)
      • /ports (GET for port information)
      • /health (GET for sensor data)
      • /alerts (GET for alert information)
      • /services (GET for monitored services/applications)
      • And many more.
  • Authentication:

    • API requests must include the API token in the X-Auth-Token HTTP header.
      curl -H "X-Auth-Token: YOUR_API_TOKEN_HERE" http://your-librenms-host/api/v0/devices
      
  • Example: Adding a device via API using curl:

    API_TOKEN="YOUR_API_TOKEN_HERE"
    LIBRENMS_URL="http://your-librenms-host"
    HOSTNAME_TO_ADD="new-server.example.com"
    COMMUNITY="public"
    VERSION="v2c"
    
    curl -X POST -H "X-Auth-Token: $API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
        "hostname": "'"$HOSTNAME_TO_ADD"'",
        "community": "'"$COMMUNITY"'",
        "version": "'"$VERSION"'"
    }' \
    "$LIBRENMS_URL/api/v0/devices"
    

  • Rate Limiting: Be aware of API rate limits (configurable in LibreNMS global settings or config.php) to prevent abuse.

The API is a powerful tool for advanced users and developers looking to integrate LibreNMS into broader workflows.
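
For scripted automation, the same add-device call can be built in Python with only the standard library. The sketch below is a starting point under a few assumptions: the helper name build_add_device_request is made up, the JSON field names hostname, community, and version follow the LibreNMS add-device API documentation as far as I know (verify them against your version), and the host name and token are placeholders. The request is only constructed here; the commented lines show how it would be sent.

```python
import json
import urllib.request


def build_add_device_request(base_url, token, hostname,
                             community="public", version="v2c"):
    """Build (but do not send) a POST request that adds an SNMP device."""
    payload = {"hostname": hostname, "community": community, "version": version}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v0/devices",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


req = build_add_device_request(
    "http://your-librenms-host", "YOUR_API_TOKEN_HERE", "new-server.example.com"
)
print(req.get_method(), req.full_url)

# To actually send it (requires a reachable LibreNMS instance and a valid token):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(json.load(resp))
```

Separating request construction from sending makes the payload easy to inspect or unit-test before it touches a production instance.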

Workshop Adding a Custom Application Monitor (Simple Example)

Objective: To create a very simple custom application monitor that checks if a specific TCP port is open on a server. This will involve a local agent script, SNMP extend, and a basic application definition in LibreNMS.

Prerequisites:

  • LibreNMS installed and running.
  • A Linux server to monitor (can be the LibreNMS server itself or another VM). Let's call it target-server.
  • SSH access to target-server.
  • iproute2 (provides the ss command) or net-tools (provides netstat) installed on target-server.
  • We'll check if, for example, TCP port 80 (HTTP) is listening on target-server.

Tasks:

Part 1: Create and Deploy the Agent Script on target-server

  1. Create the script:

    • Log in to target-server.
    • Create a script, for example, /usr/local/bin/check_port_80.sh:
      sudo vim /usr/local/bin/check_port_80.sh
      
    • Add the following content:
      #!/bin/bash
      # Script to check if TCP port 80 is listening
      
      # Use netstat or ss. ss is generally preferred if available.
      if command -v ss &> /dev/null; then
          # -H: no header, -l: listening sockets, -t: TCP, -n: numeric ports.
          # grep -q prints nothing; only its exit status is used below.
          ss -Hltn 'sport = :80' | grep -q LISTEN
      elif command -v netstat &> /dev/null; then
          netstat -tlpn | grep -q ':80 ' # Adjust grep if needed for your netstat output
      else
          echo "port_80_status:-1" # Indicate error if no tool found
          exit 1
      fi
      
      if [ $? -eq 0 ]; then
          echo "port_80_status:1" # Port is listening
      else
          echo "port_80_status:0" # Port is not listening
      fi
      exit 0
      
    • Make the script executable:
      sudo chmod +x /usr/local/bin/check_port_80.sh
      
    • Test the script:
      • If a web server IS listening on port 80: sudo /usr/local/bin/check_port_80.sh should output port_80_status:1.
      • If NOT: it should output port_80_status:0.
      • You can temporarily start a simple listener for testing: sudo python3 -m http.server 80 (and stop with Ctrl+C after test).
  2. Configure snmpd on target-server to expose this script:

    • Edit /etc/snmp/snmpd.conf:
      sudo vim /etc/snmp/snmpd.conf
      
    • Add an extend line. The syntax is extend [MIBOID] NAME PROG [ARGS]. If you omit the optional OID, snmpd publishes the output in the NET-SNMP-EXTEND-MIB tables under nsExtendObjects (.1.3.6.1.4.1.8072.1.3.2), indexed by the name you choose:
      # Custom check for TCP Port 80 status
      extend port80_check /usr/local/bin/check_port_80.sh
      
      The rest of this workshop uses this named form.
    • Restart snmpd:
      sudo systemctl restart snmpd
      
    • Test SNMP query from LibreNMS server (or locally on target-server if snmpwalk is installed there): Replace YOUR_COMMUNITY and TARGET_SERVER_IP.
      snmpwalk -v2c -c YOUR_COMMUNITY TARGET_SERVER_IP NET-SNMP-EXTEND-MIB::nsExtendOutput1Line
      # You should see output similar to:
      # NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."port80_check" = STRING: port_80_status:1
      # Or if you used a specific OID, walk that OID.
      
      Note the exact OID that returns the STRING value. The table is indexed by the extend name, encoded as its length followed by the ASCII code of each character. For extend port80_check ..., the numeric OID of the output line is therefore .1.3.6.1.4.1.8072.1.3.2.3.1.1.12.112.111.114.116.56.48.95.99.104.101.99.107 (12 is the length of port80_check). It's easier to discover this via snmpwalk -On -v2c -c YOUR_COMMUNITY TARGET_SERVER_IP NET-SNMP-EXTEND-MIB::nsExtendOutput1Line and find the port80_check entry.
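
If you want to derive the numeric OID for an extend name instead of reading it off snmpwalk, the string-index encoding can be sketched in a few lines of Python. The helper name extend_oid is made up for this example. The column OID is the nsExtendOutput1Line column from NET-SNMP-EXTEND-MIB, and the sketch assumes the extend name is plain ASCII.

```python
# Column OID of NET-SNMP-EXTEND-MIB::nsExtendOutput1Line
NS_EXTEND_OUTPUT_1LINE = ".1.3.6.1.4.1.8072.1.3.2.3.1.1"


def extend_oid(column_oid, name):
    """Append the string index for `name` to `column_oid`.

    A string index is encoded as the length of the string followed by the
    ASCII code of each character.
    """
    codes = [str(byte) for byte in name.encode("ascii")]
    return ".".join([column_oid, str(len(codes))] + codes)


print(extend_oid(NS_EXTEND_OUTPUT_1LINE, "port80_check"))
```

The printed value can be passed straight to snmpget to confirm it returns the same string as the named form.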

Part 2: Configure LibreNMS to Poll and Graph This Metric

  1. Create an Application Definition YAML:

    • On your LibreNMS server, change to the directory that holds application definitions. This workshop assumes includes/definitions/applications/ under /opt/librenms. Check your installation first: upstream LibreNMS implements application pollers as PHP files under includes/polling/applications/, so your version may not read YAML definitions from this location.
    • Create a YAML file, e.g., custom_port_check.yaml:
      sudo vim /opt/librenms/includes/definitions/applications/custom_port_check.yaml
      
    • Add the content shown further below. The oid field must point at the OID that returns the script output. For extend port80_check ..., that is NET-SNMP-EXTEND-MIB::nsExtendOutputFull."port80_check"; you can use the MIB name if the MIBs are loaded, or the numeric OID otherwise. The definition expects a bare number, but the script currently prints port_80_status:1, which LibreNMS would first have to parse. The simplest fix is to change the script so that it prints only the value.

      Let's simplify the agent script output for this example to just 0 or 1:

      #!/bin/bash
      # Script to check if TCP port 80 is listening - outputs 0 or 1
      if command -v ss &> /dev/null; then
          ss -Hltn 'sport = :80' | grep -q LISTEN
      elif command -v netstat &> /dev/null; then
          netstat -tlpn | grep -q ':80 '
      else
          echo "-1" # Error
          exit 1
      fi
      
      if [ $? -eq 0 ]; then
          echo "1" # Port is listening
      else
          echo "0" # Port is not listening
      fi
      exit 0
      
      Update the script on target-server and re-test snmpwalk for the output. It should now be just STRING: "1" or STRING: "0".

      Now, the YAML (/opt/librenms/includes/definitions/applications/custom_port_check.yaml):

      app: custom_port_check            # Unique identifier for this app poller
      name: "Custom TCP Port 80 Check"  # Human-readable name
      version_oid:                      # Optional: OID to get app version, not needed here
      data:
        - graph: port_80_status         # Name of the RRD file and graph definition
          # OID for the script output. It should return the raw value (0 or 1)
          # as a string, which LibreNMS converts to a number.
          # Numeric equivalent: .1.3.6.1.4.1.8072.1.3.2.3.1.2.12.112.111.114.116.56.48.95.99.104.101.99.107
          # (12 is the length of "port80_check", followed by its ASCII codes).
          oid: 'NET-SNMP-EXTEND-MIB::nsExtendOutputFull."port80_check"'
          ds_name: status               # Data source name within the RRD
          type: GAUGE                   # Data source type
          descr: "Port 80 Listening Status (1=Listening, 0=Not Listening)"
      graphs:                           # Define how to graph this data
        - port_80_status                # Matches 'graph' name above
      
      Note: The OID for nsExtendOutputFull provides the full output (possibly multi-line). nsExtendOutput1Line provides only the first line. Since our script is single line, either should work. The MIB name NET-SNMP-EXTEND-MIB::nsExtendOutputFull."port80_check" is generally preferred if your snmpd and LibreNMS can resolve it. If not, use the numeric OID.

  2. Enable the Application Monitor in LibreNMS UI:

    • Go to the device page for target-server in LibreNMS.
    • Click Edit (Cog Icon), then go to the Applications tab.
    • You should see "Custom TCP Port 80 Check" in the list of available applications. Enable it.
    • Click Save Changes.
  3. Wait for Polling and Check Data:

    • LibreNMS will poll this new application data during its next regular polling cycle for the device.
    • After 5-10 minutes, go to the target-server device page in LibreNMS.
    • Click on the Applications tab. You should see an entry for "Custom TCP Port 80 Check."
    • Click on it to see the graph for "Port 80 Listening Status." It should show a line at 1 (if listening) or 0 (if not).
    • You can now create an alert rule based on this application metric (e.g., if applications.app_custom_port_check_port_80_status.status equals 0).

Troubleshooting:

  • If data doesn't appear, use poller debug for the application:
    # On LibreNMS server
    cd /opt/librenms
    sudo -u librenms ./poller.php -h TARGET_SERVER_HOSTNAME -d -m applications
    
    Look for output related to custom_port_check and any errors.
  • Verify OIDs carefully with snmpwalk.
  • Ensure the agent script on target-server is working correctly and snmpd is restarted after config changes.
  • Check /opt/librenms/logs/librenms.log for errors.

Deliverables/Reflection:

  • A working agent script on a target server that reports a specific status (port listening).
  • snmpd configured with an extend directive to expose the script's output.
  • A custom application YAML definition in LibreNMS.
  • The custom application monitor enabled for the target device in LibreNMS UI.
  • A graph appearing in LibreNMS showing the status reported by your custom script.

This workshop, while simplified, demonstrates the fundamental process of extending LibreNMS to monitor custom metrics via SNMP extend scripts and application pollers. This pattern can be adapted for many types of custom checks.

10. Performance Tuning and Scaling

As the number of monitored devices and services grows, LibreNMS performance can become a concern. Optimizing various components and potentially scaling out with distributed pollers are key to maintaining a responsive and reliable monitoring system.

Optimizing Database Performance (MySQL/MariaDB tuning)

The database is a critical component. Slow database queries can impact the web UI, poller performance, and alert processing.

  • Hardware:
    • SSDs: Use SSDs for your database storage. This is one of the most significant improvements.
    • RAM: Ensure sufficient RAM for the database server. MariaDB/MySQL use RAM for caching (e.g., InnoDB buffer pool). More RAM means more data can be served from cache, reducing disk I/O.
  • MariaDB/MySQL Configuration (my.cnf or 50-server.cnf):
    • innodb_buffer_pool_size: This is the most important setting for InnoDB tables (which LibreNMS uses). It defines the size of the memory cache for table data and indexes.
      • A common recommendation is to set this to 50-70% of available system RAM if the database server is dedicated. If it shares resources (like on the LibreNMS server itself), be more conservative.
      • Example: On a server with 8GB RAM dedicated to the database, innodb_buffer_pool_size = 4G (50%) up to about 5G (roughly 65%) fits this guideline.
      • Monitor buffer pool hit rate to see if it's effective.
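    • innodb_log_file_size: Size of the redo logs. Larger logs can improve write performance but increase recovery time. A common starting point is 256M or 512M. On MySQL 5.6.8 and later, and on current MariaDB releases, changing it only needs a clean shutdown and restart; only older versions require you to stop mysqld, remove the old ib_logfile* files, and restart.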
    • innodb_flush_log_at_trx_commit: Controls durability vs. performance.
      • 1 (default): Fully ACID compliant, flushes log to disk at each transaction commit (safest, but can be slower).
      • 2: Flushes log to OS cache at commit, flushes to disk once per second (good balance, much faster writes, small risk of 1-second data loss on OS crash). Often recommended for LibreNMS if minor data loss on crash is acceptable for performance.
      • 0: Writes and flushes the log once per second instead of at commit (fastest, but even a crash of the database process alone can lose up to about a second of transactions).
    • query_cache_size / query_cache_type: The query cache was removed in MySQL 8.0 and is disabled by default in current MariaDB releases, because it causes contention under concurrent writes. For most LibreNMS workloads, it's better left disabled, or set to a very small size if enabled.
    • max_connections: Maximum number of concurrent client connections. Default is often 151. LibreNMS pollers and web UI can open multiple connections. Monitor Threads_connected and Max_used_connections status variables. Increase if you're hitting the limit, but ensure the server has resources.
    • tmp_table_size and max_heap_table_size: Affect temporary tables created for complex queries. If you see many on-disk temporary tables, increasing these (within RAM limits) can help.
    • join_buffer_size, sort_buffer_size, read_rnd_buffer_size: Per-session buffers. Increasing them globally can consume a lot of RAM. Adjust cautiously, or consider setting them per-session for problematic queries if identified.
  • Tools for Tuning:
    • MySQLTuner-perl: A script that analyzes your database server's configuration and status variables and provides recommendations. Run it after the server has been active for at least 24-48 hours under normal load.
      wget https://raw.githubusercontent.com/major/MySQLTuner-perl/master/mysqltuner.pl
      perl mysqltuner.pl
      
    • Percona Toolkit: Includes tools like pt-query-digest for analyzing slow query logs.
  • Slow Query Log: Enable the slow query log in MariaDB/MySQL to identify queries that are taking too long.
    # In my.cnf
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log
    long_query_time = 2 # Log queries taking longer than 2 seconds
    # log_queries_not_using_indexes = 1 # Optional, logs queries not using indexes
    
    Analyze this log to find bottlenecks. Sometimes, adding an index or rewriting a query (if it's from custom code) can help. For LibreNMS core queries, ensure your schema is up-to-date (./validate.php reports schema problems).
  • Regular Maintenance:
    • Run OPTIMIZE TABLE on frequently updated tables periodically (e.g., eventlog, syslog). This can reclaim space and reduce fragmentation. LibreNMS daily.sh might handle some of this.
    • Ensure your LibreNMS database schema is up-to-date: ./validate.php reports mismatches, and ./daily.sh applies pending schema updates.
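
Two of the numbers discussed above are simple arithmetic, so here is a small Python sketch for them. It assumes you have read the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads counters from SHOW GLOBAL STATUS yourself. The function names and the sample counter values are made up for this example, and the 50-70% fraction is the guideline from this section, not a hard rule.

```python
def buffer_pool_hit_rate(read_requests, disk_reads):
    """Fraction of logical reads served from the InnoDB buffer pool.

    read_requests: Innodb_buffer_pool_read_requests (logical reads)
    disk_reads:    Innodb_buffer_pool_reads (reads that had to hit disk)
    Returns None when there have been no read requests yet.
    """
    if read_requests == 0:
        return None
    return 1.0 - disk_reads / read_requests


def suggested_buffer_pool_bytes(total_ram_bytes, fraction=0.6):
    """Apply the 50-70% guideline for a dedicated database server."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(total_ram_bytes * fraction)


GIB = 1024 ** 3
rate = buffer_pool_hit_rate(read_requests=1_000_000, disk_reads=5_000)
print(f"hit rate: {rate:.3%}")
print(f"suggested pool: {suggested_buffer_pool_bytes(8 * GIB, 0.5) / GIB:.1f} GiB")
```

A hit rate that stays well below 99% on a warmed-up server is one hint that the buffer pool may be too small for the working set, though the counters should be read over a representative period rather than right after a restart.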

Distributed Polling

When monitoring a large number of devices (hundreds or thousands) or devices across different geographical locations or isolated network segments, a single LibreNMS poller can become overwhelmed. Distributed polling allows you to scale out the polling load.

  • Concept:
    • One central LibreNMS web server and database.
    • Multiple poller instances running on different servers (these are the "distributed pollers").
    • Pollers are organized into "poller groups."
    • Devices are assigned to a specific poller group. The pollers in that group are then responsible for polling those devices.
    • Distributed pollers communicate with the main LibreNMS database to get their list of devices and write back polling data. RRD files must end up in one place that the web UI can read: most commonly all pollers write through a central rrdcached instance on the main server, while writing RRDs locally and syncing them afterwards is an alternative that needs careful partitioning.
  • Components:
    • Main LibreNMS Server: Hosts the web UI, database, and central configuration. It may or may not do polling itself.
    • Poller Servers: Lightweight servers (can be VMs) running the LibreNMS poller code (poller-wrapper.py or poller.php). They need network connectivity to the devices they poll AND to the central LibreNMS database and rrdcached (if used centrally).
    • rrdcached: Can be run on the main LibreNMS server and accessed by all pollers, or each poller group can have its own rrdcached that syncs RRDs back to a central store. Central rrdcached is common.
    • File Synchronization (e.g., rsync): RRD files generated by distributed pollers need to be available to the web UI on the main server for graphing. rsync is commonly used to synchronize RRDs from pollers to the main server if rrdcached isn't handling all writes centrally.
  • Setup Steps (High-Level):

    1. Prepare Poller Servers: Install OS, basic dependencies (PHP, snmp tools, python for poller-wrapper.py). No full web server or DB needed on these.
    2. Install LibreNMS Code: Clone the LibreNMS git repository onto each poller server (same version as the main server).
    3. Configure config.php on Pollers: Point them to the central database and central rrdcached.
    4. Poller Groups: In LibreNMS UI (Global Settings > Polling > Distributed Pollers or Gear Icon > Pollers > Poller Groups):
      • Define poller groups (e.g., "US-East-Pollers," "Datacenter1-Pollers"). Default is group 0.
      • Each poller server needs to be configured with a unique poller ID (hostname) and assigned to a poller group in its config.php or via environment variables.
        // In config.php on a poller
        $config['distributed_poller'] = true;
        $config['distributed_poller_name'] = gethostname(); // Or another unique name
        $config['distributed_poller_group'] = 1; // Assign to poller group 1
        $config['rrdcached'] = 'main_librenms_server_ip:42217'; // Point to central rrdcached
        
    5. Assign Devices to Poller Groups: In the device settings (Edit device > Modules/Polling), assign the device to the appropriate poller group.
    6. Set up Cron Jobs on Pollers: Each poller server runs the standard LibreNMS cron jobs (discovery, poller, etc.), but they will only act on devices assigned to their group.
    7. RRD Synchronization: If RRDs are written locally by pollers, set up rsync jobs to copy RRD files from each poller's /opt/librenms/rrd directory to the main server's /opt/librenms/rrd directory. This needs to be done carefully to avoid conflicts (e.g., pollers for different groups should write to distinct subdirectories if syncing to a common RRD path, or ensure devices are strictly partitioned by group). The most robust method is often using a central rrdcached instance that all pollers write to.
    8. Time Synchronization: Crucial. All poller servers and the main server must have their time synchronized via NTP.
  • Benefits:

    • Improved poller performance and reduced polling cycle times.
    • Ability to monitor devices in isolated networks (poller placed within that network).
    • Increased redundancy (if one poller in a group fails, others can potentially take over if configured for HA, though this is more complex).

Distributed polling adds complexity but is essential for large-scale deployments.
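
The relationship between devices, poller groups, and pollers can be pictured with a tiny Python model. This is only a conceptual sketch of the partitioning described above, not how LibreNMS schedules work internally. The host names, group numbers, and function names are made up.

```python
from collections import defaultdict

# (hostname, poller_group) pairs, as they might be assigned in the device settings.
DEVICES = [
    ("core-sw1.example.com", 0),
    ("edge-rtr1.us-east.example.com", 1),
    ("edge-rtr2.us-east.example.com", 1),
    ("dc1-fw1.example.com", 2),
]


def devices_by_poller_group(devices):
    """Group hostnames by the poller group they are assigned to."""
    groups = defaultdict(list)
    for hostname, group in devices:
        groups[group].append(hostname)
    return dict(groups)


def devices_for_poller(devices, poller_group):
    """Hostnames that a poller configured for `poller_group` would handle."""
    return devices_by_poller_group(devices).get(poller_group, [])


# A poller configured for group 1 only sees the two US-East routers.
print(devices_for_poller(DEVICES, 1))
```

Thinking of the assignment this way also makes the failure mode obvious: a device whose group has no running poller is simply never polled.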

Web Server Optimization (Nginx/Apache)

The web server serving the LibreNMS UI can also be a bottleneck, especially with many concurrent users or frequent API calls.

  • Nginx (Recommended):

    • PHP-FPM Tuning: The number of PHP-FPM child processes and their configuration is critical.
      • In /etc/php/VERSION/fpm/pool.d/www.conf (or a dedicated librenms.conf pool):
        • pm: Process manager. dynamic or ondemand are common. static can be used if you know exact needs.
        • pm.max_children: Max number of concurrent PHP requests. Depends on RAM per child and total available RAM. If too low, users see delays/errors. If too high, server can run out of memory.
        • pm.start_servers, pm.min_spare_servers, pm.max_spare_servers (for dynamic PM).
        • pm.process_idle_timeout (for ondemand PM).
        • listen: Ensure it matches the fastcgi_pass directive in your Nginx config.
        • Ensure PHP-FPM runs as the librenms user for correct file permissions.
    • Nginx Worker Processes:
      • In /etc/nginx/nginx.conf:
        • worker_processes: Typically set to the number of CPU cores, or auto.
        • worker_connections: Max connections per worker.
    • Caching:
      • Enable browser caching for static assets (JS, CSS, images) using expires headers in Nginx.
      • Consider fastcgi_cache for caching PHP responses for frequently accessed, non-dynamic pages (use with caution for a dynamic UI like LibreNMS, might be better for API).
    • Keepalive Connections: Enable HTTP keepalive to reduce connection overhead.
    • Gzip Compression: Enable gzip for text-based content (HTML, CSS, JS, JSON) to reduce bandwidth.
  • Apache:

    • MPM Module: Choose the right Multi-Processing Module (MPM):
      • mpm_event (default in newer Apache): Good for high concurrency, uses threads.
      • mpm_worker: Also threaded, older than event.
      • mpm_prefork: Uses processes, less memory efficient but sometimes considered more stable for non-thread-safe PHP modules (though mod_php with prefork is less common now than PHP-FPM). When using PHP-FPM with Apache (via mod_proxy_fcgi), Apache's MPM choice is less critical for PHP performance itself, but mpm_event is still generally preferred for Apache's own efficiency.
    • PHP Handler: Use PHP-FPM with mod_proxy_fcgi for better performance and flexibility than mod_php.
    • MaxRequestWorkers / ServerLimit (for event/worker MPMs): Similar to pm.max_children for PHP-FPM, controls concurrent requests Apache can handle.
    • KeepAlive, Gzip, Caching: Similar principles as Nginx.
  • General Web Server Tips:

    • Use HTTPS (SSL/TLS) for security. Modern CPUs have AES acceleration, so performance impact is often minimal. HTTP/2 can also improve performance.
    • Monitor web server logs and PHP-FPM logs for errors or performance issues.
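
Since pm.max_children depends on RAM per child and total available RAM, a first estimate is simple division. The Python sketch below shows that arithmetic. The function name and all sample numbers are assumptions for illustration; measure the real per-child memory on your own server (for example from the resident size of the php-fpm worker processes) before relying on the result.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_child_mb):
    """Rough upper bound for PHP-FPM's pm.max_children.

    total_ram_mb: RAM installed on the web server
    reserved_mb:  RAM to leave for the OS, database, rrdcached, pollers, etc.
    avg_child_mb: measured average memory use of one PHP-FPM child
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return less than 1, even if the reservation exceeds total RAM.
    return max(1, usable_mb // avg_child_mb)


# Example: 8 GB server, 4 GB reserved for everything else, about 64 MB per child.
print(estimate_max_children(total_ram_mb=8192, reserved_mb=4096, avg_child_mb=64))
```

Treat the result as a ceiling rather than a target: starting lower and raising the value while watching memory use is usually safer than starting at the maximum.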

Managing rrdcached

As mentioned, rrdcached is crucial for RRD I/O performance.

  • Configuration:
    • Usually configured in /etc/default/rrdcached (Debian/Ubuntu) or via systemd unit.
    • Key options:
      • OPTS: Command-line options for rrdcached.
        • -l unix:/var/run/rrdcached.sock: Listen on a Unix socket (common). Or -l IP:PORT to listen on a TCP socket (needed if pollers are on different hosts).
        • -w <timeout>: Write timeout (e.g., 1800s). How long data can stay in cache before being flushed.
        • -f <timeout>: Flush timeout (e.g., 3600s). How long before rrdcached forces a flush of all pending writes for an RRD file if it hasn't been updated.
        • -p <pidfile>: Path to PID file.
        • -j <journal_dir>: Directory for RRD journal files (for recovery if rrdcached crashes).
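        • -b <base_dir>: Base directory for RRD files (e.g., /opt/librenms/rrd).
        • -B: Only permit writes inside the base directory given with -b (and its subdirectories). rrdcached already runs in the background by default; -g keeps it in the foreground.
        • -R: Allow recursive subdirectory creation below the base directory (only valid together with -B; check permissions).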
        • -t <num_threads>: Number of write threads.
    • Ensure the Unix socket or TCP port is accessible by LibreNMS pollers (and the web UI if it also writes/reads through rrdcached).
    • Permissions: The rrdcached process must have write access to the RRD directory (/opt/librenms/rrd) and to its journal directory. The librenms user typically owns the RRD files; rrdcached can run as its own user or as the librenms user.
  • LibreNMS config.php:
    $config['rrdcached']    = "unix:/var/run/rrdcached.sock"; // Or "main_server_ip:42217" for TCP
    
  • Monitoring rrdcached:
    • Send the STATS command to the rrdcached socket (e.g., with socat or nc -U) to get statistics from rrdcached (queue length, number of flushes, etc.).
    • Monitor its log files if configured.
  • Sizing Cache: rrdcached primarily caches writes. The actual "cache" size isn't configured like a database buffer pool; it's more about managing write queues and journal files. Ensure sufficient disk space for the journal if enabled.
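
rrdcached reports counters such as its queue length in reply to the STATS command. The following is a minimal sketch for reading one counter from a script. It assumes the reply consists of "Name: value" lines (for example "QueueLength: 0") and that socat is installed; the function name rrdcached_stat and the socket path are illustrative, not part of LibreNMS.

```shell
#!/bin/sh
# Print the value of one counter from rrdcached STATS output read on stdin.
# Assumption: reply lines look like "QueueLength: 0".
rrdcached_stat() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Example against a live daemon (socket path is an assumption; match your -l option):
#   printf 'STATS\nQUIT\n' | socat - UNIX-CONNECT:/var/run/rrdcached.sock | rrdcached_stat QueueLength
```

A queue length that keeps growing between polling cycles usually suggests that writes are not keeping up, which is a reason to look at disk I/O or the -w/-f timeouts.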

Workshop Setting up a Distributed Poller (Conceptual or Simplified)

Objective:
To understand the configuration steps for a distributed poller. A full multi-VM setup can be complex for a workshop. We'll outline the key steps and, if possible, simulate a second "poller" on the same machine for conceptual understanding (not a production setup).

Scenario A: Full Conceptual Outline (Ideal but resource-intensive for a workshop)

  1. VM1: Main LibreNMS Server:

    • Already installed and running LibreNMS, database, web UI.
    • rrdcached running and configured to listen on a TCP socket (e.g., 0.0.0.0:42217).
      # /etc/default/rrdcached on main server
      OPTS="-w 1800 -f 3600 -p /var/run/rrdcached.pid -j /var/lib/rrdcached/journal/ -b /opt/librenms/rrd/ -l 0.0.0.0:42217 -R -B"
      # Ensure firewall allows connections to port 42217 from poller VMs
      
    • In UI: Gear Icon > Pollers > Settings: Ensure poller-wrapper.py is selected if you intend to use it.
    • In UI: Gear Icon > Pollers > Poller Groups: Create a new group, e.g., Group ID 1, Name RemoteSitePollers.
  2. VM2: Distributed Poller Server:

    • Fresh OS (e.g., Ubuntu Server).
    • Install dependencies: git, python3 (for poller-wrapper.py), php-cli, php-snmp, snmp, fping, etc. (core poller dependencies, no web server or DB needed).
    • Clone LibreNMS: sudo git clone https://github.com/librenms/librenms.git /opt/librenms
    • Create librenms user and set permissions on /opt/librenms.
    • Create /opt/librenms/config.php with:
      <?php
      // Database config - point to VM1's database
      $config['db_host'] = 'IP_OF_VM1_MAIN_LIBRENMS';
      $config['db_user'] = 'librenms';
      $config['db_pass'] = 'your_db_password';
      $config['db_name'] = 'librenms';
      
      $config['user'] = 'librenms'; // User LibreNMS runs as
      
      // Distributed Poller Config
      $config['distributed_poller'] = true;
      $config['distributed_poller_name'] = 'poller-vm2'; // Unique name for this poller
      $config['distributed_poller_group'] = 1;          // Assign to group 1 created on main server
      $config['rrdcached']   = 'IP_OF_VM1_MAIN_LIBRENMS:42217'; // Point to central rrdcached
      
      // Ensure this poller doesn't try to run web UI related tasks
      $config['web_dir'] = null;
      $config['install_dir'] = '/opt/librenms';
      
      // Add any other necessary base configs usually found in config.php
      $config['snmp']['community'] = array("public", "your_other_communities");
      // ...
      
    • Copy cron job: sudo cp /opt/librenms/librenms.cron /etc/cron.d/librenms (or the poller-wrapper.py cron setup).
    • RRD directory: The RRDs will be written via rrdcached to VM1. Ensure /opt/librenms/rrd exists on VM2, but it might not be heavily used if all writes go through central rrdcached.
    • Ensure VM2 can reach VM1's database (port 3306) and rrdcached (port 42217). Adjust firewalls.
    • Database user librenms on VM1 must be configured to allow connections from VM2's IP (e.g., GRANT ALL ON librenms.* TO 'librenms'@'IP_OF_VM2' IDENTIFIED BY 'your_db_password';).
  3. On Main LibreNMS Server (VM1):

    • Assign a device (or several) to Poller Group 1 via the device's Edit page.
    • In UI: Gear Icon > Pollers > Pollers: After VM2's poller service starts and communicates, poller-vm2 should appear in the list, associated with group 1.
  4. Verify:

    • Monitor poller logs on VM2 (/opt/librenms/logs/librenms.log).
    • Check poller_perf in the database or UI to see if poller-vm2 is polling its assigned devices.
    • Graphs for devices in group 1 should update.
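
Before the first polling run on VM2, it can help to confirm that the poller dependencies are actually installed. The sketch below is one way to do that; the function name missing_commands is hypothetical, and the command list in the example is an assumption derived from the dependencies named above, so adjust it to your distribution.

```shell
#!/bin/sh
# Print every command from the argument list that is not found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example on the poller VM (command list is an assumption):
#   missing_commands git python3 php snmpget fping
```

Empty output means every listed command was found; anything printed still needs to be installed.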

Scenario B: Simplified Conceptual Simulation (Single Machine - NOT for Production)
This scenario only illustrates the file structure and configuration of a "different" poller. On a single host, poller-wrapper.py already handles multiple poller processes better, so treat the steps below as a basic illustration rather than a working setup.

  1. On your existing LibreNMS server:
    • In UI: Create Poller Group 1.
    • Assign one of your test devices to Poller Group 1.
  2. Simulate a second poller instance configuration:

    • One option is to make a copy of your /opt/librenms directory (e.g., /opt/librenms_poller2). This is messy and not recommended for real use.

      # sudo cp -a /opt/librenms /opt/librenms_poller2 # DANGEROUS if not careful
      
      A better way is to keep a single codebase and let poller-wrapper.py do the work. Run by cron, it spawns multiple poller threads/processes based on CPU cores and configuration. The rest of this scenario focuses on the poller-wrapper.py approach, which is standard.

    • Using poller-wrapper.py and its native multi-poller capabilities:

      • LibreNMS poller-wrapper.py script (run by cron) can manage multiple poller processes/threads on a single machine.
      • Edit /opt/librenms/config.php to define poller group for the current instance if it's not default 0, or rely on poller_id in pollers table.
      • The poller-wrapper.py itself handles running pollers for devices assigned to groups that this poller instance is part of.
      • To truly have a "distributed" poller, it needs to be on a separate host.
    • Conceptual Configuration for a Separate Poller (if it were on another machine): If you had another machine, you'd set its /opt/librenms/config.php to point to the central DB and rrdcached, and assign it a unique poller name and a poller group.

      // On hypothetical separate poller machine's config.php
      $config['db_host'] = 'IP_OF_MAIN_LIBRENMS'; // ... other DB settings
      $config['distributed_poller'] = true;
      $config['distributed_poller_name'] = 'secondary-poller-host';
      $config['distributed_poller_group'] = 1; // Or another group number
      $config['rrdcached'] = 'IP_OF_MAIN_LIBRENMS:42217';
      
      Then its cron job would run pollers for devices in group 1.

Key Takeaway for the Workshop:
The core is understanding that:

  1. Distributed pollers are separate LibreNMS poller instances (code + cron).
  2. They connect to a central database and often a central rrdcached.
  3. They are assigned to a poller group in their config.php.
  4. Devices in the UI are assigned to these poller groups.
  5. Firewall rules must allow communication between pollers and the central services.

For practical experience without multiple VMs:

  • Ensure poller-wrapper.py is used by your cron job (this is default in newer installs).
    # /etc/cron.d/librenms
    # */5  *    * * * librenms /opt/librenms/poller-wrapper.py 16 >> /dev/null 2>&1
    # The '16' is an example for number of threads.
    
  • Go to Gear Icon > Pollers > Poller Groups and create a group (e.g., group 1).
  • Edit one of your devices and assign it to this new poller group 1.
  • Go to Gear Icon > Pollers > Pollers. Your main poller (e.g., your server's hostname) will be listed. If no specific poller is configured for a group, the main poller is likely associated with several groups, including the default (e.g., -1 or 0) and any new groups you add devices to.
  • If you were to add another poller server, it would register itself with its name and group, and then only pollers in that specific group would handle devices assigned to it.

This conceptual understanding is more feasible for a typical workshop environment than a full multi-VM setup unless dedicated lab resources are available. The main point is to grasp the configuration and data flow.

11. Security Best Practices

Securing your LibreNMS installation is paramount, as it contains sensitive information about your network infrastructure and credentials to access monitored devices.

Securing the Web Interface (HTTPS, Authentication)

  • HTTPS (SSL/TLS):

    • Always use HTTPS for the LibreNMS web interface to encrypt traffic between users and the server. This protects login credentials and all viewed data.
    • Obtain an SSL Certificate:
      • Let's Encrypt (Recommended for public servers): Free, automated certificates. Use tools like certbot.
        sudo apt install certbot python3-certbot-nginx # For Nginx
        sudo certbot --nginx -d your.librenms.domain.com
        # Certbot will obtain the cert, configure Nginx, and set up auto-renewal.
        
      • Commercial Certificates: Purchase from a Certificate Authority (CA).
      • Self-Signed Certificates: Can be used for internal-only access, but browsers will show warnings. Not recommended if users access from outside a trusted zone.
    • Configure Web Server for HTTPS:
      • Nginx: certbot usually handles this. Manually, you'd modify your Nginx virtual host:
        listen 443 ssl http2;
        listen [::]:443 ssl http2;
        ssl_certificate /etc/letsencrypt/live/your.librenms.domain.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/your.librenms.domain.com/privkey.pem;
        # Add other SSL hardening options (protocols, ciphers, HSTS)
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_prefer_server_ciphers off; # Or on with a strong cipher list
        ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384';
        add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
        
      • Apache: Similar configuration using SSLEngine on and specifying certificate paths.
    • HTTP to HTTPS Redirection: Configure your web server to automatically redirect all HTTP requests to HTTPS.
  • Strong Authentication:

    • User Passwords: Enforce strong, unique passwords for all LibreNMS user accounts.
    • Two-Factor Authentication (2FA): Highly recommended. LibreNMS supports 2FA (e.g., TOTP with Google Authenticator). An administrator enables the feature in config.php, and users then activate it in their profile settings:
      $config['twofactor'] = true; // Enable two-factor authentication support
      
    • Centralized Authentication (LDAP/RADIUS/SAML):
      • Integrate with existing identity providers like Active Directory (LDAP), FreeIPA (LDAP), or SAML providers.
      • This centralizes user management, password policies, and can enforce organizational security standards.
      • Configuration is done in Global Settings > Authentication.
  • Access Control:
    • Use LibreNMS user roles (Administrator, Normal User, Global Read) appropriately.
    • Limit administrator access to only those who absolutely need it.
    • Use device groups to restrict what normal users can see and manage.
  • Web Application Firewall (WAF):
    • Consider placing a WAF (like ModSecurity with Nginx/Apache, or a cloud WAF) in front of LibreNMS to protect against common web attacks (SQL injection, XSS), although LibreNMS itself is generally well-written to prevent these.
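
After enabling HTTPS and the Strict-Transport-Security header, a quick scripted check can confirm that the header is really sent. This is a minimal sketch; the function name has_hsts is hypothetical, and the hostname in the example is the placeholder used above.

```shell
#!/bin/sh
# Succeed if the HTTP response headers on stdin include Strict-Transport-Security.
# Header names are matched case-insensitively; carriage returns are stripped first.
has_hsts() {
    tr -d '\r' | grep -qi '^strict-transport-security:'
}

# Example (hostname is a placeholder):
#   curl -sI https://your.librenms.domain.com | has_hsts && echo "HSTS enabled"
```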

Hardening the Operating System

The underlying OS of your LibreNMS server (and any distributed pollers) must be secured.

  • Minimize Attack Surface:
    • Install only necessary packages. Start with a minimal server install.
    • Disable or remove unused services and daemons.
    • Close unneeded network ports using a firewall.
  • Firewall:
    • Use a host-based firewall (e.g., ufw on Ubuntu, firewalld on CentOS/RHEL).
    • Allow only essential inbound traffic:
      • SSH (TCP 22) - preferably restricted to trusted IPs.
      • HTTP (TCP 80) - if used for Let's Encrypt validation or redirecting to HTTPS.
      • HTTPS (TCP 443) - for the web UI.
      • SNMP (UDP 161) - if monitoring the LibreNMS server itself.
      • SNMP Traps (UDP 162) - if LibreNMS is configured to receive traps.
      • Database port (TCP 3306) - if accessed by remote pollers.
      • rrdcached port - if accessed by remote pollers.
      • Syslog port (UDP/TCP 514) - if receiving syslog.
  • Regular Updates:
    • Keep the OS and all installed packages up-to-date with security patches.
    • sudo apt update && sudo apt upgrade -y (Ubuntu/Debian)
    • Configure automatic updates for security patches if appropriate for your policy.
  • Secure SSH:
    • Disable root login: PermitRootLogin no in /etc/ssh/sshd_config.
    • Use key-based authentication instead of passwords.
    • Change the default SSH port (security by obscurity, but can reduce automated scans).
    • Use Fail2Ban or sshguard to block IPs that attempt brute-force SSH logins.
  • User Accounts:
    • Use non-root users with sudo for administration.
    • Ensure the librenms user has minimal necessary privileges and a non-login shell if it doesn't need to log in directly (sudo usermod -s /usr/sbin/nologin librenms). (Note: some scripts might require a shell, so test carefully. /bin/bash is often set for librenms user).
  • Intrusion Detection/Prevention (IDS/IPS):
    • Consider host-based IDS like AIDE (Advanced Intrusion Detection Environment) for file integrity monitoring or OSSEC/Wazuh.
  • Logging and Auditing:
    • Ensure system logs (/var/log/syslog, /var/log/auth.log) are regularly monitored or forwarded to a central SIEM.
    • Enable auditd for more detailed system call auditing if needed.
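
The firewall guidance above restricts the database and rrdcached ports to known poller addresses. The sketch below prints the matching ufw commands for review rather than running them. The function name poller_ufw_rules is hypothetical, and ports 3306 and 42217 are the ones used in this guide's examples; change them if your setup differs.

```shell
#!/bin/sh
# Print (do not run) ufw commands that open the database and rrdcached ports
# to a single remote poller IP address.
poller_ufw_rules() {
    poller_ip=$1
    for port in 3306 42217; do
        printf 'ufw allow from %s to any port %s proto tcp\n' "$poller_ip" "$port"
    done
}

# Example: review the output, then run each line with sudo.
#   poller_ufw_rules 192.0.2.10
```

Printing the rules first keeps the change reviewable, which matters on a remote server where a wrong firewall rule can lock you out.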

Database Security

  • Strong Passwords: Use strong, unique passwords for the MariaDB/MySQL root user and the librenms database user.
  • Network Access:
    • By default, MariaDB/MySQL often binds to 127.0.0.1 (bind-address = 127.0.0.1 in my.cnf). If your database is on the same server as LibreNMS and no remote pollers need access, this is the most secure.
    • If remote pollers need database access, bind to the specific internal IP address and use firewall rules to restrict access to only the IPs of your LibreNMS server and poller machines.
  • User Privileges:
    • The librenms database user should only have privileges on the librenms database itself (GRANT ALL PRIVILEGES ON librenms.* TO ...). Avoid granting global privileges.
  • Encryption:
    • Consider enabling SSL/TLS for connections to the database server if it's accessed over an untrusted network (e.g., between remote pollers and the central DB).
    • Explore MariaDB/MySQL Transparent Data Encryption (TDE) for data-at-rest encryption if required by compliance.
  • Regular Backups: Essential (covered below).
  • Remove Test Database and Anonymous Users: mysql_secure_installation usually handles this.
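
To confirm which address the database server is configured to listen on, the bind-address setting can be read from the configuration file. The following is a rough sketch; the function name get_bind_address is hypothetical, and it only inspects the single file you pass in, so settings in other included files are not considered.

```shell
#!/bin/sh
# Print the last active bind-address value found in a MariaDB/MySQL config file.
# Commented-out lines are ignored; prints nothing if the option is absent.
get_bind_address() {
    awk -F'=' '
        /^[ \t]*bind-address[ \t]*=/ {
            value = $2
            gsub(/[ \t]/, "", value)
            found = value
        }
        END { if (found != "") print found }
    ' "$1"
}

# Example (path as used on Debian/Ubuntu with MariaDB):
#   get_bind_address /etc/mysql/mariadb.conf.d/50-server.cnf
```

A result of 127.0.0.1 matches the local-only setup described above; any other value should be paired with firewall rules that limit who can reach the port.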

SNMPv3 Configuration for Enhanced Security

SNMPv1 and v2c rely on community strings (plain text passwords), which are insecure. SNMPv3 provides authentication and encryption.

  • Benefits of SNMPv3:
    • Authentication: Verifies the identity of the NMS and the agent.
    • Encryption (Privacy): Protects data in transit from eavesdropping.
    • Message Integrity: Ensures messages are not tampered with.
  • Configuring SNMPv3 on a Managed Device (e.g., Linux snmpd):
    1. Stop snmpd service.
    2. Create SNMPv3 User: Use net-snmp-create-v3-user utility or manually edit snmpd.conf and /var/lib/snmp/snmpd.conf (or /var/net-snmp/snmpd.conf).
      • Example using net-snmp-create-v3-user (this command might not be available on all systems or might be part of a sub-package):
        # This command often modifies /var/lib/snmp/snmpd.conf directly
        # sudo net-snmp-create-v3-user -ro -A YourAuthPassword -X YourPrivPassword -a SHA -x AES snmpv3user
        # -ro: read-only user
        # -A: Authentication password
        # -X: Privacy (encryption) password
        # -a: Authentication protocol (SHA or MD5)
        # -x: Privacy protocol (AES or DES)
        # snmpv3user: the username
        
      • Manual snmpd.conf configuration: In /etc/snmp/snmpd.conf:
        # 'master agentx' is only needed if AgentX subagents attach to snmpd; SNMPv3 users do not require it
        # master agentx
        
        # Create a read-only user with authentication (SHA) and privacy (AES)
        # Replace YourAuthPassword and YourPrivPassword with strong, unique passwords
        # Minimum password length is 8 characters for net-snmp.
        createUser snmpv3user SHA "YourAuthPassword" AES "YourPrivPassword"
        
        # Grant read-only access to this user for the entire MIB tree
        rouser snmpv3user priv .1
        # 'priv' means both authentication and privacy are required.
        # 'auth' would mean only authentication is required (no encryption).
        # '.1' gives access to the entire MIB tree starting from .iso(1)
        
        Note: The createUser directive in /etc/snmp/snmpd.conf is processed when snmpd starts. The resulting user is then stored, with localized keys instead of the plain passwords, in a persistent configuration file, usually /var/lib/snmp/snmpd.conf or /var/net-snmp/snmpd.conf. Once the user exists there, you can remove the createUser line from /etc/snmp/snmpd.conf. This avoids possible errors on subsequent restarts and keeps the passwords from sitting in the file in plain text. Alternatively, manage users directly in the persistent file while snmpd is stopped.
    3. Start snmpd service.
    4. Test SNMPv3 from LibreNMS server:
      snmpwalk -v3 -l authPriv -u snmpv3user -a SHA -A YourAuthPassword -x AES -X YourPrivPassword TARGET_DEVICE_IP system
      # -l authPriv: Security level (authentication and privacy)
      # -u snmpv3user: Username
      # -a SHA: Authentication protocol
      # -A YourAuthPassword: Authentication password
      # -x AES: Privacy protocol
      # -X YourPrivPassword: Privacy password
      
  • Adding SNMPv3 Device in LibreNMS:
    • When adding or editing a device in LibreNMS UI:
      • Select SNMP Version: v3.
      • Auth Level: Choose authPriv, authNoPriv, or noAuthNoPriv.
      • Auth Username: snmpv3user.
      • Auth Algo: SHA (or MD5).
      • Auth Password: YourAuthPassword.
      • Crypto Algo: AES (or DES).
      • Crypto Password: YourPrivPassword.

Transitioning to SNMPv3 significantly improves the security of your monitoring data. It's more complex to set up but highly recommended.
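
When rolling SNMPv3 out to many hosts, generating the createUser line from a script helps avoid passphrases that net-snmp would reject. This is a minimal sketch that assumes SHA and AES as in the examples above; the function name snmpv3_createuser_line is hypothetical.

```shell
#!/bin/sh
# Print a createUser line for snmpd.conf, refusing passphrases shorter than
# 8 characters (the net-snmp minimum mentioned above).
snmpv3_createuser_line() {
    user=$1 auth_pass=$2 priv_pass=$3
    if [ "${#auth_pass}" -lt 8 ] || [ "${#priv_pass}" -lt 8 ]; then
        echo "passphrases must be at least 8 characters" >&2
        return 1
    fi
    printf 'createUser %s SHA "%s" AES "%s"\n' "$user" "$auth_pass" "$priv_pass"
}

# Example:
#   snmpv3_createuser_line snmpv3user YourAuthPassword YourPrivPassword
```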

Regular Backups and Disaster Recovery

Data loss can be catastrophic. Regular backups of your LibreNMS server are essential.

  • What to Back Up:

    1. LibreNMS Database: This contains device configurations, alert rules, user accounts, event history, etc.
      • Use mysqldump to create a logical backup:
        mysqldump -u librenms -pYour_DB_Password librenms | gzip > /backup_path/librenms_db_backup_$(date +%Y%m%d_%H%M%S).sql.gz
        
        (Ensure the backup user has necessary privileges like SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER)
    2. RRD Files: These store all your historical graph data.
      • Located in /opt/librenms/rrd/.
      • Can be backed up using rsync, tar, or filesystem snapshots. RRD files can be numerous and take up space.
        sudo rsync -avz /opt/librenms/rrd/ /backup_path/rrd_backup/
        # Or tar:
        # sudo tar -czvf /backup_path/librenms_rrd_backup_$(date +%Y%m%d).tar.gz /opt/librenms/rrd
        
        Note: Backing up live RRD files can sometimes lead to slightly inconsistent files if writes occur during backup. Stopping rrdcached and pollers during RRD backup is safest but causes a monitoring gap. Alternatively, if using LVM, take an LVM snapshot and back up from the snapshot.
    3. LibreNMS Configuration Files:
      • /opt/librenms/.env: Contains database credentials and other critical settings.
      • /opt/librenms/config.php: Your main custom configuration.
      • Custom OS definitions, MIBs, poller/discovery scripts, application monitor definitions if you created them.
      • Web server configuration (Nginx/Apache virtual hosts).
      • PHP configuration.
      • Cron jobs (/etc/cron.d/librenms).
    4. The LibreNMS Application Code (/opt/librenms/): While this can be re-cloned from Git, backing up your specific version along with local modifications (if any, though not recommended for core files) can be useful.
  • Backup Strategy:

    • Frequency:
      • Database: Daily or more frequently for critical systems.
      • RRDs: Daily or weekly (depends on how much historical graph data loss you can tolerate). RRDs are less critical than the DB for immediate operational recovery but vital for historical trends.
      • Configuration files: After every significant change, and regularly (e.g., daily).
    • Retention: Keep multiple backup versions (e.g., daily for a week, weekly for a month, monthly for a year).
    • Location: Store backups on a separate physical server, NAS, or cloud storage (e.g., S3). Follow the 3-2-1 backup rule (3 copies, 2 different media, 1 offsite).
    • Automation: Use cron jobs and scripting to automate backups.
    • Testing: Regularly test your backup restoration process to ensure backups are valid and you know how to recover. This is the most crucial and often overlooked step.
  • Disaster Recovery Plan:

    • Document the steps to rebuild your LibreNMS server from scratch using your backups.
    • Include OS installation, dependency setup, restoring database, RRDs, and configurations.
    • Consider how long the recovery process will take (Recovery Time Objective - RTO) and how much data loss is acceptable (Recovery Point Objective - RPO).
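
The retention advice above can be automated with a small pruning step that runs after each database dump. The sketch below assumes the timestamped file names from the mysqldump example (librenms_db_backup_YYYYmmdd_HHMMSS.sql.gz), which sort chronologically by name; the function name prune_db_backups is hypothetical.

```shell
#!/bin/sh
# Delete all but the newest $2 database dumps in directory $1.
# Relies on timestamped names so that alphabetical order equals age order.
prune_db_backups() {
    dir=$1 keep=$2
    total=$(ls "$dir"/librenms_db_backup_*.sql.gz 2>/dev/null | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    # The oldest files sort first, so remove the first $excess entries.
    ls "$dir"/librenms_db_backup_*.sql.gz | sort | head -n "$excess" |
        while IFS= read -r file; do
            rm -f -- "$file"
        done
}

# Example: keep the seven most recent dumps.
#   prune_db_backups /backup_path 7
```

When the dump itself runs from cron, supplying credentials through a protected option file (mysqldump's --defaults-extra-file) instead of -p on the command line keeps the password out of the process list.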

Workshop Implementing HTTPS and SNMPv3

Objective:
To secure the LibreNMS web interface with a Let's Encrypt HTTPS certificate and reconfigure one monitored device (e.g., localhost) to use SNMPv3.

Prerequisites:

  • LibreNMS installed and accessible via HTTP.
  • The LibreNMS server must be publicly accessible on ports 80 and 443 from the internet for Let's Encrypt validation using HTTP-01 or TLS-ALPN-01 challenge (unless using DNS-01 challenge which is more complex). If your server is not public, you can only do the SNMPv3 part or use a self-signed certificate for HTTPS (which will generate browser warnings).
  • A registered domain name pointing to your LibreNMS server's public IP address. (e.g., librenms.yourdomain.com).
  • certbot installed (as shown in theory section).

Part 1: Securing Web Interface with Let's Encrypt HTTPS
(Skip this part if your server is not publicly accessible or you don't have a domain name. You can still learn from the steps.)

  1. Ensure DNS is Set Up:
    • Your domain (e.g., librenms.yourdomain.com) must resolve to the public IP address of your LibreNMS server.
  2. Install Certbot (if not already done):
    • Assuming Nginx:
      sudo apt update
      sudo apt install certbot python3-certbot-nginx -y
      
  3. Keep Nginx Running (the --nginx plugin used in the next step works through the running Nginx instance. Stopping Nginx is only needed if you use Certbot's --standalone mode instead, which binds port 80 itself):
    # sudo systemctl stop nginx   # only for --standalone mode
    
  4. Obtain and Install Certificate:
    • Replace librenms.yourdomain.com with your actual domain and provide your email.
      sudo certbot --nginx -d librenms.yourdomain.com --agree-tos -m your-email@example.com --no-eff-email
      
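    • Depending on the Certbot version, you may be asked whether to redirect HTTP traffic to HTTPS. If prompted, choose option 2 (Redirect). You can also pass --redirect to request the redirect explicitly.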
    • If successful, Certbot will configure Nginx to use the SSL certificate and set up automatic renewal.
  5. Restart Nginx (if you stopped it manually, or Certbot might do it):
    # sudo systemctl start nginx
    # Or check status:
    sudo systemctl status nginx
    
  6. Verify HTTPS:
    • Open your web browser and navigate to https://librenms.yourdomain.com.
    • You should see a padlock icon indicating a secure connection.
    • Try accessing via http://librenms.yourdomain.com – it should automatically redirect to HTTPS.
  7. Check Auto-Renewal:
    • Certbot sets up a cron job or systemd timer for renewal. Test it:
      sudo certbot renew --dry-run
      
    • This should complete without errors.

Part 2: Reconfiguring localhost (LibreNMS Server) for SNMPv3

  1. Configure snmpd on LibreNMS Server for SNMPv3:

    • Log in to your LibreNMS server via SSH.
    • Stop snmpd:
      sudo systemctl stop snmpd
      
    • Edit /etc/snmp/snmpd.conf. Remove or comment out existing rocommunity or rwcommunity lines for localhost if they conflict.
      sudo vim /etc/snmp/snmpd.conf
      
      Add the following (replace passwords with your own strong ones, min 8 chars):
      # master agentx # Uncomment if needed and not already present
      createUser librenms_v3user SHA "Str0ngAuthP@sswOrd" AES "Str0ngPrivP@sswOrd"
      rouser librenms_v3user priv .1
      
      (As noted before, after snmpd starts once and creates the user in its persistent store (e.g., /var/lib/snmp/snmpd.conf), you might remove/comment the createUser line from /etc/snmp/snmpd.conf to prevent errors on subsequent restarts, or simply ignore the startup warnings if they occur. The user will persist.)
    • Start snmpd:
      sudo systemctl start snmpd
      sudo systemctl status snmpd # Check it's running
      
  2. Test SNMPv3 Locally:

    snmpwalk -v3 -l authPriv -u librenms_v3user -a SHA -A "Str0ngAuthP@sswOrd" -x AES -X "Str0ngPrivP@sswOrd" localhost system
    
    You should get SNMP output. If not, troubleshoot snmpd.conf, passwords, or check journalctl -u snmpd for errors.

  3. Update localhost Device in LibreNMS UI for SNMPv3:

    • Navigate to your LibreNMS instance (now via HTTPS if you did Part 1).
    • Go to Devices > All Devices, click on localhost.
    • Click the Edit icon (Cog).
    • Go to the SNMP tab (or section).
    • SNMP Version: Select v3.
    • Auth Level: authPriv.
    • Auth Username: librenms_v3user.
    • Auth Algo: SHA.
    • Auth Password: Str0ngAuthP@sswOrd.
    • Crypto Algo: AES. (Ensure it's AES-128, which is typical. If snmpd uses AES-192 or AES-256, ensure LibreNMS choice matches. Default AES in net-snmp is AES-128).
    • Crypto Password: Str0ngPrivP@sswOrd.
    • SNMP Port / Timeout / Retries / etc.: Leave as default unless changed on snmpd.
    • Click Save Changes (or "Update Device").
  4. Verify Polling:

    • Wait for the next polling cycle (up to 5 minutes).
    • The localhost device page should continue to update with new data.
    • Check the Event Log for localhost. You should see successful SNMPv3 polling events. If you see errors, double-check all SNMPv3 parameters in LibreNMS and on the snmpd configuration. Pay close attention to passwords and Auth/Crypto algorithms.

Deliverables/Reflection:

  • LibreNMS web interface accessible via HTTPS with a valid certificate (if Part 1 was feasible).
  • The localhost device successfully monitored by LibreNMS using SNMPv3.
  • Understanding of the steps to configure SNMPv3 on a Linux host and update device settings in LibreNMS.

This workshop enhances the security of your LibreNMS setup significantly by encrypting web traffic and SNMP communication for at least one device. You can apply the SNMPv3 process to other monitored devices as well.
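
Once a host has moved to SNMPv3, it is worth checking that no SNMPv1/v2c community access is left in its snmpd.conf. The following sketch lists active rocommunity and rwcommunity lines (including their IPv6 variants). The function name active_community_lines is hypothetical, and communities defined through com2sec lines are not covered by this check.

```shell
#!/bin/sh
# Print active (uncommented) rocommunity/rwcommunity lines from an snmpd.conf.
# Prints nothing when no such community access remains.
active_community_lines() {
    grep -E '^[[:space:]]*(rocommunity6?|rwcommunity6?)[[:space:]]' "$1" || true
}

# Example:
#   active_community_lines /etc/snmp/snmpd.conf
```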

12. Troubleshooting LibreNMS

Even with careful setup, you may encounter issues with LibreNMS. Knowing how to troubleshoot common problems is a vital skill. This involves understanding logs, using built-in debugging tools, and approaching problems systematically.

Common Installation Issues

  • PHP Version/Extension Mismatches:

    • Symptom: Web installer fails, white screen of death in the browser, errors in web server logs referencing undefined PHP functions or classes.
    • Cause: LibreNMS requires a specific minimum PHP version and a set of PHP extensions. If the installed PHP version is too old, or if critical extensions are missing or not enabled, the application will fail to run.
    • Troubleshooting:
      1. Verify PHP Version: Open a terminal on your LibreNMS server and run php -v. Compare this version with the required version stated in the official LibreNMS installation documentation.
      2. Check Loaded PHP Extensions: Run php -m. This lists all compiled and loaded PHP modules. Cross-reference this list with the required extensions in the LibreNMS documentation (e.g., gd, pdo_mysql, snmp, xml, mbstring, tokenizer, json, curl, zip, bcmath, gmp, intl).
      3. Check Web Server PHP Configuration: Ensure that the PHP version and extensions used by your web server (Nginx with PHP-FPM, or Apache with mod_php or PHP-FPM) are the same as the CLI. Sometimes, different php.ini files are used for CLI and FPM/web.
        • For PHP-FPM, check the FPM pool configuration (e.g., /etc/php/YOUR_VERSION/fpm/pool.d/www.conf or a dedicated librenms.conf) and the main php.ini for FPM (e.g., /etc/php/YOUR_VERSION/fpm/php.ini).
      4. Examine Web Server Error Logs: These are invaluable.
        • Nginx: Typically /var/log/nginx/error.log (and possibly a site-specific error log like /var/log/nginx/librenms.error.log).
        • Apache: Typically /var/log/apache2/error.log or /var/log/httpd/error_log. These logs will often contain specific PHP errors pointing to the missing function or extension.
    • Fix:
      • If PHP version is incorrect, upgrade/downgrade PHP using your OS package manager (you might need to use third-party repositories like ppa:ondrej/php for Ubuntu to get specific versions).
      • If extensions are missing, install them using your package manager (e.g., sudo apt install phpYOUR_VERSION-snmp phpYOUR_VERSION-gd, such as php8.2-snmp and php8.2-gd).
      • After installing/enabling extensions or changing PHP versions, restart PHP-FPM (e.g., sudo systemctl restart phpYOUR_VERSION-fpm) and your web server (e.g., sudo systemctl restart nginx).
  • Database Connection Failures:

    • Symptom: Web installer cannot connect to the database; errors during ./validate.php execution; the LibreNMS UI shows "Database connection error" or similar messages; pollers fail with database errors in librenms.log.
    • Cause: Incorrect database credentials in .env, database server not running, firewall blocking the database port (default 3306 for MySQL/MariaDB), incorrect bind-address in the database server configuration preventing connections from the LibreNMS host, or the database user lacks permissions.
    • Troubleshooting:
      1. Verify Database Credentials: Check the /opt/librenms/.env file. Ensure DB_HOST, DB_DATABASE, DB_USERNAME, and DB_PASSWORD are correct.
      2. Check Database Service Status: Ensure MariaDB/MySQL is running: sudo systemctl status mariadb (or mysql). If not running, try to start it: sudo systemctl start mariadb. Check its logs (journalctl -u mariadb or MySQL error log) if it fails to start.
      3. Test Manual Database Connection: From the LibreNMS server command line, try to connect to the database using the mysql client with the same credentials:
        mysql -h YOUR_DB_HOST -u YOUR_DB_USERNAME -pYOUR_DB_PASSWORD YOUR_DB_DATABASE
        # Example: mysql -h localhost -u librenms -pMyLibreNMSPassword librenms
        
        If this fails, the issue is likely with credentials, host, or user permissions.
      4. Check bind-address: In your MariaDB/MySQL configuration file (e.g., /etc/mysql/mariadb.conf.d/50-server.cnf), the bind-address directive controls which network interfaces the database server listens on.
        • If it's 127.0.0.1, it only accepts connections from localhost. This is fine if LibreNMS and the database are on the same server.
        • If LibreNMS (or distributed pollers) are on different hosts, bind-address must be 0.0.0.0 (listen on all interfaces) or the specific IP address of the interface LibreNMS connects through.
      5. Firewall Rules: Ensure your firewall (on the DB server if separate, or on the LibreNMS server if it's local) allows connections to the database port (e.g., TCP 3306) from the LibreNMS application server/pollers.
      6. Database User Permissions: Verify that the librenms database user has the necessary privileges on the librenms database from the correct host(s). Log into MySQL as root:
        SHOW GRANTS FOR 'librenms'@'localhost';
        -- If connecting from another host:
        -- SHOW GRANTS FOR 'librenms'@'your_librenms_app_server_ip';
        
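        It should have ALL PRIVILEGES on the librenms database. If not, grant them. MySQL 8.0 and later no longer accept IDENTIFIED BY inside GRANT, so create the user first if it does not exist:
        CREATE USER IF NOT EXISTS 'librenms'@'localhost' IDENTIFIED BY 'your_password';
        GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
        -- Or for a remote host:
        -- CREATE USER IF NOT EXISTS 'librenms'@'your_librenms_app_server_ip' IDENTIFIED BY 'your_password';
        -- GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'your_librenms_app_server_ip';
        FLUSH PRIVILEGES;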
        
    • Fix: Correct credentials in .env, start the DB server, adjust bind-address and firewall, or fix DB user grants. Restart relevant services.
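
The first troubleshooting step, reviewing the DB_* values in .env, can be sketched as a small POSIX shell helper. The function names env_value and show_db_settings are invented for this illustration and are not part of LibreNMS. The sketch assumes plain KEY=value lines (optionally quoted) and deliberately never prints the password.

```shell
# Print the value of KEY from a dotenv-style file (KEY=value, optional quotes).
# Commented-out lines are ignored; if a key appears twice, the last one wins.
env_value() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file" | tail -n 1 |
        sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}

# Summarise the database settings, masking the password.
show_db_settings() {
    file=$1
    for key in DB_HOST DB_DATABASE DB_USERNAME; do
        printf '%s=%s\n' "$key" "$(env_value "$key" "$file")"
    done
    if [ -n "$(env_value DB_PASSWORD "$file")" ]; then
        echo "DB_PASSWORD=(set)"
    else
        echo "DB_PASSWORD=(empty or missing)"
    fi
}

# Usage on a LibreNMS host:
#   show_db_settings /opt/librenms/.env
```

Comparing this output with the host, database, and user that work in a manual mysql connection usually narrows the problem down to either the .env file or the database server itself.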
  • File Permissions Issues:

    • Symptom: Web UI shows errors about being unable to write to directories (logs, RRDs, cache); graphs are not updating; images/CSS/JS fail to load; ./validate.php reports permission errors.
    • Cause: Incorrect ownership or permissions for LibreNMS directories (/opt/librenms/rrd, /opt/librenms/logs, /opt/librenms/bootstrap/cache/, /opt/librenms/storage/). The librenms user and the web server user (e.g., www-data or nginx) need appropriate access.
    • Troubleshooting:
      1. Run ./validate.php from /opt/librenms/. It often detects and suggests fixes for permission issues.
      2. Carefully review the ownership and permissions steps in the official LibreNMS installation guide for your OS. Common commands involve chown -R librenms:librenms /opt/librenms and setfacl to grant group write access to specific subdirectories for the web server user (which should be in the librenms group).
        # Example ownership/permissions (consult official docs for exact, up-to-date commands)
        sudo chown -R librenms:librenms /opt/librenms
        sudo setfacl -d -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
        sudo setfacl -R -m g::rwx /opt/librenms/rrd /opt/librenms/logs /opt/librenms/bootstrap/cache/ /opt/librenms/storage/
        # Ensure web server user (e.g., www-data) is in the librenms group:
        # sudo usermod -a -G librenms www-data
        
    • Fix: Apply the correct ownership and permissions as per the documentation. You might need to clear the caches afterwards. Run these from /opt/librenms: sudo -u librenms php artisan cache:clear and sudo -u librenms php artisan config:clear.
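
To see where ownership or group write access is missing before re-applying chown and setfacl, a read-only check along these lines may help. It is a minimal sketch: missing_group_write and not_owned_by are hypothetical helper names, and the check only looks at the owning user and the group permission bits, so ACL entries for named users or groups still need getfacl.

```shell
# List directories under the given paths that lack the group-write bit.
missing_group_write() {
    for dir in "$@"; do
        [ -d "$dir" ] || { echo "missing: $dir"; continue; }
        find "$dir" -type d ! -perm -020 -print
    done
}

# List anything under the given paths that is not owned by the expected user.
not_owned_by() {
    owner=$1
    shift
    find "$@" ! -user "$owner" -print 2>/dev/null
}

# Usage on a LibreNMS host:
#   missing_group_write /opt/librenms/rrd /opt/librenms/logs \
#       /opt/librenms/bootstrap/cache /opt/librenms/storage
#   not_owned_by librenms /opt/librenms/rrd /opt/librenms/logs
```

Both functions only read metadata, so they are safe to run repeatedly while you adjust permissions.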
  • Web Server Configuration Errors (Nginx/Apache):

    • Symptom: 403 Forbidden, 404 Not Found, 500 Internal Server Error when accessing LibreNMS UI; PHP code displayed instead of being executed.
    • Cause: Incorrect web server virtual host configuration (wrong root directory, incorrect PHP processing setup, alias issues, rewrite rules missing).
    • Troubleshooting:
      1. Compare your Nginx/Apache site configuration file for LibreNMS with the example provided in the official LibreNMS documentation. Pay close attention to:
        • root directive (should point to /opt/librenms/html).
        • server_name.
        • PHP block (location ~ \.php$ for Nginx, or SetHandler / ProxyPassMatch for Apache with PHP-FPM) ensuring it correctly passes PHP files to the PHP-FPM socket or mod_php.
        • index directive (should include index.php).
        • Any rewrite rules specified by LibreNMS.
      2. Test web server configuration: sudo nginx -t or sudo apachectl configtest.
      3. Check web server error logs and access logs for more details.
    • Fix: Correct the web server configuration file, then restart the web server.
  • Cron Job Not Running or Misconfigured:

    • Symptom: Devices not being polled (graphs flatline, "Last Polled" time doesn't update); new devices not discovered; alerts not triggering; daily.sh tasks not running. ./validate.php might warn about pollers or daily.sh not running.
    • Cause: The LibreNMS cron entry (/etc/cron.d/librenms) is missing, commented out, has incorrect syntax, or the cron daemon itself is not running. The user specified in the cron job might not have permissions to execute LibreNMS scripts.
    • Troubleshooting:
      1. Verify the cron file exists: cat /etc/cron.d/librenms. It should contain lines similar to:
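        # Example based on the cron file shipped in /opt/librenms/dist/; check the official docs for your version
        33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
        */5 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
        */5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
        15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
        # ... and others
        
        (Note: With the default 5-minute polling interval, the poller runs every five minutes, full discovery runs every six hours, and discovery of newly added devices runs every five minutes. Recent LibreNMS releases additionally run the Laravel scheduler once a minute, typically through the librenms-scheduler systemd timer from the dist/ directory or an equivalent cron entry that calls php artisan schedule:run from /opt/librenms. Polling can also be handled by the dispatcher service instead of cron. Follow the install guide for your version.)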
      2. Check cron daemon status: sudo systemctl status cron (the service is named crond on RHEL-based distributions).
      3. Check system logs (e.g., /var/log/syslog or journalctl -u cron) for messages related to cron jobs being run (or failing). Search for "librenms" or the script names.
      4. Ensure the user in the cron file (usually librenms) can execute the scripts and has the correct environment (e.g., PATH).
      5. Manually run the scripts as the librenms user to see if they execute without error:
        sudo -u librenms /opt/librenms/poller-wrapper.py # Or ./poller.php -h <some_device_id>
        sudo -u librenms /opt/librenms/discovery.php -h all
        sudo -u librenms /opt/librenms/daily.sh
        
    • Fix: Create or correct the /etc/cron.d/librenms file using the example from LibreNMS documentation. Ensure cron daemon is running. Fix script permissions if necessary.
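
A quick way to spot a missing or commented-out entry is to look only at the active lines of the cron file. The sketch below assumes nothing beyond standard grep; active_cron_lines and check_cron_scripts are illustrative names, and the script names passed in are whichever ones your LibreNMS version expects.

```shell
# Print the active (non-comment, non-blank) lines of a cron file.
active_cron_lines() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Report whether each expected script is referenced by an active entry.
check_cron_scripts() {
    file=$1
    shift
    for script in "$@"; do
        if active_cron_lines "$file" | grep -q -F -- "$script"; then
            echo "found:   $script"
        else
            echo "MISSING: $script"
        fi
    done
}

# Usage on a LibreNMS host:
#   check_cron_scripts /etc/cron.d/librenms poller-wrapper.py daily.sh
```

A MISSING result for an entry that is visible with cat usually means the line is commented out.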

Device Polling/Discovery Failures

  • Symptom: Device status is "Down (SNMP)" or "Down (Ping)"; no data appears for a newly added device; device information (interfaces, sensors) is incomplete or outdated.
  • Cause:
    • SNMP misconfiguration on the target device (wrong community string, SNMP agent not running, version mismatch, ACLs blocking LibreNMS IP).
    • Firewall blocking SNMP (UDP 161) or ICMP (ping) between LibreNMS and the target device.
    • Network connectivity issues.
    • LibreNMS poller/discovery modules for that OS/device are disabled or have errors.
    • Device added with incorrect SNMP parameters in LibreNMS.
  • Troubleshooting:

    1. Basic Connectivity (Ping): From the LibreNMS server, try to ping the target device:
      ping TARGET_DEVICE_IP
      fping TARGET_DEVICE_IP # fping is often what LibreNMS uses
      
      If ping fails, resolve network connectivity or firewall issues first.
    2. SNMP Walk: This is the most crucial test. From the LibreNMS server command line, attempt an snmpwalk to the target device using the exact same credentials and version configured in LibreNMS for that device:
      • For SNMPv2c:
        snmpwalk -v2c -c YOUR_COMMUNITY_STRING TARGET_DEVICE_IP system
        # To get more data:
        # snmpwalk -v2c -c YOUR_COMMUNITY_STRING TARGET_DEVICE_IP .
        
      • For SNMPv3:
        snmpwalk -v3 -l authPriv -u YOUR_USER -a SHA -A 'YOUR_AUTH_PASS' -x AES -X 'YOUR_PRIV_PASS' TARGET_DEVICE_IP system
        
        If snmpwalk fails (timeout, no response, authentication failure):
        • Verify SNMP agent is running on the target device.
        • Check SNMP configuration on the target device (community string, v3 user credentials, allowed IPs/ACLs for SNMP).
        • Check firewalls on the target device and any intermediate network firewalls.
    3. LibreNMS Poller Debug: Run the poller for the specific device in debug mode:
      # As librenms user or using sudo -u librenms
      cd /opt/librenms
      ./poller.php -h HOSTNAME_OR_DEVICE_ID -d -r -f
      # -h: Hostname or device_id of the target device
      # -d: Enable debugging output
      # -r: Do not update RRDs (for testing)
      # -f: Do not insert data into InfluxDB (for testing)
      
      Look for errors related to SNMP, specific MIBs, or module execution.
    4. LibreNMS Discovery Debug: Run discovery for the specific device in debug mode:
      # As librenms user or using sudo -u librenms
      cd /opt/librenms
      ./discovery.php -h HOSTNAME_OR_DEVICE_ID -d
      # To debug specific discovery modules:
      # ./discovery.php -h HOSTNAME_OR_DEVICE_ID -d -m os,ports,sensors
      
      This will show what information discovery is gathering (or failing to gather).
    5. Check Device Settings in LibreNMS: Go to the device's Edit page in LibreNMS. Verify SNMP version, community/credentials, port, and selected poller/discovery modules are correct.
    6. Event Log: Check Logs > Event Log in LibreNMS for messages related to the device. Filter by the device hostname.
    7. librenms.log: Check /opt/librenms/logs/librenms.log for more detailed error messages from backend processes.
  • Fix: Correct SNMP settings on the device or in LibreNMS, adjust firewalls, fix network issues, or address module-specific problems identified in debug output.
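
The first two troubleshooting steps (ping, then SNMP) can be chained so that the check stops at the first layer that fails. This is a minimal sketch for SNMPv2c under a few assumptions: step and diagnose_device are hypothetical helper names, the ping options are those of Linux iputils, and the numeric OID 1.3.6.1.2.1.1 (the system subtree) is used so the check does not depend on MIB files being installed.

```shell
# Run a command quietly and report whether it succeeded.
step() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}

# Check reachability first, then SNMP, stopping at the first failure.
diagnose_device() {
    ip=$1
    community=$2
    step "ping $ip" ping -c 2 -W 2 "$ip" &&
        step "snmpwalk $ip (v2c)" \
            snmpwalk -v2c -c "$community" -t 2 -r 1 "$ip" 1.3.6.1.2.1.1
}

# Usage from the LibreNMS server:
#   diagnose_device TARGET_DEVICE_IP YOUR_COMMUNITY_STRING
```

If ping reports FAIL, the SNMP step is skipped, which mirrors the advice to resolve connectivity before looking at SNMP settings. Devices that block ICMP but answer SNMP need the snmpwalk run directly.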

Graphing Issues (No Data, Broken Graphs)

  • Symptom: Graphs show "No Data," are empty, display as broken images, or stop updating.
  • Cause:
    • Polling for the device/metric is failing (see above).
    • RRD files are not being created or updated (permission issues, rrdcached problems).
    • RRDtool itself is not installed correctly or has issues.
    • Graph definition errors (for custom graphs).
    • Incorrect RRD file paths or names.
    • Time synchronization issues between the poller and the web server (can affect graph rendering if RRDs seem to be in the future).
  • Troubleshooting:

    1. Verify Polling: First, ensure the device and the specific metrics are being polled successfully (use poller debug). If no data is collected, RRDs won't be updated.
    2. Check RRD Files:
      • Locate the RRD file for the problematic graph. Paths are typically /opt/librenms/rrd/HOSTNAME/METRIC.rrd (e.g., /opt/librenms/rrd/my-server/cpu-system.rrd).
      • Check file existence, ownership (librenms:librenms), and last modification time (ls -l). If not updated recently, polling/RRD writing is the issue.
      • Use rrdtool info METRIC.rrd to inspect its structure and last update time.
      • Use rrdtool fetch METRIC.rrd AVERAGE -s -10min to see recent data points (spell out min, because a bare m can be read as either minutes or months).
    3. rrdcached Issues:
      • Ensure rrdcached is running: sudo systemctl status rrdcached.
      • Check its configuration in /etc/default/rrdcached and LibreNMS config.php ($config['rrdcached']).
      • Check rrdcached logs (if configured) or system logs for errors.
      • Try restarting rrdcached.
      • Permissions: rrdcached user must be able to write to the /opt/librenms/rrd directory and its journal directory.
    4. RRDtool Version: In LibreNMS Global Settings > System > General, ensure the selected RRDtool version matches the installed version (rrdtool -v).
    5. Web Server Logs: Check Nginx/Apache error logs for errors related to graph generation (e.g., rrdtool command failures).
    6. Permissions for Graph Generation: The web server user (e.g., www-data) needs read access to RRD files to generate graphs. If rrdcached is used for reads too, then PHP needs to connect to rrdcached.
    7. Time Synchronization: Ensure NTP is configured and working on the LibreNMS server (and pollers if distributed). Significant time drift can cause RRDtool issues.
    8. Broken Graph Image Icon: If you see a broken image icon, right-click and "Open image in new tab" or "Inspect element" to see the URL that failed. This URL is often a call to graph.php or similar. Try accessing it directly. The output might show an RRDtool error message.
    9. ./validate.php: Run this to check for common configuration or path issues.
  • Fix: Resolve polling issues, fix RRD file permissions, correct rrdcached setup, ensure RRDtool is working, or fix graph definitions.
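
Checking modification times by hand gets tedious when a device has many RRD files. The sketch below lists files that have not been written recently. stale_rrds is an illustrative name, the 15-minute default is an arbitrary choice of three polling cycles, and -mmin is available in GNU and BSD find but is not part of POSIX. With rrdcached, files on disk can lag behind by its flush interval, so allow for that when picking the threshold.

```shell
# List .rrd files under DIR that were not modified in the last MINUTES minutes.
stale_rrds() {
    dir=$1
    minutes=${2:-15}
    find "$dir" -type f -name '*.rrd' -mmin +"$minutes" -print
}

# Usage on a LibreNMS host:
#   stale_rrds /opt/librenms/rrd/HOSTNAME 15
```

If every file for a device is stale, polling for that device is the likely cause. If only a few are stale, look at the poller modules that own those metrics.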

Alerting Problems

  • Symptom: Alerts are not being triggered when conditions are met; notifications (email, Slack, etc.) are not being sent; too many false positive alerts.
  • Cause:
    • Alert rule misconfiguration (incorrect conditions, thresholds, device association).
    • Polling for the relevant metric is failing.
    • Alert transports (email, Slack) are not configured correctly or failing.
    • alerter.php (or the alerting component of the scheduler) is not running or has errors.
    • Delay settings in alert rules preventing immediate firing.
    • Timezone issues affecting scheduled checks or delays.
  • Troubleshooting:

    1. Verify Metric Data: First, confirm that the data point the alert rule is based on is being polled correctly and is visible in graphs with the expected problematic value. If the data isn't there, the alert can't trigger.
    2. Review Alert Rule Configuration:
      • Go to Alerts > Alert Rules, edit the problematic rule.
      • Double-check conditions, values, device/group associations.
      • Check the "Delay" setting – an alert only fires after the condition has been true for this duration.
      • Check if the rule is enabled.
    3. Check Alert Transports:
      • Go to Alerts > Transports.
      • Use the "Test Transport" button for the configured transport.
      • For email: ensure your server can send mail (check mail logs like /var/log/mail.log or Postfix/Exim logs). Verify SMTP settings in config.php if using an external relay.
      • For Slack/Telegram etc.: verify API tokens, webhook URLs, and channel IDs.
    4. Alert Test Rule: Create a very simple test alert rule that is easy to trigger (e.g., based on a sensor you can manipulate or a device you can temporarily make "down" by stopping its SNMP agent).
    5. Check Alert Log and Event Log:
      • Alerts > Alert Log: Shows triggered alerts and their history.
      • Logs > Event Log: Search for events related to the device and alert rule. It might show attempts to send notifications or errors.
    6. Alerter Process:
      • The alerting logic runs as part of the scheduled tasks. Ensure your cron jobs are running correctly.
      • Check /opt/librenms/logs/librenms.log for any errors related to alerting or the scheduler.
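    7. Resend an Active Alert from the CLI (Advanced): You can use ./scripts/test-alert.php to push an alert that is currently active for a given rule and device through its configured transports.
      # sudo -u librenms ./scripts/test-alert.php -r <rule_id> -h <device_id_or_hostname> -d
      # -h selects the device; -d enables debug output.
      # Find rule_id from the URL when editing the rule, or in the 'alert_rules' table.
      
      The debug output shows how the notification was handed to each transport. The alert must already be active for that rule and device, so trigger the condition first.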
    8. False Positives: If getting too many alerts, review rule thresholds, increase delay times, or refine conditions to be more specific. Use "Alert if condition is true for X minutes" to avoid alerts for transient spikes.
  • Fix: Correct alert rule logic, fix transport configurations, ensure polling and cron jobs are working.

Performance Bottlenecks

  • Symptom: Web UI is slow; pollers take longer than the polling interval (e.g., > 5 minutes); high CPU/memory/IO load on the LibreNMS server.
  • Cause: Insufficient server resources (CPU, RAM, disk I/O); unoptimized database; too many devices/pollers for a single instance; inefficient custom scripts or modules.
  • Troubleshooting:

    1. Monitor Server Resources: Use top, htop, vmstat, iostat, iotop on the LibreNMS server to identify CPU, memory, or I/O bottlenecks.
    2. Poller Performance:
      • Check Health > Poller Performance in the UI.
      • Identify which devices or modules are taking the longest to poll.
      • If total poller time exceeds the interval (e.g., 300 seconds), you have a problem.
      • Disable unused poller modules globally (Global Settings > Polling > Global Modules) or per-device.
    3. Database Performance:
      • Use mysqltuner.pl for recommendations.
      • Enable and analyze the slow query log.
      • Optimize MariaDB/MySQL configuration (innodb_buffer_pool_size, etc.) as detailed in the Performance Tuning section.
      • Ensure you have enough RAM for the InnoDB buffer pool.
    4. rrdcached: Ensure it's running and configured correctly. Slow RRD writes can cripple poller performance.
    5. Web Server Performance:
      • Tune PHP-FPM pm.max_children and other pool settings.
      • Optimize Nginx/Apache configuration (worker processes, keepalives).
      • Check web server and PHP-FPM logs.
    6. Number of Devices: If monitoring a very large number of devices, consider:
      • Distributed Pollers: To scale out polling load.
      • More powerful hardware for the central server and database.
    7. Network Latency: High latency to many monitored devices can slow down pollers.
    8. LibreNMS Updates: Ensure you are running a recent version of LibreNMS, as performance improvements are often made.
    9. Custom Code: If you have custom poller modules or scripts, profile them to ensure they are efficient.
  • Fix: Allocate more resources, tune database/webserver/PHP-FPM, implement distributed pollers, disable unnecessary polling.
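
Per-device poll times can also be summarised on the command line. The helper below reads "hostname seconds" pairs and flags devices above a limit. slow_pollers is a hypothetical name, and the query in the usage comment assumes the devices table exposes a last_polled_timetaken column, so verify that against your schema. The total is a plain sum of device times. With several poller threads the wall-clock duration of a run is lower.

```shell
# Read "hostname seconds" pairs on stdin; flag devices slower than LIMIT seconds
# and print the summed poll time.
slow_pollers() {
    limit=${1:-300}
    awk -v limit="$limit" '
        NF >= 2 {
            n++
            total += $2
            if ($2 + 0 > limit + 0) printf "SLOW %s %.1fs\n", $1, $2
        }
        END { printf "TOTAL %.1fs across %d devices\n", total, n }
    '
}

# Usage (assumed query; check the column name in your installation):
#   mysql -N -u librenms -p librenms \
#       -e 'SELECT hostname, last_polled_timetaken FROM devices' | slow_pollers 60
```

Devices flagged as SLOW are the first candidates for disabling unused poller modules.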

General Debugging Tools and Logs

  • ./validate.php:

    # As librenms user or with sudo
    cd /opt/librenms
    ./validate.php
    
    This script checks many common configuration issues, file permissions, database schema, and dependencies. Always start here.

  • LibreNMS Log File: /opt/librenms/logs/librenms.log

    • This is the main application log. Contains detailed information from pollers, discovery, alerting, API, etc.
    • Increase log level in .env (e.g., APP_LOG_LEVEL=debug) for more verbose output during troubleshooting (remember to set it back to info or warning in production).
  • Poller and Discovery Debug Output (CLI):

    • ./poller.php -h <device> -d -r -f
    • ./discovery.php -h <device> -d -m <module>
    • These provide real-time output of what these scripts are doing.
  • Device Event Log (UI): Logs > Event Log (filter by device). Shows device status changes, polling errors, alert triggers for that device.

  • Global Event Log (UI): Logs > Event Log. Shows system-wide events.

  • Web Server Logs:

    • Nginx: /var/log/nginx/access.log and /var/log/nginx/error.log.
    • Apache: /var/log/apache2/access.log and /var/log/apache2/error.log.
  • PHP-FPM Logs:

    • Often in /var/log/phpYOUR_VERSION-fpm.log. Configured in PHP-FPM pool settings.
  • Database Logs:

    • MariaDB/MySQL error log (path varies, e.g., /var/log/mysql/error.log or /var/lib/mysql/HOSTNAME.err).
    • Slow query log (if enabled).
  • Cron Logs: System log (e.g., /var/log/syslog or journalctl -u cron) often shows cron job execution.
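
When several of these logs need checking at once, a small helper can pull the most recent error lines from each. This is only a sketch: recent_errors is an invented name, and the match on the word "error" is a simple heuristic that will miss messages logged under other wording.

```shell
# Print the last COUNT lines containing "error" (case-insensitive) from each log.
recent_errors() {
    count=$1
    shift
    for log in "$@"; do
        [ -r "$log" ] || { echo "== $log (not readable, skipped)"; continue; }
        echo "== $log"
        grep -i 'error' "$log" | tail -n "$count"
    done
}

# Usage on a LibreNMS host (run with sufficient privileges to read the logs):
#   recent_errors 5 /opt/librenms/logs/librenms.log /var/log/nginx/error.log
```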

Workshop Troubleshooting Common Scenarios

Objective: To practice diagnosing and potentially fixing common LibreNMS issues using the tools and techniques discussed. This workshop is more thought-based and investigative.

Prerequisites:

  • A running LibreNMS instance.
  • Ability to access the LibreNMS server command line.
  • Familiarity with viewing logs and running basic commands.

Scenario 1: A Device Stops Graphing

  • Symptom: Graphs for "Server-X" have flatlined for the past hour. Last polled time is old.
  • Your Troubleshooting Steps (List them):
    1. Example: Check LibreNMS UI: Device status for Server-X? Any errors on its overview page?
    2. Example: Check Event Log in UI for Server-X around the time it stopped graphing. Any SNMP errors, timeouts?
    3. Example: SSH to LibreNMS server. Try ping Server-X_IP.
    4. Example: Try snmpwalk -v2c -c COMMUNITY Server-X_IP system (using correct credentials).
    5. Example: If snmpwalk fails, investigate Server-X: snmpd service running? Firewall? SNMP config correct?
    6. Example: If snmpwalk works, run poller debug: ./poller.php -h Server-X -d -r -f. Look for errors.
    7. Example: Check /opt/librenms/logs/librenms.log for errors related to Server-X or its polling.
    8. Example: Check RRD file for Server-X: /opt/librenms/rrd/Server-X/some_metric.rrd. Last updated time? Permissions?
    9. Example: Check rrdcached status and logs if it's in use.

Scenario 2: Email Alerts Not Being Received

  • Symptom: You configured an alert rule for "Device Down" and an email transport. You simulated a device going down, the alert triggered in LibreNMS UI, but no email arrived.
  • Your Troubleshooting Steps (List them):
    1. Example: In LibreNMS UI, go to Alerts > Transports. Click "Test Transport" for the email transport. Does the test succeed or fail?
    2. Example: Check your email spam/junk folder.
    3. Example: If test transport fails or no test email: check mail server configuration on LibreNMS host. Is Postfix/Exim running?
    4. Example: Check mail logs on LibreNMS server (e.g., /var/log/mail.log or journalctl -u postfix). Any errors related to sending mail to your address?
    5. Example: If using external SMTP relay in config.php, double-check all SMTP settings (host, port, user, password, security TLS/SSL).
    6. Example: Check /opt/librenms/logs/librenms.log for errors when the actual alert tried to send a notification.
    7. Example: Verify the alert rule is indeed associated with the correct (and tested) email transport.
    8. Example: Ensure the "Default from email address" and "Default contact email address" in Global Settings > Alerting > General are sensible.

Scenario 3: Web UI is Very Slow

  • Symptom: Loading dashboards, device pages, or graphs in the LibreNMS web UI takes a very long time.
  • Your Troubleshooting Steps (List them):
    1. Example: On LibreNMS server, run top or htop. Is CPU maxed out? Is memory full (swapping heavily)? High disk I/O wait (%wa)?
    2. Example: Identify processes consuming most resources. Is it mysqld, php-fpm, nginx/apache, or rrdcached?
    3. Example: If mysqld is high: Check MariaDB/MySQL error log. Enable slow query log. Run mysqltuner.pl.
    4. Example: If php-fpm is high: Check PHP-FPM error log. Are there enough child processes (pm.max_children)? Is a specific PHP script consuming resources?
    5. Example: Check web server logs for errors or long response times in access logs.
    6. Example: Check poller performance (Health > Poller Performance). Is the poller run taking too long, potentially impacting DB or server load?
    7. Example: How many devices are being monitored? Is the server undersized for the load?
    8. Example: Run ./validate.php. Any warnings or errors?
    9. Example: Check browser developer tools (Network tab) to see which requests are slow. Are they API calls, graph images, static assets?

Deliverables/Reflection:

  • For each scenario, a plausible list of troubleshooting steps in logical order.
  • Increased understanding of where to look for clues when LibreNMS misbehaves.

Troubleshooting is often a process of elimination. By systematically checking logs, configurations, and using debugging tools, you can usually pinpoint the root cause of most LibreNMS issues. The LibreNMS community forums and documentation are also excellent resources when you get stuck.

Conclusion

Throughout this comprehensive guide, we've journeyed from the foundational concepts of LibreNMS, through basic and intermediate setup and management, to advanced customization, optimization, and troubleshooting. You've learned how to install LibreNMS, add devices using SNMP, configure alerts, navigate its interface, and extend its capabilities. We've also delved into crucial aspects like performance tuning with distributed pollers and database optimization, securing your installation, and systematically resolving common issues.

LibreNMS is a powerful and flexible open-source network monitoring system. By self-hosting it, you gain complete control over your monitoring data and environment, offering invaluable insights into the health and performance of your network and servers. The skills you've developed here – covering Linux server administration, network protocols, database management, and specific LibreNMS configurations – are highly transferable and valuable in any IT environment.

Monitoring is not a "set it and forget it" task. It requires ongoing attention, refinement of alert rules, adaptation to new devices and services, and regular maintenance of the LibreNMS platform itself. The workshops provided practical, hands-on experience, which is key to solidifying your understanding.

As you continue to use and explore LibreNMS, remember that it has a vibrant community. Don't hesitate to consult the official documentation, participate in forums, and explore the wealth of shared knowledge. The world of network monitoring is ever-evolving, and LibreNMS, with its active development, is well-positioned to adapt and grow.

You are now well-equipped to deploy, manage, and leverage LibreNMS effectively. Use this knowledge to build robust monitoring solutions, proactively identify and resolve IT issues, and contribute to the stability and efficiency of the networks and systems you manage.

Further Learning Resources

To continue your journey and deepen your expertise with LibreNMS, here are some valuable resources:

  1. Official LibreNMS Documentation:

    • URL: https://docs.librenms.org/
    • This is the primary source of truth for installation, configuration, and feature explanations. It's regularly updated.
  2. LibreNMS Community Forum:

    • URL: https://community.librenms.org/
    • An excellent place to ask questions, share solutions, and learn from other users' experiences. You can find discussions on specific devices, custom configurations, and troubleshooting.
  3. LibreNMS GitHub Repository:

    • URL: https://github.com/librenms/librenms
    • Explore the source code, report bugs, submit feature requests, or even contribute to the project. The "Issues" and "Pull Requests" sections are insightful.
  4. LibreNMS Discord/IRC:

    • Check the community page for links to real-time chat channels (Discord is commonly used). Useful for quick questions and discussions.
  5. Understanding SNMP:

    • Search for "SNMP tutorial" or "Understanding MIBs and OIDs." A deeper knowledge of SNMP will greatly enhance your ability to customize LibreNMS.
    • RFCs for SNMP (e.g., RFC 3411-3418 for SNMPv3) provide the authoritative specifications.
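  6. RRDtool Documentation:

    • URL: https://oss.oetiker.ch/rrdtool/doc/index.en.html
    • Reference for the rrdtool info, fetch, and graph commands used when inspecting RRD files.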
  7. MySQL/MariaDB Tuning Guides:

    • Official MariaDB and MySQL documentation.
    • Blogs from Percona and other database experts often have excellent articles on performance tuning.
  8. Web Server Documentation (Nginx/Apache):

    • Official Nginx and Apache documentation for web server optimization and PHP integration.

By continuously learning and experimenting, you can become a LibreNMS power user and a valuable asset in managing complex IT infrastructures.