Author Nejat Hakan
eMail nejat.hakan@outlook.de
PayPal Me https://paypal.me/nejathakan


Process Management

Introduction What is a Process

Welcome to the world of Linux process management! This is a fundamental area of any operating system, and understanding it is crucial for system administrators, developers, and power users alike. Before we dive deep, let's clarify the most basic concept: What exactly is a process?

Often, people use the terms "program" and "process" interchangeably, but in operating systems terminology, they have distinct meanings.

  • Program: A program is a passive entity. It's a collection of instructions and associated data stored on a disk (or other storage medium) in an executable file format (like ELF - Executable and Linkable Format - on Linux). Think of it as a recipe written down in a cookbook. It exists, it has instructions, but it's not doing anything on its own. Examples include the /bin/bash executable file, the /usr/bin/firefox executable, or a compiled C program you wrote.

  • Process: A process is an active instance of a program being executed. It's the recipe being actively cooked in the kitchen. When you run a program, the Linux kernel loads the program's instructions and data into memory, allocates system resources, and begins executing the instructions. This active entity, with its own memory space, resources, and execution state, is a process. You can have multiple processes running the same program concurrently (e.g., multiple terminal windows each running the bash program result in multiple bash processes).

Key Components of a Process

When the kernel creates a process, it associates several key components and pieces of information with it, managed within the kernel's data structures (often referred to conceptually as the Process Control Block or PCB, though Linux uses task_struct):

  1. Process ID (PID): A unique positive integer assigned by the kernel to identify the process. PID 1 is special; it's typically the init or systemd process, the ancestor of most user-space processes. New PIDs are usually assigned sequentially, wrapping around when the maximum value is reached (configurable, often 32768 or higher).
  2. Parent Process ID (PPID): The PID of the process that created this process. Processes form a hierarchy, and the PPID links a child process back to its parent.
  3. Process State: Indicates what the process is currently doing (e.g., running, sleeping, stopped). We'll cover states in detail later.
  4. Program Counter (PC): Stores the memory address of the next instruction to be executed for this process.
  5. CPU Registers: Contains the values of the processor's registers when the process was last running. These need to be saved when the process is interrupted and restored when it resumes.
  6. Memory Management Information: Details about the process's virtual address space, including pointers to its code, data, stack segments, and page tables used by the kernel to map virtual to physical memory. Each process typically gets its own private virtual address space, providing memory protection.
  7. User ID (UID) and Group ID (GID): Identifies the user and group that own the process. These IDs are used by the kernel to determine the process's permissions for accessing files and other system resources. Effective UID/GID and Real UID/GID can sometimes differ, related to concepts like setuid programs.
  8. Open File Descriptors: A list of files (including network sockets, pipes, etc.) that the process currently has open. Represented as small non-negative integers (0 for stdin, 1 for stdout, 2 for stderr by convention).
  9. Scheduling Information: Includes the process's priority, consumed CPU time, and other data used by the kernel's scheduler to decide when and for how long the process should run on the CPU.
  10. Signal Mask and Pending Signals: Information about which signals the process is blocking and which signals have been sent to it but not yet delivered or handled.

In essence, a process is much more than just program code; it's the program in execution, bundled with all the context, resources, and kernel-managed information required for it to run and interact with the system. Understanding this distinction and the components involved is the first step towards mastering process management in Linux.
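
Much of this kernel-held context is visible directly through the /proc filesystem. The following is a quick sketch for the current shell (the /proc paths shown are standard on Linux):

```shell
# Identity, state, and ownership of the current shell ($$ expands to its PID)
grep -E '^(Name|State|Pid|PPid|Uid|Gid)' /proc/$$/status

# Open file descriptors (0=stdin, 1=stdout, 2=stderr by convention)
ls -l /proc/$$/fd

# Scheduler accounting: fields 14 and 15 of /proc/<pid>/stat are utime/stime
awk '{print "utime:", $14, "stime:", $15}' /proc/$$/stat
```

The PID wrap-around limit mentioned above can likewise be read from /proc/sys/kernel/pid_max.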

Workshop Identifying Your Shell Process

Let's start with a simple, practical task: identifying the process associated with the very terminal shell you are likely using right now.

Objective: Find the Process ID (PID) and Parent Process ID (PPID) of your current shell.

Steps:

  1. Open Your Terminal: If you don't already have one open, launch a terminal window. You are now interacting with a shell process (likely bash, zsh, or similar).

  2. Identify the Shell's PID: The shell itself provides a special variable $$ that expands to its own PID. Use the echo command to display it:

    echo $$
    

    • Explanation: echo is a command that prints its arguments to the standard output. $$ is a special parameter in most shells that the shell replaces with its own Process ID before executing the command.
    • Output: You will see a number, for example, 12345. This is the PID of your current shell process. Note this number down.
  3. Use ps to Verify and Find the PPID: The ps command is the primary tool for viewing processes. We'll use it to find the process with the PID you just identified and look at its details, including the PPID.

    ps -p <PID> -o pid,ppid,cmd
    

    • Replace <PID> with the actual number you got from echo $$. For example, if your PID was 12345, you'd run: ps -p 12345 -o pid,ppid,cmd
    • Explanation:
      • ps: The command to report a snapshot of current processes.
      • -p <PID>: This option tells ps to only show information about the process with the specified PID.
      • -o pid,ppid,cmd: This option specifies a custom output format. We are asking ps to show only the pid (Process ID), ppid (Parent Process ID), and cmd (the command being run, possibly truncated).
    • Output: You should see something like this (your numbers and command might differ):
        PID  PPID CMD
      12345 12340 bash
      
      This confirms the PID you found earlier and shows you the PPID. The CMD column shows the name of the program being executed (bash in this example).
  4. (Optional) Find the Parent Process: Now you know the PPID. You can use ps again to see what that parent process is:

    ps -p <PPID> -o pid,ppid,cmd
    

    • Replace <PPID> with the PPID you found in the previous step.
    • Output: You might see your terminal emulator program (like gnome-terminal-server, konsole, etc.), or perhaps another shell if you started this shell from another one. This demonstrates the parent-child relationship.

Summary: In this workshop, you learned how to find the unique identifier (PID) of your current shell process using the $$ variable and confirmed it using the ps command. You also discovered the PID of its parent process (PPID), illustrating the process hierarchy.


1. Process States and Lifecycle

A process doesn't just run constantly; it transitions through various states during its lifetime, reflecting its current activity and interaction with the kernel and hardware. Understanding these states is crucial for diagnosing performance issues and managing system resources.

The Process Lifecycle

A typical process lifecycle involves the following phases:

  1. Creation: A process is born when an existing process makes a fork() system call. The fork() call creates a near-identical copy of the calling process (the parent), resulting in a new process (the child). The child inherits many attributes from the parent but gets its own unique PID. Often, the child process will then use an exec() family system call to replace its memory image with a new program, effectively starting a different task.
  2. Scheduling: Once created, the process enters the pool of runnable processes. The kernel's scheduler decides which runnable process gets to use the CPU next based on scheduling algorithms and process priorities.
  3. Execution: The process runs on the CPU, executing its instructions. During execution, it might:
    • Run until its allotted time slice expires.
    • Voluntarily give up the CPU by making a blocking system call (e.g., reading from a file, waiting for network input).
    • Be preempted by the scheduler if a higher-priority process becomes ready.
  4. Blocking/Sleeping: If a process needs to wait for an event (e.g., I/O completion, a signal, availability of data), it enters a sleeping or blocked state. It won't be considered for CPU execution until the event it's waiting for occurs.
  5. Termination: A process can terminate in several ways:
    • Normal Exit: The process completes its task and calls the exit() system call (often invoked implicitly when main() returns in C programs). It provides an exit status code (0 typically indicates success, non-zero indicates an error).
    • Error Exit: The process encounters an unrecoverable error and terminates itself, usually with a non-zero exit status.
    • Killed by Signal: The process receives a signal that causes it to terminate (e.g., SIGTERM, SIGKILL).
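
These termination paths are easy to reproduce in a shell. A small sketch (the 128+signal-number convention for reporting signal deaths is standard POSIX shell behavior; the message wording here is mine):

```shell
true;  echo "normal exit: status $?"    # 0 indicates success
false; echo "error exit:  status $?"    # non-zero indicates an error

sleep 60 &          # start a child we can kill
kill -TERM $!       # deliver SIGTERM (signal 15) to it
wait $!             # reap the child and collect its exit status
echo "killed by signal: status $?"      # shells report 128+15 = 143 for SIGTERM
```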

Common Process States in Linux

The ps command often displays a state code (e.g., in the STAT or S column). Here are the most common states:

  • R (Running or Runnable):

    • Running: The process is currently executing instructions on a CPU core.
    • Runnable: The process is ready to run and is waiting in a queue for the scheduler to assign it a CPU core. From the perspective of ps, both are usually shown as R because the state can change very rapidly. A process consuming significant CPU time will likely show as R.
  • S (Interruptible Sleep):

    • The process is waiting for an event to complete or a resource to become available. Examples include waiting for terminal input, data from a network socket, or a timer to expire.
    • It's "interruptible" because it can be woken up prematurely by a signal.
    • This is a very common state for processes that are not actively doing computation (e.g., idle shells, web servers waiting for requests).
  • D (Uninterruptible Sleep):

    • Similar to S, the process is waiting, typically for I/O operations (like reading from or writing to a disk) to complete directly with hardware.
    • It's "uninterruptible" because it will not respond to most signals while in this state. This is necessary to prevent data corruption if the process were interrupted during a critical hardware interaction.
    • Processes stuck in the D state for long periods can indicate hardware problems or driver issues (especially with storage or network interfaces). Killing a process in state D is often impossible without resolving the underlying I/O issue or rebooting.
  • T (Stopped or Traced):

    • Stopped: The process has been suspended, usually by receiving a specific signal like SIGSTOP or SIGTSTP (the latter is typically generated when you press Ctrl+Z in the terminal). It will not run until it receives a SIGCONT signal.
    • Traced: The process is being monitored by another process, such as a debugger (like gdb). Execution is suspended each time the debugger needs to inspect it.
  • Z (Zombie or Defunct):

    • The process has terminated (it called exit() or was killed), but its entry in the kernel's process table still exists.
    • Why? Because it contains information, primarily the process's exit status code, that the parent process might need to collect.
    • The parent process is expected to call a wait() family system call to "reap" the zombie child. This call retrieves the child's exit status and allows the kernel to finally remove the zombie process entry.
    • A small number of zombie processes existing for a very short time is normal. However, if a long-running parent fails to call wait() (e.g., due to a programming bug), its zombie children will remain until the parent itself terminates or the system reboots. When the parent does terminate, the zombies are reparented to init/systemd (or a designated subreaper), which reaps them.
    • Zombies consume very few resources (mainly just the process table slot), but a large accumulation could potentially exhaust the PID space or process table slots. You cannot kill a zombie process directly (it's already dead); you must fix or kill the parent process that is failing to reap it.

State Modifiers (Often seen appended to the main state character in ps output):

  • <: High-priority process (negative nice value).
  • N: Low-priority process (positive nice value).
  • L: Pages locked into memory (often for real-time operations).
  • s: Session leader (a process that started a new session).
  • l: Multi-threaded process (using POSIX threads).
  • +: Process is in the foreground process group of its controlling terminal.

Understanding these states helps you interpret the output of monitoring tools like ps and top, allowing you to quickly assess what processes are doing on your system.

Workshop Observing Process States

In this workshop, we'll create processes and manipulate them to observe different states using the ps command.

Objective: Observe processes in the Running (R), Interruptible Sleep (S), Stopped (T), and Zombie (Z) states (though creating a persistent Zombie can be tricky without specific code).

Tools: ps, sleep, kill, shell job control (Ctrl+Z, bg, fg).

Steps:

  1. Observe Interruptible Sleep (S):

    • The sleep command simply pauses for a specified number of seconds. It spends most of its time waiting for a timer signal, putting it into interruptible sleep.
    • Open a terminal and run:
      sleep 600 &
      
      • Explanation: sleep 600 tells the command to pause for 600 seconds (10 minutes). The & puts the command into the background, so you immediately get your shell prompt back. The shell will print the job number and PID of the background process. Note the PID.
    • Now, use ps to check its state. Use the PID reported when you started the background job.
      ps -p <PID> -o pid,state,cmd
      
      • Replace <PID> with the actual PID of the sleep process.
    • Output: You should see something like:
        PID S CMD
      13579 S sleep 600
      
      The S indicates it's in Interruptible Sleep, waiting for the timer.
  2. Observe Stopped (T):

    • We need a foreground process to stop with Ctrl+Z. Let's use sleep again, but this time in the foreground.
    • In the same terminal, run:
      sleep 300
      
      • Your terminal will now pause.
    • Press Ctrl+Z.
    • Explanation: Ctrl+Z sends the SIGTSTP signal (Terminal Stop) to the foreground process.
    • Output: The shell usually prints a message like [1]+ Stopped sleep 300. It also gives you your prompt back.
    • Find the PID of this stopped sleep process. You can use the jobs command:
      jobs -l
      
      • This lists background/stopped jobs with their PIDs. Find the sleep 300 job and note its PID.
    • Check its state with ps:
      ps -p <PID> -o pid,state,cmd
      
      • Replace <PID> with the PID of the stopped sleep 300 process.
    • Output:
        PID S CMD
      13588 T sleep 300
      
      The T indicates it's Stopped.
    • Resume the process: You can resume it in the background with bg %<job_number> (e.g., bg %1) or in the foreground with fg %<job_number>. If you resume it in the background, check its state again with ps – it should return to S. Let's bring it to the foreground and terminate it.
      fg %1 # Or whatever the job number is
      # Now press Ctrl+C to send SIGINT and terminate it
      
  3. Observe Running (R):

    • Creating a process that consistently stays in the R state usually requires a CPU-intensive task. A simple infinite loop can demonstrate this.
    • Run the following command in the background:
      yes > /dev/null &
      
      • Explanation: The yes command continuously outputs 'y' (followed by a newline) very rapidly. We redirect its output (>) to /dev/null (a special file that discards everything written to it) so it doesn't flood your terminal. The & puts it in the background. Note the PID.
    • Quickly check its state with ps (it might share the CPU, so it might flip between R and S, but should show R frequently):
      ps -p <PID> -o pid,state,cmd,pcpu
      
      • Replace <PID> with the PID of the yes process. We added pcpu (displayed as %CPU) to see CPU usage.
    • Output:
        PID S CMD                           %CPU
      13601 R yes                           99.9
      
      You should see state R and high CPU usage.
    • Terminate the process: Find its PID (if you didn't note it) using ps aux | grep yes or pgrep yes, then kill it forcefully:
      kill -9 <PID> # Or use: pkill yes
      
  4. Observing Zombie (Z):

    • Reliably creating a long-lived zombie usually requires a custom program where the parent process forks a child, the child exits immediately, and the parent deliberately doesn't call wait(). Doing this purely from the shell is difficult.
    • However, you can look for existing zombie processes on your system (there might not be any).
    • Use ps to find processes in the Z state:
      ps aux | grep ' Z '
      
      • Explanation: ps aux lists all processes with user-oriented format. grep ' Z ' filters for lines containing Z (with spaces to better match the state column and avoid matching other things). Note that zombies can carry state modifiers (e.g., Z+), which this simple pattern misses; searching for defunct instead is more reliable.
    • Output: If you have zombie processes, you'll see output like:
      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
      someuser 14001  0.0  0.0      0     0 ?        Z    10:30   0:00 [someprocess] <defunct>
      
      The <defunct> tag is characteristic of zombie processes in ps output. Note the state Z. If you see this, the parent process (whose PID is the PPID of the zombie) needs to be investigated or potentially killed/restarted to clean it up.
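
That said, you can provoke a short-lived zombie without writing C. One shell trick (a sketch, assuming bash or a POSIX sh plus the standard sleep binary) is to build a parent that can never reap its child:

```shell
# The subshell forks 'sleep 1' into the background, then exec replaces the
# subshell itself with 'sleep 30'. The new image never calls wait(), so once
# 'sleep 1' exits it lingers as a zombie until 'sleep 30' terminates.
( sleep 1 & exec sleep 30 ) &
parent=$!                      # after exec, this PID is running 'sleep 30'
sleep 2                        # give the child time to exit and turn defunct
ps -ef | grep '[d]efunct'      # should list the zombie: [sleep] <defunct>
kill "$parent"                 # kill the non-reaping parent; the zombie is
                               # reparented to PID 1 and cleaned up at once
```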

Summary: In this workshop, you used common Linux commands (sleep, yes) and shell features (&, Ctrl+Z) to create processes in different states (S, T, R). You then used ps with specific options to observe and verify these states. You also learned how to search for problematic Zombie (Z) processes. This practical experience reinforces the theoretical understanding of the process lifecycle.


2. Viewing Processes

Being able to inspect the processes running on your Linux system is a fundamental skill. It allows you to monitor resource usage (CPU, memory), identify applications, troubleshoot problems, and manage system performance. Several command-line tools are available for this purpose, with ps, top, and htop being the most common.

The ps Command

ps (Process Status) provides a snapshot of the currently running processes at the moment you execute the command. It's highly flexible due to its numerous options.

Common ps Usage Styles:

There are two main styles of options for ps, stemming from its history:

  1. BSD Style: Options are used without a leading dash (-). Often combined.

    • ps aux: Probably the most common invocation.
      • a: Show processes for all users (not just your own).
      • u: Display user-oriented format (shows owner, CPU%, MEM%, etc.).
      • x: Show processes not attached to a terminal (daemon processes).
    • Output Columns (Typical for aux):
      • USER: The user owning the process.
      • PID: Process ID.
      • %CPU: The ratio of CPU time used by the process to its elapsed wall-clock time, expressed as a percentage. Unlike top, this is a lifetime average rather than a current reading, so a process that was busy earlier but is now idle can still show a high value.
      • %MEM: Approximate percentage of physical memory (RAM) used by the process.
      • VSZ: Virtual Memory Size (in KiB). Total virtual address space used by the process.
      • RSS: Resident Set Size (in KiB). The portion of the process's memory currently held in physical RAM (non-swapped).
      • TTY: Controlling terminal associated with the process. ? usually means no controlling terminal (daemon).
      • STAT: Process state code (e.g., R, S, T, Z, plus modifiers like s, +, l).
      • START: Time the process started.
      • TIME: Cumulative CPU time consumed by the process (user + system time). Format is often HH:MM:SS or MM:SS.ms.
      • COMMAND (or CMD): The command that launched the process (might be truncated or show kernel thread names in brackets).
  2. System V / UNIX Style: Options are preceded by a single dash (-).

    • ps -ef: Another very common invocation, often preferred by those with a System V background.
      • -e: Show every process on the system. Equivalent to a + x in BSD style.
      • -f: Full format listing. Shows UID, PID, PPID, C (CPU usage, but often less granular than %CPU), STIME (start time), TTY, TIME, CMD.
    • ps -eF: Extra full format (adds SZ, RSS, PSR - processor).
    • ps -ejH: Show processes in a hierarchy (like a tree). j is job format, H shows the hierarchy.
    • ps -eL: Show threads (Light Weight Processes, LWP) as separate entries. Each thread gets its own LWP ID but shares the same PID.

Customizing ps Output:

The -o option (or o in BSD style) allows you to specify exactly which columns you want to see.

  • ps -p <PID> -o pid,ppid,user,%cpu,%mem,stat,start,time,cmd: Show specific columns for a given PID.
  • ps axo pid,comm,pcpu,pmem --sort=-pcpu | head -n 10: Show PID, command name, CPU%, Mem% for all processes, sorted by CPU usage (descending), and show only the top 10.

The top Command

top provides a dynamic, real-time view of the running system, focusing on resource-intensive processes. It refreshes the display every few seconds (default is usually 3 seconds).

top Interface:

  • Summary Area (Top Lines):

    • System time, uptime, number of users logged in.
    • Load average: System load over the last 1, 5, and 15 minutes (a measure of how many processes are running or waiting to run).
    • Tasks: Total number of processes, broken down by state (running, sleeping, stopped, zombie).
    • CPU States (%Cpu(s)): Percentage of CPU time spent in various states (us: user, sy: system, ni: nice, id: idle, wa: I/O wait, hi: hardware interrupts, si: software interrupts, st: steal time - for VMs).
    • Memory Usage (KiB Mem, KiB Swap): Total, free, used, buff/cache for physical memory (RAM) and swap space.
  • Process List Area:

    • A list of processes, typically sorted by CPU usage by default.
    • Columns are similar to ps (PID, USER, PR [Priority], NI [Nice value], VIRT [Virtual Mem], RES [Resident Mem], SHR [Shared Mem], S [State], %CPU, %MEM, TIME+ [CPU Time, higher precision], COMMAND).

Interactive top Commands (Press while running):

  • h or ?: Display help screen.
  • q: Quit top.
  • Space: Force an immediate refresh.
  • M: Sort by memory usage (%MEM).
  • P: Sort by CPU usage (%CPU) (default).
  • T: Sort by cumulative CPU time (TIME+).
  • k: Kill a process (prompts for PID and signal).
  • r: Renice a process (prompts for PID and nice value).
  • u: Filter by user (prompts for username).
  • 1: Toggle display of individual CPU core stats (if multiple cores).
  • f: Enter field management screen to add/remove/reorder columns.
  • o: Enter filter screen to add criteria (e.g., COMMAND=bash).
  • z: Toggle color display.
  • x: Highlight sorting column.

The htop Command

htop is an enhanced, interactive process viewer often considered more user-friendly than top. It provides a similar dynamic view but adds features like:

  • Colorized display.
  • Scrolling horizontally and vertically through the process list.
  • Easier process manipulation (killing, renicing) often using function keys.
  • Visual meters for CPU (per core), memory, and swap usage at the top.
  • Tree view (press F5).
  • Direct process searching/filtering (press F4).
  • Setup menu (F2) for customization.

If htop is not installed, you can usually install it using your distribution's package manager (e.g., sudo apt update && sudo apt install htop on Debian/Ubuntu, sudo dnf install htop on Fedora/CentOS/RHEL).

Choosing the Right Tool:

  • Use ps when you need a specific snapshot, want highly customized output for scripting, or need to see process hierarchy easily (ps auxf, ps -ejH, pstree).
  • Use top for a standard, real-time overview of system load and resource hogs, available on almost any Linux system.
  • Use htop for a more interactive, user-friendly real-time monitoring experience with easier navigation and built-in features like tree view and searching (if available).
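
For the hierarchy specifically, pstree (from the psmisc package, where installed) is often the quickest view. For example, to trace the ancestry of your current shell:

```shell
# -s shows the selected process's ancestors, -p appends PIDs
pstree -s -p $$
# Example output (your PIDs and terminal program will differ):
#   systemd(1)---gnome-terminal-(2211)---bash(2345)---pstree(2460)
```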

Workshop Monitoring System Processes

In this workshop, you'll practice using ps, top, and htop to explore the processes running on your system and identify resource usage.

Objective: Use process viewing tools to find specific processes, sort them by resource usage, and understand the output.

Prerequisites: htop should ideally be installed (sudo apt install htop or sudo dnf install htop).

Steps:

  1. Basic ps Exploration:

    • Open a terminal.
    • Get a full listing of all processes in user-oriented format:
      ps aux
      
      • Scroll through the list. Can you identify your shell process? Your desktop environment processes (e.g., gnome-shell, plasmashell)? System services (e.g., systemd, sshd, cron)?
    • Get a full listing showing the parent-child relationship:
      ps -ef
      
      • Look at the PID and PPID columns. Can you trace the lineage of a few processes back towards PID 1 (init or systemd)?
    • Find only the processes owned by your user:
      ps ux # Without 'a', ps shows only your own processes; 'x' also includes those without a terminal
      # OR equivalently, in System V style:
      ps -u $USER -f # Show your processes across all terminals in full format
      # $USER is an environment variable holding your username
      
  2. Using ps for Specific Information:

    • Find the PID of the main systemd process (PID 1):
      ps -p 1 -o pid,comm,stat
      
    • List all processes and show only their PID, command name, CPU usage, and memory usage. Sort by memory usage (highest first) and show the top 5:
      ps axo pid,comm,%mem,%cpu --sort=-%mem | head -n 6 # head -n 6 to include header
      
    • List all processes and show PID, command name, CPU usage, and memory usage. Sort by CPU usage (highest first) and show the top 5:
      ps axo pid,comm,%mem,%cpu --sort=-%cpu | head -n 6
      
  3. Interactive Monitoring with top:

    • Start top:
      top
      
    • Observe the summary area: Note the load average, CPU usage breakdown, and memory usage.
    • Identify the top CPU-consuming processes (they are usually at the top by default).
    • Sort by Memory Usage: Press Shift+M. Observe how the list reorders. Note the processes using the most RAM (%MEM and RES columns).
    • Sort back by CPU Usage: Press Shift+P.
    • Filter by Your User: Press u, type your username, and press Enter. Now top only shows your processes. Press u, leave the prompt empty, and press Enter to show all users again.
    • Toggle Individual CPU View: Press 1. Observe the %Cpu(s) line change to show individual core statistics (%Cpu0, %Cpu1, etc.). Press 1 again to toggle back.
    • Leave top: Press q.
  4. Enhanced Monitoring with htop (If installed):

    • Start htop:
      htop
      
    • Compare the interface to top. Note the graphical meters and function key hints at the bottom.
    • Use the Arrow Keys (Up, Down, Left, Right) to navigate the process list.
    • Sort by Memory: Press F6 (SortBy), use arrow keys to select PERCENT_MEM, press Enter.
    • Sort by CPU: Press F6, select PERCENT_CPU, press Enter.
    • Show Tree View: Press F5 (Tree). See how processes are nested under their parents. Press F5 again to toggle back to the sorted list.
    • Filter Processes: Press F4 (Filter), type a command name (e.g., bash), press Enter. Only processes matching the filter are shown. Press F4 again, clear the filter, and press Enter to see all processes.
    • Search for a Process: Press F3 (Search), type part of a command name, press Enter. htop will jump to the next match. Press F3 again to find the next one.
    • (Optional - Be Careful!) Kill a safe process: Start sleep 1000 & in another terminal. In htop, find the sleep process (you might need to filter or search). Use the arrow keys to highlight it. Press F9 (Kill). The default signal is 15 SIGTERM. Press Enter to send it. The sleep process should disappear. If you started yes > /dev/null & earlier, you could try killing that with F9 and selecting signal 9 SIGKILL.
    • Access Setup: Press F2 (Setup). Explore the options for customizing meters, display options, and columns. Press F10 (Done) to exit setup.
    • Leave htop: Press F10 (Quit) or q.

Summary: This workshop provided hands-on experience with ps, top, and htop. You learned how to get different views of the process list, extract specific information, sort processes based on resource consumption, and use the interactive features of top and htop for real-time monitoring and basic management tasks.


3. Process Creation

Understanding how new processes come into existence in Linux is fundamental. Unlike some operating systems, Linux primarily uses a combination of two distinct system calls: fork() and exec().

The fork() System Call

  • Purpose: To create a new process.
  • Mechanism: When a process (the parent) calls fork(), the kernel creates a nearly identical copy of that process (the child). This includes copies of:
    • The parent's virtual address space (code, data, stack). Initially, this is often done using Copy-on-Write (CoW). This means the physical memory pages are not actually copied immediately. Both parent and child initially share the same physical pages, marked as read-only. Only when one process attempts to write to a shared page does the kernel create a private copy of that specific page for the writing process. This makes fork() very efficient, especially if the child immediately calls exec().
    • CPU registers (state).
    • Open file descriptors. Both parent and child initially share the same underlying open file table entries in the kernel. This means they point to the same position within a file, which can be important for coordination or lead to unexpected behavior if not managed.
    • Signal handling settings.
    • Current working directory.
    • Resource limits.
  • Return Value: This is how the parent and child differentiate themselves after the call:
    • In the parent process, fork() returns the PID of the newly created child process.
    • In the child process, fork() returns 0.
    • If the fork() call fails (e.g., due to resource limits like exceeding the maximum number of processes), it returns -1 in the parent, and no child process is created.
  • Execution: After a successful fork(), both the parent and child processes continue executing from the instruction immediately following the fork() call. They use the return value to determine which code path to take (parent-specific or child-specific).
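
You can watch these fork() semantics from bash itself, because every subshell is created with fork(). Note that $$ keeps the original shell's PID even in the child, while the bash-specific $BASHPID always reports the current process:

```shell
echo "parent: \$\$=$$  BASHPID=$BASHPID"
( echo "child:  \$\$=$$  BASHPID=$BASHPID" )  # forked copy: same $$, new BASHPID
```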

The exec() Family of System Calls

  • Purpose: To replace the current process image with a new program.
  • Mechanism: When a process calls one of the exec() functions (like execl(), execv(), execle(), execve(), execlp(), execvp()), the kernel does the following:
    • Loads the specified program file into the calling process's address space, overwriting the existing code, data, and stack segments.
    • Resets the process's execution state.
    • Starts executing the new program from its entry point (main() function in C).
  • Key Point: The exec() call, if successful, does not return. The original program code is gone, replaced entirely by the new program. The process itself continues to exist (it keeps the same PID, PPID, open files unless marked close-on-exec, etc.), but it's now running different code. If exec() does return, it means an error occurred (e.g., program not found, permission denied).
  • Variants: The different exec functions vary in how they accept arguments:
    • l (e.g., execl, execlp): Arguments are passed as a null-terminated list directly in the function call (const char *arg0, const char *arg1, ..., NULL).
    • v (e.g., execv, execvp): Arguments are passed as a null-terminated array of character pointers (char *const argv[]).
    • p (e.g., execlp, execvp): The system searches for the executable file in the directories specified by the PATH environment variable if the filename doesn't contain a slash (/).
    • e (e.g., execle, execve): Allows specifying the environment variables for the new program image explicitly via an array (char *const envp[]). execve is the underlying system call that the others often wrap.

The Common Pattern: fork() followed by exec()

This is the standard way shells and many other programs launch new applications in Linux/Unix:

  1. The parent process (e.g., your bash shell) wants to run a command (e.g., ls -l).
  2. The parent calls fork(). Now there are two identical shell processes.
  3. The parent process receives the child's PID from fork(). It might:
    • Wait for the child to complete using wait() or waitpid() (if running the command in the foreground).
    • Continue doing other things (if running the command in the background using &).
  4. The child process receives 0 from fork(). It knows it's the child.
  5. The child process typically calls one of the exec() functions (e.g., execvp("ls", ["ls", "-l", NULL])).
  6. The kernel replaces the child process's shell image with the ls program image. The child process (still with the same PID it got after fork()) now runs ls -l.
  7. When ls finishes, it calls exit().
  8. The parent process (if waiting) gets notified by the kernel via wait(), collects the child's exit status, and continues.
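The eight steps above can be sketched with Python's wrappers for the same system calls — a minimal stand-in for a shell running a foreground command (echo substitutes for ls here):

```python
import os

pid = os.fork()

if pid == 0:
    # Child: replace this process image with the echo program.
    # On success execvp() never returns; the PID stays the same.
    # (Python raises OSError on failure instead of returning -1.)
    try:
        os.execvp("echo", ["echo", "hello from the child"])
    except OSError:
        os._exit(127)  # exec failed, e.g. command not found
else:
    # Parent: wait for the child, exactly as a foreground shell does.
    _, status = os.waitpid(pid, 0)
    exit_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Running a command in the background simply means the parent skips the waitpid() call until later.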

The Init Process (PID 1)

  • The very first user-space process started by the kernel during boot has PID 1.
  • Traditionally, this was the init process. In modern Linux distributions, it's usually systemd.
  • PID 1 is the ultimate ancestor of almost all other user-space processes.
  • It has special responsibilities, including:
    • Starting essential system services and managing them.
    • Adopting Orphaned Processes: If a parent process terminates before its children, those children become "orphaned." The kernel automatically reparents them to PID 1.
    • Reaping Zombies: PID 1 calls wait() whenever one of its children terminates (including adopted orphans), cleaning up the resulting zombie entries. This prevents the long-term accumulation of zombies whose original parents died.

Understanding fork() and exec() explains why creating a new process doesn't automatically mean running a new program, and why running a new program usually involves creating a new process first. This model provides flexibility and resource efficiency (thanks to CoW).
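Orphan adoption can be observed programmatically. The sketch below forks an intermediate process that exits immediately after forking a grandchild; the grandchild then reports its new parent PID through a pipe. The new parent is usually 1, though on systems with a "subreaper" (a container init, or a systemd user session) it may be another PID — the one guarantee is that it is no longer the dead intermediate process:

```python
import os
import time

r, w = os.pipe()

pid = os.fork()
if pid == 0:
    # Intermediate process: fork a grandchild, then exit at once,
    # orphaning it.
    inter = os.getpid()
    if os.fork() == 0:
        # Grandchild: wait until the kernel has reparented us.
        while os.getppid() == inter:
            time.sleep(0.01)
        os.write(w, str(os.getppid()).encode())
        os._exit(0)
    os._exit(0)

os.waitpid(pid, 0)            # reap the intermediate process
os.close(w)
new_parent = int(os.read(r, 32))  # PID 1, or the nearest subreaper
os.close(r)
```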

Workshop Launching and Observing Child Processes

This workshop focuses on observing the fork/exec pattern in action using shell commands and process viewing tools.

Objective: Launch processes in the background and foreground, observe their PIDs and PPIDs, and see the parent-child relationship using ps.

Tools: Shell (bash, zsh, etc.), sleep, ps.

Steps:

  1. Identify Your Shell's PID:

    • Remind yourself of your current shell's PID:
      echo $$
      
      • Let's say your shell's PID is 20000.
  2. Run a Command in the Foreground:

    • Execute a simple command:
      ls /tmp
      
    • Conceptual Explanation:
      • Your shell (PID 20000) calls fork(). A child shell process is created (e.g., PID 20050).
      • The parent shell (20000) waits.
      • The child shell (20050) calls execvp("ls", ["ls", "/tmp", NULL]). Its image is replaced by ls.
      • The ls program (still PID 20050) runs, lists /tmp, and then calls exit().
      • The parent shell (20000) receives the exit status from wait() and gives you the prompt back.
    • Observation: This happens too fast to easily observe with ps. This just illustrates the underlying mechanism for standard command execution.
  3. Run a Command in the Background:

    • Now, run a command that takes some time, like sleep, in the background:
      sleep 30 &
      
    • Output: The shell will likely print something like [1] 20055. This is the job number ([1]) and the PID of the child process (20055).
    • Conceptual Explanation:
      • Your shell (PID 20000) calls fork(). A child shell process (PID 20055) is created.
      • The parent shell (20000) does not wait because of the &. It immediately gives you the prompt back.
      • The child process (20055) calls execvp("sleep", ["sleep", "30", NULL]). Its image is replaced by sleep.
      • The sleep program (PID 20055) runs for 30 seconds. When it finishes, it exits. The shell might print a "Done" message later when it reaps the child.
  4. Observe the Background Process Parent:

    • While the sleep 30 command is still running (within 30 seconds), use ps to check its details, specifically its PPID. Use the PID reported in the previous step (20055 in our example):
      ps -p 20055 -o pid,ppid,state,cmd
      
    • Output:
        PID  PPID S CMD
      20055 20000 S sleep 30
      
      • Notice that the PPID is 20000, which is the PID of your interactive shell. This confirms the parent-child relationship established by fork().
  5. Create a Process Chain (Subshell):

    • Shells can also create subshells without necessarily executing an external command immediately. Parentheses () create a subshell.
    • Run the following:
      ( sleep 60 & )
      
    • Explanation:
      • The outer parentheses ( ... ) cause your main shell (20000) to fork() a child shell (let's say PID 20060).
      • This child shell (20060) then executes the command inside the parentheses: sleep 60 &.
      • To do this, the child shell (20060) itself calls fork() to create another child process (let's say PID 20061).
      • The parent subshell (20060) doesn't wait (due to &) and exits almost immediately.
      • The grandchild process (20061) calls execvp("sleep", ["sleep", "60", NULL]) and runs sleep 60.
    • Observation: Find the sleep 60 process. Since the intermediate shell (20060) likely exited very quickly, the sleep process (20061) might become orphaned.
      pgrep sleep # Find PIDs of all sleep processes
      # Let's assume pgrep returns 20061
      ps -p 20061 -o pid,ppid,state,cmd
      
    • Expected Output (Potentially):
        PID PPID S CMD
      20061     1 S sleep 60
      # OR, if the original shell adopted it somehow (less common for this simple case):
      # 20061 20000 S sleep 60
      
      • If the PPID is 1 (or the PID of systemd/init), it means the intermediate parent shell (20060) terminated, and the sleep process (20061) was orphaned and adopted by PID 1. This demonstrates orphan adoption.
  6. Clean Up:

    • You might still have sleep processes running. Find them and kill them:
      pkill sleep
      

Summary: In this workshop, you practiced launching commands in the background using &. By examining the PID and PPID with ps, you directly observed the parent-child relationship created by the shell's use of fork(). You also explored how subshells and backgrounding can lead to orphaned processes being adopted by PID 1, illustrating another key aspect of the process lifecycle and hierarchy. This reinforces the concepts of fork, exec, parent/child relationships, and process adoption.


4. Process Termination and Signals

Processes don't run forever; they eventually terminate. Termination can be voluntary (the process finishes its work) or involuntary (it's forced to stop). Signals are the primary mechanism used in Linux/Unix systems for communicating events or requests, including termination requests, between processes or between the kernel and a process.

Process Termination

  1. Normal Exit:

    • A process typically terminates normally by calling exit() (the standard C library function, which flushes I/O buffers and runs registered cleanup handlers) or the underlying _exit() system call directly.
    • The process provides an exit status (an integer value between 0 and 255) to the kernel.
    • Convention: An exit status of 0 indicates success. A non-zero exit status indicates failure or a specific error condition.
    • The kernel stores this exit status in the terminated process's (now zombie) process table entry until the parent process retrieves it using a wait() family system call.
  2. Abnormal Termination (via Signals):

    • A process can be terminated prematurely if it receives a signal whose default action is to terminate.
    • Common terminating signals include SIGTERM, SIGKILL, SIGINT, SIGHUP, SIGQUIT, SIGSEGV.
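Both outcomes — a normal exit and death by signal — are encoded in the status value the parent retrieves with wait(). A sketch in Python, whose os module exposes the standard W* decoding macros:

```python
import os
import signal

# Child 1 exits normally with status 3.
pid1 = os.fork()
if pid1 == 0:
    os._exit(3)
_, status1 = os.waitpid(pid1, 0)

# Child 2 is killed by SIGKILL before it exits on its own.
pid2 = os.fork()
if pid2 == 0:
    signal.pause()   # sleep until a signal arrives
    os._exit(0)
os.kill(pid2, signal.SIGKILL)
_, status2 = os.waitpid(pid2, 0)

normal_exit = os.WIFEXITED(status1)     # True: clean exit
exit_code   = os.WEXITSTATUS(status1)   # 3
killed      = os.WIFSIGNALED(status2)   # True: killed by a signal
term_signal = os.WTERMSIG(status2)      # 9 (SIGKILL)
```

Shells surface the same information: `$?` is the exit status for a normal exit, or 128 plus the signal number for a signal death.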

Signals

Signals are asynchronous software interrupts delivered to a process. They notify a process that a particular event has occurred.

  • Sources of Signals:

    • Kernel: Notifying the process of hardware exceptions (e.g., SIGSEGV - segmentation fault/invalid memory access, SIGFPE - floating-point exception) or software events (e.g., SIGPIPE - writing to a pipe with no reader, SIGALRM - timer expired).
    • Other Processes: A process with appropriate permissions can explicitly send a signal to another process using the kill() system call (or commands like kill, pkill).
    • Terminal Driver: When you press certain key combinations in the terminal (e.g., Ctrl+C, Ctrl+Z), the terminal driver sends signals (SIGINT, SIGTSTP, respectively) to the foreground process group.
  • Signal Handling: When a signal is delivered to a process, the process can take one of three actions:

    1. Default Action: Every signal has a default action defined by the system. Common defaults include:
      • Terminate the process.
      • Terminate the process and dump core (create a file with the process's memory image for debugging).
      • Ignore the signal.
      • Stop the process.
      • Continue the process (if stopped).
    2. Catch the Signal: The process can register a specific function (a signal handler) to be executed when the signal arrives. This allows the process to perform custom actions, like gracefully shutting down, reloading configuration, or cleaning up resources before exiting.
    3. Ignore the Signal: The process can explicitly request that the signal be ignored.
  • Exceptions: Two signals, SIGKILL (9) and SIGSTOP (19), cannot be caught, blocked, or ignored by the process. They always perform their default action (terminate immediately and stop, respectively). This provides a reliable way for the kernel and system administrator to control processes.
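The three dispositions — catch, ignore, default — and the SIGKILL exception can be demonstrated with Python's signal module. A sketch in which a process signals itself:

```python
import os
import signal

caught = []

def handler(signum, frame):
    # Custom action: record the signal instead of terminating.
    caught.append(signum)

signal.signal(signal.SIGTERM, handler)          # 1. catch SIGTERM
signal.signal(signal.SIGINT, signal.SIG_IGN)    # 2. ignore SIGINT

me = os.getpid()
os.kill(me, signal.SIGTERM)   # runs handler instead of terminating
os.kill(me, signal.SIGINT)    # silently discarded

# SIGKILL cannot be caught: the kernel rejects the attempt.
try:
    signal.signal(signal.SIGKILL, handler)
    kill_catchable = True
except OSError:
    kill_catchable = False

# Restore the default dispositions.
signal.signal(signal.SIGTERM, signal.SIG_DFL)
signal.signal(signal.SIGINT, signal.default_int_handler)
```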

Common Signals and Their Default Actions:

Signal Name Number Default Action Description / Common Use
SIGHUP 1 Terminate Hangup detected on controlling terminal or death of session leader. Often used to signal daemons to reload configuration.
SIGINT 2 Terminate Interrupt from keyboard (Ctrl+C). Request to interrupt/stop.
SIGQUIT 3 Terminate (Core) Quit from keyboard (Ctrl+\). Similar to SIGINT but dumps core.
SIGILL 4 Terminate (Core) Illegal Instruction. CPU detected an invalid instruction.
SIGABRT 6 Terminate (Core) Abort signal. Usually sent by a process to itself on assertion failure.
SIGFPE 8 Terminate (Core) Floating-Point Exception (e.g., division by zero).
SIGKILL 9 Terminate Kill signal. Cannot be caught or ignored. Forceful termination.
SIGUSR1 10 Terminate User-defined signal 1. Application specific use.
SIGUSR2 12 Terminate User-defined signal 2. Application specific use.
SIGSEGV 11 Terminate (Core) Segmentation Violation. Invalid memory reference.
SIGPIPE 13 Terminate Broken pipe: write to a pipe with no process reading from it.
SIGALRM 14 Terminate Alarm clock timer expired (set by alarm() or setitimer()).
SIGTERM 15 Terminate Termination signal. Standard, polite request to terminate. Can be caught/ignored. Preferred over SIGKILL.
SIGCHLD 17 Ignore Child process terminated, stopped, or continued.
SIGCONT 18 Continue/Ignore Continue if stopped, otherwise ignore.
SIGSTOP 19 Stop Stop signal. Cannot be caught or ignored. Pauses execution.
SIGTSTP 20 Stop Stop typed at terminal (Ctrl+Z). Can be caught/ignored.

You can get a full list on your system using kill -l or man 7 signal. (The numbers above are typical for x86/ARM Linux; some signal numbers vary between architectures, so prefer names over numbers in scripts.)

Commands for Sending Signals:

  1. kill <PID>

    • Sends a signal to the specified process ID.
    • By default (if no signal is specified), it sends SIGTERM (15).
    • Syntax: kill [-s <signal_name_or_number>] <PID> ... or kill -<signal_number> <PID> ...
    • Examples:
      • kill 12345: Sends SIGTERM to PID 12345 (polite request to terminate).
      • kill -9 12345: Sends SIGKILL to PID 12345 (forceful termination).
      • kill -SIGINT 12345: Sends SIGINT (same as Ctrl+C) to PID 12345.
      • kill -SIGHUP 12345: Sends SIGHUP (often for config reload) to PID 12345.
  2. pkill <pattern>

    • Sends a signal to processes whose name (or other attributes) matches the provided pattern.
    • Useful when you know the name of the process but not its exact PID, or want to signal multiple instances.
    • By default, sends SIGTERM.
    • Syntax: pkill [-signal] [-f] <pattern>
      • -f: Match the pattern against the full command line, not just the process name.
    • Examples:
      • pkill firefox: Sends SIGTERM to all processes named firefox.
      • pkill -9 troublesome_script.sh: Sends SIGKILL to processes named troublesome_script.sh.
      • pkill -f "python my_long_running_job.py": Sends SIGTERM to processes whose command line contains this string.
  3. killall <process_name>

    • Similar to pkill, but typically matches only the exact process name. Behavior can sometimes differ slightly between implementations.
    • By default, sends SIGTERM.
    • Syntax: killall [-signal] <process_name>
    • Example:
      • killall -SIGKILL nginx: Sends SIGKILL to all processes exactly named nginx.

Choosing Between SIGTERM and SIGKILL:

  • Always try SIGTERM (15) first. This gives the process a chance to shut down cleanly: save its state, close files, clean up temporary resources, and terminate gracefully. This is the polite way.
  • Use SIGKILL (9) only as a last resort if a process is unresponsive to SIGTERM or is causing severe problems. SIGKILL terminates the process immediately without giving it any chance to clean up. This can lead to data corruption, orphaned resources, or inconsistent application state. It's the "pull the plug" approach.
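This escalation policy — SIGTERM, a grace period, then SIGKILL only if needed — is easy to script, and is the same pattern service managers such as systemd follow when stopping a unit. A sketch using Python's subprocess module:

```python
import signal
import subprocess

# Spawn a long-running process to terminate.
proc = subprocess.Popen(["sleep", "1000"])

# Polite first: SIGTERM, then a grace period for cleanup.
proc.send_signal(signal.SIGTERM)
try:
    proc.wait(timeout=5)      # give it up to 5 seconds to exit cleanly
except subprocess.TimeoutExpired:
    proc.kill()               # last resort: SIGKILL
    proc.wait()

# Popen reports death-by-signal as a negative return code.
terminated_by = proc.returncode   # -15 here: sleep honors SIGTERM
```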

Workshop Sending Signals to Processes

In this workshop, you'll practice sending different signals (SIGTERM, SIGKILL, SIGSTOP, SIGCONT) to processes using the kill and pkill commands.

Objective: Observe the effects of different signals on a running process. Understand the difference between SIGTERM and SIGKILL. Practice using kill with PIDs and pkill with names.

Tools: sleep, yes, kill, pkill, ps, shell job control (Ctrl+Z, jobs, fg, bg).

Steps:

  1. Prepare Target Processes:

    • Open two separate terminal windows (or use terminal multiplexer panes like tmux).
    • In Terminal 1, start a simple sleep process in the background:
      # Terminal 1
      echo "My PID is $$" # Note your shell's PID
      sleep 1000 &
      jobs -l # Note the PID of the sleep process (e.g., 30010)
      
    • In Terminal 2, start a CPU-intensive yes process in the background:
      # Terminal 2
      echo "My PID is $$" # Note your shell's PID
      yes > /dev/null &
      jobs -l # Note the PID of the yes process (e.g., 31020)
      
  2. Send SIGTERM (Polite Termination):

    • From either terminal, send SIGTERM to the sleep process using its PID (replace 30010 with the actual PID):
      kill 30010
      # OR explicitly: kill -SIGTERM 30010 OR kill -15 30010
      
    • Switch back to Terminal 1 where sleep was started. You should see a message like [1]+ Terminated sleep 1000 after a brief moment.
    • Verify it's gone using ps or jobs -l in Terminal 1.
    • Explanation: sleep is designed to handle SIGTERM and exit cleanly.
  3. Send SIGKILL (Forceful Termination):

    • The yes process is likely still running and consuming CPU. Let's try SIGTERM first (using pkill this time for variety).
      pkill yes
      
    • Check if it terminated (use jobs -l in Terminal 2 or ps aux | grep yes). Some versions of yes might ignore SIGTERM or not terminate immediately.
    • If yes is still running, or to see the effect of SIGKILL, send SIGKILL:
      pkill -9 yes
      # OR: pkill -SIGKILL yes
      # OR find its PID (e.g., 31020) and use: kill -9 31020
      
    • Check again. The yes process should now be gone immediately.
    • Explanation: SIGKILL cannot be ignored and forces immediate termination by the kernel.
  4. Stop and Continue a Process:

    • Start another sleep process in Terminal 1:
      # Terminal 1
      sleep 1000 &
      jobs -l # Note the PID (e.g., 30050)
      
    • Send the SIGSTOP signal using kill (remember, SIGSTOP cannot be caught):
      kill -SIGSTOP 30050
      # OR: kill -19 30050
      
    • Check its status in Terminal 1:
      # Terminal 1
      jobs -l
      
      • Output: You should see the state indicated as Stopped.
    • Verify with ps:
      ps -p 30050 -o pid,state,cmd
      
      • Output: Should show state T.
    • Now, send the SIGCONT signal to resume it:
      kill -SIGCONT 30050
      # OR: kill -18 30050
      
    • Check its status again in Terminal 1:
      # Terminal 1
      jobs -l
      
      • Output: The state should now be Running (meaning it's back in the background, likely sleeping).
    • Verify with ps:
      ps -p 30050 -o pid,state,cmd
      
      • Output: Should show state S.
    • Clean up:
      kill 30050 # Send SIGTERM
      
  5. Interactive Stop (Ctrl+Z) and SIGINT (Ctrl+C):

    • Start a foreground process in Terminal 1:
      # Terminal 1
      sleep 500
      
    • Press Ctrl+Z. This sends SIGTSTP (Terminal Stop). The shell should indicate the job is stopped, and you get your prompt back. Check with jobs -l.
    • Bring it back to the foreground: fg %<job_number>.
    • Now, press Ctrl+C. This sends SIGINT (Interrupt). The sleep command should terminate, and you get your prompt back.
    • Explanation: This demonstrates how terminal control keys map to specific signals (SIGTSTP, SIGINT) sent to the foreground process.
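The stop/continue cycle from steps 4 and 5 can also be driven programmatically. The sketch below reads the process state letter straight from Linux's /proc (the same source ps uses), so it is Linux-specific:

```python
import os
import signal
import subprocess
import time

def proc_state(pid):
    # State is the field after the "(comm)" in /proc/<pid>/stat.
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]

proc = subprocess.Popen(["sleep", "60"])
time.sleep(0.1)

os.kill(proc.pid, signal.SIGSTOP)   # like: kill -SIGSTOP <pid>
time.sleep(0.1)
stopped_state = proc_state(proc.pid)   # 'T' = stopped

os.kill(proc.pid, signal.SIGCONT)   # like: kill -SIGCONT <pid>
time.sleep(0.1)
running_state = proc_state(proc.pid)   # back to 'S' (sleeping)

proc.terminate()   # clean up with SIGTERM
proc.wait()
```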

Summary: In this workshop, you used kill and pkill to send signals (SIGTERM, SIGKILL, SIGSTOP, SIGCONT) to processes, observing their different effects. You saw the importance of trying SIGTERM before resorting to SIGKILL. You also practiced stopping and resuming processes using signals and related this to the familiar Ctrl+Z and Ctrl+C terminal actions. This practical experience solidifies your understanding of signal handling and process control.


5. Process Scheduling and Priorities

In any multitasking operating system like Linux, there are usually more runnable processes than available CPU cores. The scheduler is the core component of the kernel responsible for deciding which runnable process gets to use a CPU core, when, and for how long. Process scheduling aims to balance competing goals like fairness, responsiveness (for interactive tasks), and throughput (for batch tasks).

The Scheduler's Role

  • Multiplexing: Share the CPU(s) among multiple processes over time, creating the illusion of simultaneous execution.
  • Resource Allocation: Decide which process gets the valuable CPU resource next.
  • Context Switching: When the scheduler decides to switch from one process (A) to another (B), it needs to:
    1. Save the complete execution context of process A (CPU registers, program counter, etc.).
    2. Load the previously saved execution context of process B.
    3. Resume execution of process B. This context switch has overhead and takes a small amount of time.

Scheduling Algorithms

Linux has employed various scheduling algorithms over its history. Modern kernels typically use sophisticated algorithms like the Completely Fair Scheduler (CFS) for normal (non-real-time) processes.

  • CFS Goal: To distribute processor time fairly among runnable processes. It tries to give each process an "ideal" share of the CPU based on its priority (nice value). It uses a red-black tree data structure to efficiently find the process that has run the least relative to its fair share.
  • Time Slices: Instead of fixed time slices, CFS assigns processes CPU time proportionally based on their weight (derived from their nice value). Processes that have received less than their fair share get priority.
  • Real-time Scheduling: Linux also supports real-time scheduling policies (SCHED_FIFO, SCHED_RR) for processes with strict timing requirements. These have higher priority than normal CFS processes and are scheduled differently (e.g., FIFO runs until it blocks or yields, RR runs for a fixed time slice before potentially yielding to another RR process of the same priority).

Process Priorities

Not all processes are equally important. Linux allows influencing the scheduler's decisions through process priorities. There are two main types:

  1. Static Priority (Real-time):

    • Used for real-time processes (SCHED_FIFO, SCHED_RR).
    • Ranges from 1 to 99 (higher value = higher priority); static priority 0 is reserved for the normal (non-real-time) policies.
    • Real-time processes always run before any normal CFS processes if they are runnable.
    • Requires special privileges (root or CAP_SYS_NICE capability) to set.
  2. Dynamic Priority (Normal Processes / CFS):

    • Used for regular user processes (SCHED_NORMAL, SCHED_BATCH, SCHED_IDLE).
    • Influenced by the nice value.
    • Nice Value: An integer ranging from -20 to +19.
      • -20: Highest priority (the least "nice" to other processes — it receives more CPU time relative to others). Requires root privileges to set negative values.
      • 0: Default priority for processes started by normal users.
      • +19: Lowest priority (most "nice" to the system, gets less CPU time when there's contention). Any user can increase the niceness (lower the priority) of their own processes.
    • The kernel uses the nice value to calculate the process's weight within CFS. A lower nice value means a higher weight, resulting in a larger proportional share of the CPU time over the long run. It's a relative priority among CFS processes.
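The asymmetry in the rules — any process may lower its own priority, but raising it needs privileges — can be seen with os.nice(), Python's wrapper for nice(2). A sketch that makes the change in a forked child so it does not persist:

```python
import os

pid = os.fork()
if pid == 0:
    start = os.nice(0)        # an increment of 0 just reports the value
    after = os.nice(5)        # add 5: lower this process's priority
    os._exit(after - start)   # report the delta via the exit status

_, status = os.waitpid(pid, 0)
delta = os.WEXITSTATUS(status)  # 5 when starting from the usual nice 0
```

Trying os.nice(-5) as a normal user fails with a PermissionError, mirroring the renice rules below.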

CPU Bound vs. I/O Bound Processes

The scheduler often tries to differentiate between two types of processes:

  • CPU-Bound: Processes that spend most of their time performing computations and rarely block for I/O. Examples: complex simulations, video encoding, data analysis loops. CFS tries to give them fair CPU time but doesn't necessarily need to prioritize them for responsiveness.
  • I/O-Bound: Processes that spend most of their time waiting for I/O operations (disk reads/writes, network activity, terminal input). Examples: text editors, web servers waiting for requests, shells waiting for commands. They run in short bursts between I/O waits. The scheduler often tries to prioritize I/O-bound processes when they become runnable (after I/O completes) to improve system responsiveness and keep interactive applications feeling quick.

Commands for Managing Priorities (nice and renice)

  1. nice

    • Purpose: Launch a new command with a specified niceness level (priority).
    • Syntax: nice -n <adjustment> <command> [arguments...]
      • <adjustment>: An integer added to the inherited niceness (the result is clamped to the -20..19 range). If -n is omitted, nice adds 10 to the parent's niceness (making the command lower priority).
      • Only root can apply negative adjustments (raising priority). Normal users can only apply positive adjustments (lowering priority).
    • Examples:
      • nice -n 15 ./my_batch_job.sh: Start the script with a low priority (niceness 15).
      • nice ./my_script.sh: Start the script with niceness 10 (if parent was 0).
      • sudo nice -n -5 ./important_task: Start the task with higher priority (niceness -5).
  2. renice

    • Purpose: Change the niceness level of already running processes.
    • Syntax: renice [-n] <priority> [-p <PID>...] [-g <PGRP>...] [-u <user>...]
      • <priority>: The absolute niceness value to set (from -20 to 19).
      • -p <PID>: Specify process IDs to change.
      • -g <PGRP>: Specify process group IDs to change.
      • -u <user>: Specify username or UID; change priority for all processes owned by that user.
      • The -n is often optional when setting the absolute priority.
    • Permissions:
      • A user can only renice processes they own, and can only increase the niceness value (lower the priority).
      • Root can renice any process to any valid niceness value.
    • Examples:
      • renice 10 -p 12345: Set the niceness of PID 12345 to 10 (lower priority). (Requires ownership or root).
      • sudo renice -5 -p 12345: Set the niceness of PID 12345 to -5 (higher priority). (Requires root).
      • renice 19 -u bob: Set the niceness of all of user bob's processes to 19 (lowest priority). (Requires root or being user bob).
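renice is a thin wrapper around the setpriority(2) system call, which Python exposes directly. A sketch that renices a running child process — it assumes the current niceness is below 10, so the change counts as a permitted decrease in priority:

```python
import os
import subprocess

# Start a child process to act on.
proc = subprocess.Popen(["sleep", "30"])

before = os.getpriority(os.PRIO_PROCESS, proc.pid)  # usually 0
os.setpriority(os.PRIO_PROCESS, proc.pid, 10)       # like: renice 10 -p <pid>
after = os.getpriority(os.PRIO_PROCESS, proc.pid)

proc.terminate()
proc.wait()
```

PRIO_PGRP and PRIO_USER correspond to renice's -g and -u options.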

Understanding scheduling and priorities helps you manage long-running tasks, ensure interactive performance, and optimize resource allocation on your Linux system.

Workshop Adjusting Process Priorities

In this workshop, you will launch CPU-intensive processes and use nice and renice to observe the effect of changing priorities on CPU allocation, as viewed through top or htop.

Objective: Understand how to launch processes with non-default priorities using nice and modify the priority of running processes using renice. Observe the impact on CPU usage in a multi-process environment.

Tools: yes, nice, renice, top or htop, pkill.

Prerequisites: A system with at least two CPU cores is helpful to clearly see the effects, but it will also work (though less dramatically) on a single-core system.

Steps:

  1. Start Baseline CPU Load:

    • Open a terminal (Terminal 1).
    • Start a CPU-bound process with default priority (niceness 0):
      # Terminal 1
      yes > /dev/null &
      jobs -l # Note the PID (e.g., 40010)
      
  2. Monitor with top or htop:

    • Open another terminal (Terminal 2).
    • Start htop (preferred) or top:
      # Terminal 2
      htop # or top
      
    • Observe the yes process. It should be consuming close to 100% of one CPU core. Note its NI (Nice) value (should be 0) and %CPU.
  3. Start a Second CPU Load (Default Priority):

    • Go back to Terminal 1.
    • Start a second yes process, also with default priority:
      # Terminal 1
      yes > /dev/null &
      jobs -l # Note the PID (e.g., 40020)
      
    • Go back to Terminal 2 (htop/top).
    • Observe the two yes processes. They should now be sharing the CPU time relatively equally (each getting around 50% on a single core being shared, or each consuming close to 100% if you have 2+ cores available for them). Both should have NI = 0.
  4. Start a Third CPU Load with Lower Priority (nice):

    • Go back to Terminal 1.
    • Start a third yes process, but this time give it a lower priority (higher niceness) using the nice command:
      # Terminal 1
      nice -n 10 yes > /dev/null &
      jobs -l # Note the PID (e.g., 40030)
      
    • Go back to Terminal 2 (htop/top).
    • Observe the three yes processes.
      • The first two (NI=0) should still be getting a significantly larger share of the CPU time compared to the third one.
      • The third yes process should have NI = 10 and a noticeably lower %CPU value.
      • Explanation: The scheduler (CFS) sees that the third process has a lower weight due to its higher niceness value and allocates it proportionally less CPU time when the other higher-priority processes are also runnable.
  5. Change Priority of a Running Process (renice):

    • Let's make the first yes process (PID 40010, NI=0) even less important than the third one. We will increase its niceness value using renice.
    • From Terminal 1 (or any terminal):
      renice 15 -p 40010
      
      • You might see output like: 40010 (process ID) old priority 0, new priority 15
    • Go back to Terminal 2 (htop/top).
    • Observe the processes again:
      • The second yes process (PID 40020, NI=0) should now be getting the largest share of CPU time.
      • The third yes process (PID 40030, NI=10) should get the next largest share.
      • The first yes process (PID 40010), which we just reniced, should now have NI = 15 and the smallest share of CPU time.
  6. (Optional - Requires Root) Give a Process Higher Priority (renice):

    • Let's make the third process (PID 40030, currently NI=10) the highest priority. This requires root privileges.
    • From Terminal 1 (or any terminal):
      sudo renice -5 -p 40030
      
      • You'll be prompted for your password.
    • Go back to Terminal 2 (htop/top).
    • Observe:
      • The third yes process (PID 40030) should now have NI = -5 and should be consuming the most CPU time, significantly more than the others.
      • The second process (PID 40020, NI=0) gets the next share.
      • The first process (PID 40010, NI=15) gets the least.
  7. Clean Up:

    • Stop all the yes processes using pkill:
      # In Terminal 1 or any terminal
      pkill yes
      
    • Close htop/top (press q).

Summary: This workshop demonstrated the practical use of nice to start processes with altered priority and renice to change the priority of running processes. By observing CPU usage in htop/top, you saw how the Linux scheduler allocates CPU time based on niceness values, giving preference to processes with lower niceness (higher priority). You also saw that setting higher priorities (negative niceness) requires root privileges.


6. Background and Foreground Processes (Job Control)

When you work in an interactive shell (like bash or zsh), you often run commands one after another. Typically, the shell waits for one command to finish before prompting you for the next. This is running a command in the foreground. However, shells also provide job control features that allow you to manage multiple processes concurrently, switching them between the foreground and background, and suspending/resuming them.

Foreground vs. Background

  • Foreground Process: A process running in the foreground has control of the terminal.

    • It can read input directly from the keyboard.
    • Its standard output and standard error are typically connected to the terminal display.
    • The shell usually waits for the foreground process to complete before issuing a new prompt.
    • It receives terminal-generated signals like SIGINT (Ctrl+C) and SIGTSTP (Ctrl+Z).
    • There can only be one foreground process group associated with a terminal at any time.
  • Background Process: A process running in the background does not have control of the terminal.

    • It cannot directly read input from the keyboard (attempting to do so usually causes it to be stopped by a SIGTTIN signal).
    • Its standard output and standard error are still typically connected to the terminal (unless redirected), so you might see its output interspersed with your foreground commands.
    • The shell does not wait for background processes to complete; it immediately gives you a new prompt.
    • It does not receive keyboard-generated signals like SIGINT or SIGTSTP.

Shell Job Control Commands

Most modern shells provide built-in commands for managing jobs (a job is essentially a pipeline of one or more processes started from the shell):

  1. & (Ampersand)

    • Purpose: Place this character at the end of a command line to start the command (or pipeline) as a background job.
    • Example:
      sleep 300 &
      # Shell prints job number and PID, then returns prompt immediately
      # [1] 50010
      
  2. jobs

    • Purpose: List the active jobs (backgrounded or stopped) associated with the current shell session.
    • Output: Shows job number (e.g., [1]), state (Running, Stopped), and the command.
    • Options:
      • jobs -l: Also display the Process ID (PID) of the job's process group leader.
      • jobs -p: List only the PIDs of the process group leaders.
    • Example:
      jobs -l
      # [1]+ 50010 Running                 sleep 300 &
      # [2]- 50015 Stopped                 vim my_file.txt
      
      • The + indicates the "current" job (default for fg/bg).
      • The - indicates the "previous" job.
  3. Ctrl+Z

    • Purpose: Suspend (stop) the current foreground process.
    • Action: Sends the SIGTSTP signal to the foreground process group. The process pauses execution, and the shell regains control of the terminal, usually printing the job number and "Stopped".
    • Example: Run sleep 300, then press Ctrl+Z.
  4. bg (Background)

    • Purpose: Resume a stopped job and run it in the background.
    • Syntax: bg [%<job_number>]
      • If %<job_number> is omitted, it usually acts on the "current" job (marked with + by jobs). You can specify a job using its number (e.g., %1, %2).
    • Example: After stopping sleep 300 with Ctrl+Z (let's say it's job [1]), run:
      bg %1
      # Shell usually prints: [1]+ sleep 300 &
      
      Now sleep 300 is running in the background.
  5. fg (Foreground)

    • Purpose: Bring a backgrounded or stopped job into the foreground.
    • Syntax: fg [%<job_number>]
      • If %<job_number> is omitted, it usually acts on the "current" job (+).
    • Action: The shell gives control of the terminal back to the specified job. The shell waits for this job to complete or be suspended again.
    • Example: If sleep 300 is running in the background as job [1]:
      fg %1
      # The sleep 300 command now occupies the terminal
      # Press Ctrl+C to terminate it or Ctrl+Z to stop it again
      
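The suspend/resume cycle above is normally driven interactively, but it can be sketched in a non-interactive script as well: bash's set -m enables job control in scripts (where it is normally off), and kill -TSTP stands in for Ctrl+Z. This is a bash-specific sketch, not portable shell:

```shell
#!/bin/bash
set -m                 # enable job control (normally off in scripts)

sleep 30 &             # job [1], running in the background
kill -TSTP %1          # suspend it, as Ctrl+Z would for a foreground job
sleep 0.3              # give the signal time to be delivered
jobs                   # lists the job as Stopped

bg %1                  # resume it in the background
sleep 0.3
jobs                   # now lists the job as Running

kill %1                # clean up
```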

Job Specifiers (% Notation):

  • %N: Job number N (e.g., %1).
  • %string: Job whose command starts with string (e.g., %sleep).
  • %?string: Job whose command contains string.
  • %% or %+: The current job (marked with +).
  • %-: The previous job (marked with -).
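Job specifiers are accepted by kill as well as by fg and bg, which makes them easy to try even in a script. A small sketch using the notations above (the sleep durations are arbitrary):

```shell
sleep 111 &      # becomes job [1]
sleep 222 &      # becomes job [2]
jobs             # shows both jobs as Running
kill %?222       # %?string: the job whose command line contains "222"
kill %1          # %N: job number 1 (the 'sleep 111' job)
```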

Detaching Processes Completely: nohup and disown

What happens to background processes started with & if you close the terminal or log out? When the terminal goes away, the kernel sends SIGHUP (Hangup) to the shell, and interactive shells such as bash typically forward it to their jobs (bash also sends it on normal logout when the huponexit option is set). Since the default action of SIGHUP is termination, those jobs usually die. To keep a process running even after you log out, you need to detach it more thoroughly:

  1. nohup <command> &

    • Purpose: Run a command immune to hangups (SIGHUP) and redirect its output.
    • Action:
      • Prevents the SIGHUP signal from terminating the command when the terminal closes.
      • Redirects standard output and standard error to a file named nohup.out in the current directory (or $HOME/nohup.out if the current directory isn't writable), unless already redirected elsewhere.
      • Does not itself put the command in the background; without a trailing &, nohup runs the command in the foreground (merely immune to SIGHUP). Appending & is what actually backgrounds it and gives you the prompt back immediately.
    • Example:
      nohup ./my_long_script.sh > script.log 2>&1 &
      # This runs the script, immune to hangups.
      # stdout goes to script.log. stderr is redirected to stdout (so it also goes to script.log).
      # The final & puts it in the background of the current shell.
      
  2. disown (Bash/Zsh built-in)

    • Purpose: Remove a job from the shell's active job table. This prevents the shell from sending SIGHUP to it on exit.
    • Syntax: disown [-h] [%<job_id>]
      • If no job ID is given, acts on the current job.
      • -h: Mark the job so SIGHUP is not sent, but keep it in the job table (less common). Without -h, it's removed entirely.
    • Usage: Often used after starting a job in the background (&) or stopping (Ctrl+Z) and backgrounding (bg).
    • Example:
      ./my_other_script.sh &
      jobs -l # Find its job number, e.g., [1]
      disown %1
      # Now, closing the shell will not send SIGHUP to my_other_script.sh
      
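Whether the hangup immunity actually works can be checked without closing any terminal, by delivering SIGHUP to the process by hand. A sketch (output redirected so no nohup.out file is created):

```shell
nohup sleep 30 >/dev/null 2>&1 &
pid=$!                       # PID of the nohup'd sleep
kill -HUP "$pid"             # deliver the hangup signal directly
sleep 0.3
if kill -0 "$pid" 2>/dev/null; then   # kill -0: test whether the PID is alive
    echo "still running after SIGHUP"
fi
kill "$pid"                  # clean up with SIGTERM
```

A plain sleep 30 & subjected to the same kill -HUP would be terminated, since SIGHUP's default action is to terminate the process.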

Advanced Session Management: screen and tmux

For managing long-running processes, multiple shells, and sessions that persist even if your network connection drops, terminal multiplexers like screen and tmux are invaluable tools. They allow you to:

  • Create persistent sessions on a remote server.
  • Detach from a session, log out, log back in later, and reattach to the session exactly as you left it (including running processes).
  • Have multiple virtual "windows" (like tabs) and "panes" (split views) within a single terminal connection.

While a deep dive into screen or tmux is beyond the scope of this section, they are the standard solution for robustly managing processes that need to outlive your terminal login session. nohup and disown are simpler mechanisms suitable for specific cases.

Workshop Managing Foreground and Background Jobs

This workshop provides hands-on practice with shell job control features.

Objective: Learn to start jobs in the background, stop foreground jobs, list active jobs, and switch jobs between foreground and background. Practice using nohup.

Tools: Shell (bash, zsh, etc.), sleep, vim (or nano, gedit - any simple text editor), jobs, fg, bg, nohup, pkill.

Steps:

  1. Start a Background Job:

    • Open a terminal.
    • Start a sleep command in the background:
      sleep 600 &
      
    • Note the job number and PID printed by the shell.
    • Verify it's running in the background using jobs:
      jobs -l
      
      • You should see the sleep command listed as Running.
  2. Start and Stop a Foreground Job:

    • Start a simple text editor (like vim or nano) with a dummy filename. This will run in the foreground.
      vim temporary_file.txt
      # Or: nano temporary_file.txt
      
    • Your terminal is now controlled by the editor.
    • Suspend the editor: Press Ctrl+Z.
    • The shell should print a message like [2]+ Stopped vim temporary_file.txt and give you the prompt back.
    • List the jobs again:
      jobs -l
      
      • You should now see two jobs: the sleep job (Running) and the editor job (Stopped). Note which one has the + (current job).
  3. Resume Stopped Job in Background (bg):

    • The editor job is currently stopped. Resume it in the background:
      bg %2 # Replace %2 if the editor job has a different number
      # Or, if it's the current job (+): bg
      
    • The shell should indicate the job is now running in the background (e.g., [2]+ vim temporary_file.txt &).
    • Check jobs again:
      jobs -l
      
      • Both jobs (sleep and vim) should now be listed as Running.
  4. Bring Job to Foreground (fg):

    • Bring the backgrounded editor job back to the foreground:
      fg %2 # Use the correct job number for the editor
      # Or, if it became the current job (+): fg
      
    • You are now back inside the editor.
    • Exit the editor cleanly (e.g., in vim, type :q! and press Enter; in nano, press Ctrl+X).
    • Check jobs again. The editor job should be gone.
  5. Bring Another Job to Foreground and Terminate:

    • The sleep 600 job (likely job %1) is still running in the background.
    • Bring it to the foreground:
      fg %1
      
    • The terminal will now seem to hang (it's sleeping).
    • Terminate the foreground process: Press Ctrl+C (sends SIGINT).
    • You should get your shell prompt back. Check jobs – it should be empty.
  6. Using nohup:

    • Let's simulate a long-running script that should continue even if we close the terminal.
    • Run sleep using nohup:
      nohup sleep 300 &
      
    • The shell will print the PID and a message like nohup: ignoring input and appending output to 'nohup.out'.
    • Check for the nohup.out file: ls nohup.out. It should exist (it will likely be empty as sleep produces no output).
    • Check jobs. The nohup sleep 300 command is listed as a background job.
    • Simulate Logout (Conceptual): If you were to close this terminal window now, the sleep 300 process started with nohup would not receive SIGHUP and would continue running until its 300 seconds are up. (We won't actually close the terminal here).
    • Clean up: Since it's still associated with your shell as a job, you can kill it using its job specifier or PID.
      jobs -l # Find the PID or job number
      kill %<job_number> # Or kill <PID>
      rm nohup.out # Remove the output file
      

Summary: In this workshop, you practiced the core shell job control features: starting jobs in the background (&), listing jobs (jobs), suspending foreground jobs (Ctrl+Z), resuming jobs in the background (bg), and bringing jobs to the foreground (fg). You also learned how to use nohup to make a command immune to hangups, allowing it to potentially run after terminal closure. This gives you powerful tools for managing multiple tasks within a single shell session.


7. Inter-Process Communication (IPC) Overview

Processes often need to coordinate their actions or exchange data with each other. Inter-Process Communication (IPC) refers to the mechanisms provided by the operating system that allow processes to communicate and synchronize. Linux offers a rich set of IPC mechanisms, ranging from simple to complex, each suited for different needs.

Why IPC?

Processes run in their own protected virtual address spaces. One process cannot directly access the memory of another process for security and stability reasons. IPC mechanisms provide controlled ways to bridge this gap. Common reasons for using IPC include:

  • Data Exchange: Sending data from one process to another (e.g., output of one command feeding into another).
  • Notification: Informing another process that an event has occurred.
  • Synchronization: Coordinating access to shared resources to prevent race conditions or ensure tasks happen in the correct order.
  • Resource Sharing: Allowing multiple processes to access a shared resource (like shared memory).
  • Client-Server Communication: Enabling request-response interactions (e.g., a web server communicating with worker processes).

Common Linux IPC Mechanisms (Overview)

Here's a brief overview of some key IPC methods available in Linux. Note that each of these could be a topic for a much deeper dive, especially concerning their programming interfaces.

  1. Pipes (Unnamed Pipes)

    • Concept: A unidirectional channel for communication between related processes (typically parent and child). Data written to one end of the pipe by one process can be read from the other end by the other process.
    • Characteristics:
      • Unidirectional (one-way data flow).
      • Used between related processes (created via pipe() system call before fork()).
      • Kernel buffers data; acts like a producer-consumer queue.
      • Implicitly used by shells for pipelines (e.g., ls -l | grep .txt).
    • Use Cases: Simple data streaming between parent/child or sibling processes. Shell pipelines.
  2. FIFOs (Named Pipes)

    • Concept: Similar to pipes but exist as special files within the filesystem hierarchy. This allows unrelated processes to communicate through them by opening the FIFO file.
    • Characteristics:
      • Unidirectional (though a pair of FIFOs can be combined for bidirectional communication).
      • Accessed via a filesystem path (mkfifo command or mkfifo() system call).
      • Allows communication between any two processes that have permission to access the FIFO file.
      • Data is still buffered by the kernel.
    • Use Cases: Communication between unrelated processes on the same machine, simple client-server setups where processes know the FIFO path.
  3. Signals

    • Concept: As discussed previously (Section 4), signals are asynchronous notifications sent to processes.
    • Characteristics:
      • Primarily for notification, not bulk data transfer (though some limited data can be sent with real-time signals).
      • Can be sent between any two processes with appropriate permissions (kill() system call).
      • Limited set of predefined signals, two of which (SIGUSR1, SIGUSR2) are reserved for application-defined use.
    • Use Cases: Notifying processes of events (termination request, child exit, timer expiry, configuration reload), basic synchronization.
  4. System V IPC (Older, but still widely used)

    • A suite of IPC mechanisms originating from UNIX System V:
      • Message Queues (msgget, msgsnd, msgrcv, msgctl)
        • Allows processes to exchange formatted messages via a kernel-managed queue.
        • Processes identify queues by a unique key or ID.
        • Messages can have types, allowing selective retrieval.
        • Persistent (queues exist until explicitly removed or system reboot).
      • Semaphores (semget, semop, semctl)
        • Used for synchronization. A semaphore is essentially a counter used to control access to shared resources.
        • Processes can perform atomic operations (like decrementing - wait/P, or incrementing - signal/V) on semaphore values.
        • Used to implement mutexes (mutual exclusion locks), enforce resource limits, etc.
        • Persistent.
      • Shared Memory (shmget, shmat, shmdt, shmctl)
        • The fastest form of IPC for bulk data transfer.
        • Allows multiple processes to map the same region of physical memory into their virtual address spaces.
        • Processes can read and write to this shared region directly, without kernel mediation for each access.
        • Requires external synchronization (like semaphores) to coordinate access and prevent corruption.
        • Persistent.
    • Administration: Commands like ipcs (list IPC objects) and ipcrm (remove IPC objects) are used to manage System V IPC resources.
  5. POSIX IPC (More modern, generally preferred over System V for new development)

    • An alternative, standardized set of IPC APIs:
      • POSIX Message Queues (mq_open, mq_send, mq_receive, mq_close, mq_unlink)
        • Similar in concept to System V message queues but with a different API based on file descriptors.
        • Often considered easier to use.
        • Messages can have priorities.
      • POSIX Semaphores (sem_open, sem_wait, sem_post, sem_close, sem_unlink)
        • Can be named (like FIFOs, accessible via /name) or unnamed (memory-based, for related processes/threads).
        • Simpler API than System V semaphores for basic locking.
      • POSIX Shared Memory (shm_open, mmap, munmap, shm_unlink, ftruncate)
        • Uses file descriptors to refer to shared memory objects, often mapped into the virtual filesystem (/dev/shm).
        • Uses standard mmap() call to map the object into the process's address space.
        • Considered more flexible and integrated with filesystem concepts than System V shared memory. Still requires external synchronization.
  6. Sockets (Network and Unix Domain)

    • Concept: Provide a general-purpose endpoint for communication. Most commonly associated with network communication (using TCP/IP or UDP), but also usable for IPC on a single machine.
    • Characteristics:
      • Network Sockets (AF_INET, AF_INET6): Allow communication between processes on different machines across a network, or on the same machine via the loopback interface (127.0.0.1). Uses standard networking protocols.
      • Unix Domain Sockets (AF_UNIX / AF_LOCAL): Allow communication between processes on the same machine using a filesystem path as the socket address. More efficient than network sockets for local IPC as it avoids network stack overhead. Supports stream (like TCP) and datagram (like UDP) communication. File system permissions control access.
    • Use Cases: Network client-server applications, local client-server applications (often via Unix domain sockets for performance), flexible bidirectional communication.
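Of the mechanisms above, FIFOs are the easiest to demonstrate directly from the shell. In this sketch (using a made-up path under /tmp), a background cat plays the "server" reading from the FIFO, and an echo plays the "client" writing to it; the two processes are connected only through the FIFO file:

```shell
fifo=/tmp/demo_fifo_$$            # hypothetical FIFO path, unique per shell PID
mkfifo "$fifo"                    # create the named pipe in the filesystem

cat "$fifo" &                     # "server": blocks until a writer opens the FIFO
echo "hello via FIFO" > "$fifo"   # "client": writes one line into it

wait                              # let the reader print the line and exit
rm "$fifo"                        # a FIFO persists until removed, like any file
```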

Choosing the right IPC mechanism depends heavily on the specific requirements: are the processes related? Do they run on the same machine? Is performance critical? Is simple notification enough, or is bulk data transfer needed? Do you need synchronization?
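As a minimal taste of the notification-style IPC described above, a shell script can install a handler for SIGUSR1 with trap and then have the signal delivered. Here the script signals itself via $$ for simplicity; another process would use kill -USR1 <pid> with the target's PID:

```shell
#!/bin/bash
trap 'echo "got SIGUSR1"' USR1   # register a handler for SIGUSR1

kill -USR1 $$                    # deliver the signal to this very shell
echo "done"                      # by now the trap has already printed its message
```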

Workshop Using a Simple Pipe (|)

The most common and visible form of IPC for shell users is the pipe (|), used to connect the standard output of one command to the standard input of another. This workshop demonstrates this fundamental IPC mechanism.

Objective: Understand how the shell pipe (|) facilitates communication between two unrelated processes by connecting stdout to stdin.

Tools: Shell (bash, zsh, etc.), ls, grep, wc.

Steps:

  1. Command 1: Generate Output (ls)

    • Run the ls command to list files in a directory, for example, your home directory or /etc. Let's use /etc as it usually has many files.
      ls -l /etc
      
    • Observe the output: A multi-line listing of files and directories with details. This output is being written to the standard output stream (stdout) of the ls process, which by default is connected to your terminal.
  2. Command 2: Filter Input (grep)

    • Run the grep command to search for lines containing a specific pattern. Let's search for files related to ssh:
      grep ssh
      
    • Now, grep is waiting for input on its standard input stream (stdin). Type some lines, including one with "ssh", and press Ctrl+D to signal end-of-input.
      # Type these lines:
      hello world
      this has ssh config
      goodbye
      # Press Ctrl+D
      
    • grep will print the line containing "ssh". This shows grep reads from stdin.
  3. Connecting Commands with a Pipe (|)

    • Now, let's use the pipe (|) operator to connect the stdout of ls -l /etc directly to the stdin of grep ssh.
      ls -l /etc | grep ssh
      
    • Explanation:
      • The shell sees the |.
      • It starts the ls -l /etc process.
      • It starts the grep ssh process.
      • Crucially, instead of connecting ls's stdout to the terminal and grep's stdin to the keyboard, the shell sets up an unnamed pipe (an IPC mechanism managed by the kernel).
      • It redirects ls's stdout to write into the pipe.
      • It redirects grep's stdin to read from the pipe.
      • ls runs, writing its output line by line into the pipe.
      • grep runs concurrently, reading lines from the pipe as they become available. If a line contains "ssh", grep writes that line to its stdout (which is connected to your terminal).
      • When ls finishes and closes its end of the pipe, grep detects end-of-input and terminates.
    • Output: You will only see the lines from the /etc directory listing that contain the pattern "ssh".
  4. Chaining Multiple Pipes:

    • Pipes can be chained together. Let's count how many configuration files (ending in .conf) are in /etc:
      ls /etc | grep '\.conf$' | wc -l
      
    • Explanation:
      • ls /etc: Lists files/directories in /etc, one per line (without -l, ls prints one name per line when its stdout is a pipe rather than a terminal, instead of its usual multi-column layout). Output goes to Pipe 1.
      • grep '\.conf$': Reads from Pipe 1. Filters for lines ending ($) with .conf (the dot is escaped with a backslash so it matches a literal '.' rather than any character). Output goes to Pipe 2.
      • wc -l: Reads from Pipe 2. Counts the number of lines (-l). Output goes to wc's stdout (the terminal).
    • Output: A single number representing the count of files ending in .conf.
    • IPC: This involves three processes (ls, grep, wc) communicating sequentially via two kernel-managed pipes.
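Pipes carry information in the other direction too: when the reader exits, the kernel sends SIGPIPE to a writer that keeps writing. A classic sketch is yes, which writes lines forever, piped into head, which exits after a few lines; SIGPIPE then terminates yes, so the pipeline finishes immediately instead of running forever:

```shell
yes "data" | head -n 3   # prints "data" three times, then both processes exit
```

Without this feedback mechanism, every pipeline ending in head or a similar early-exiting reader would leave its upstream processes running indefinitely.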

Summary: This workshop illustrated the use of unnamed pipes (|) as a fundamental IPC mechanism in the shell. You saw how the pipe connects the standard output of one process to the standard input of another, allowing data to flow between them without intermediate files. This powerful concept enables the creation of complex command pipelines by composing simple, single-purpose tools. While simple, it clearly demonstrates the core idea of IPC: enabling separate processes to exchange data.


Conclusion

Process management is a cornerstone of the Linux operating system. We've journeyed from the basic definition of a process – a program in execution with its associated resources and context – through its lifecycle and various states (Running, Sleeping, Stopped, Zombie). You learned how to use essential tools like ps, top, and htop to view, monitor, and analyze processes, examining their PIDs, PPIDs, states, and resource consumption.

We delved into the fundamental Linux mechanism for creating new tasks: the elegant fork() and exec() system call combination, understanding how parent-child relationships are formed and how new programs are loaded. We explored the critical role of signals for communication and control, learning how to terminate processes politely (SIGTERM) or forcefully (SIGKILL) using commands like kill and pkill, and how to suspend and resume them (SIGSTOP, SIGCONT).

Furthermore, we touched upon process scheduling, the kernel's mechanism for fairly distributing CPU time, and how user-space tools like nice and renice allow influencing process priorities. We practiced managing foreground and background jobs using shell job control (&, jobs, fg, bg, Ctrl+Z) and learned how nohup helps processes survive terminal closure.

Finally, we briefly surveyed the landscape of Inter-Process Communication (IPC), recognizing the need for processes to communicate and synchronize, and identifying mechanisms like pipes, FIFOs, signals, System V IPC, POSIX IPC, and sockets as the tools Linux provides to bridge the isolation between processes. The hands-on workshop using shell pipes provided a tangible example of IPC in action.

Mastering process management is essential for anyone serious about using Linux effectively. It empowers you to diagnose performance issues, manage system resources efficiently, control application execution, understand system behavior, and build more complex, coordinated applications. The concepts and tools covered here provide a solid foundation for further exploration into system administration, performance tuning, and systems programming on Linux.