| Author | Contact | PayPal Me |
|---|---|---|
| Nejat Hakan | nejat.hakan@outlook.de | https://paypal.me/nejathakan |
Python
This section provides a comprehensive guide to learning and using Python specifically within the Linux environment. We will cover everything from the absolute basics to more advanced topics, equipping you with the skills to leverage Python for scripting, automation, web development, data analysis, and more on your Linux system. Each theoretical sub-section is paired with a practical workshop to solidify your understanding through hands-on experience.
Introduction Getting Started with Python on Linux
Why Python?
Python stands out as one of the most popular and versatile programming languages in the world today, and it holds a particularly special place within the Linux ecosystem. Its popularity stems from several key factors:
- Readability and Simplicity: Python's syntax is designed to be clean, readable, and intuitive, resembling plain English in many aspects. This lowers the barrier to entry for beginners and makes code maintenance easier for experienced developers. The emphasis on indentation, rather than braces or keywords, enforces a consistent coding style. Code written by different developers often looks remarkably similar, aiding collaboration.
- Vast Standard Library: Python comes with a "batteries-included" philosophy, meaning its standard library provides modules and functions for a wide array of common tasks. Need to work with text (strings, regular expressions)? Interact with the operating system (files, processes)? Handle networking (sockets, HTTP)? Parse data formats (JSON, CSV, XML)? Manage dates and times? The standard library likely has modules to help, reducing the reliance on external packages for core functionality.
- Extensive Ecosystem (PyPI): Beyond the standard library, the Python Package Index (PyPI) hosts hundreds of thousands of third-party packages created by the community. This vast ecosystem is arguably Python's greatest strength. Whether you need to build web applications (Django, Flask, FastAPI), perform complex numerical computations (NumPy, SciPy), manipulate and analyze data (Pandas), implement machine learning algorithms (Scikit-learn, TensorFlow, PyTorch), create graphical interfaces (PyQt, Kivy), automate browser tasks (Selenium, Playwright), or interact with specific hardware or cloud APIs, chances are there's a well-supported Python library available. This dramatically accelerates development time.
- Cross-Platform Compatibility: Python is designed to be portable. Code written on Linux typically runs with little or no modification on macOS and Windows, and vice-versa (assuming platform-specific features aren't used or are handled appropriately). This makes it an excellent choice for developing applications targeting multiple operating systems.
- Strong Community Support: Python boasts one of the largest, most active, and welcoming developer communities globally. This translates into an abundance of learning resources (official documentation, tutorials, books, courses), active forums (Stack Overflow, Reddit), mailing lists, and conferences. Getting help when you encounter problems is usually straightforward.
- Integration with Linux: Python is deeply interwoven with most Linux distributions. It's frequently used for system administration scripts, configuration management tools (like Ansible's core), build systems, and even components of desktop environments. Many core Linux utilities and libraries offer Python bindings (APIs), making it a natural choice for automating system tasks, managing files, controlling processes, and interacting directly with the operating system's capabilities.
Python 2 vs. Python 3
Understanding the distinction between Python 2 and Python 3 is crucial, although primarily historical now. Python 3 was released in 2008 as a successor to Python 2, introducing significant improvements and changes, some of which were intentionally backward-incompatible.
- End of Life (EOL): Python 2 reached its official end-of-life on January 1, 2020. It no longer receives updates, including critical security patches. Using Python 2 is strongly discouraged and poses security risks.
- Key Differences:
    - `print`: Python 2 used the `print` statement (`print "Hello"`), while Python 3 uses the `print()` function (`print("Hello")`).
    - Integer Division: Python 2's `/` operator performed floor division for integers (`3 / 2` resulted in `1`), while Python 3's `/` performs true division (`3 / 2` results in `1.5`). Python 3 uses `//` for explicit floor division (`3 // 2` results in `1`).
    - Unicode: Python 3 treats text strings as Unicode by default (the `str` type), simplifying international text handling. Python 2 had separate `str` (bytes) and `unicode` types, often requiring explicit encoding/decoding (`u'...'`).
    - `range` vs `xrange`: Python 3's `range()` behaves like Python 2's `xrange()`, generating numbers lazily (memory efficient). Python 2's `range()` created a full list in memory.
    - Error Handling: Exception syntax changed slightly (`except Exception, e:` in Py2 vs. `except Exception as e:` in Py3).
    - Standard Library Reorganization: Some modules were renamed or moved (e.g., `urllib2` was integrated into `urllib` in Py3).
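For a concrete feel, here is a short Python 3 snippet touching several of the behaviours listed above (the Python 2 equivalents are noted in comments):

```python
print("Hello")            # print is a function; Python 2 allowed: print "Hello"

print(3 / 2)              # 1.5 -> true division in Python 3 (Python 2 gave 1)
print(3 // 2)             # 1   -> explicit floor division

text = "Grüße"            # str is Unicode by default; no u'...' prefix needed
print(len(text))          # 5 characters, regardless of encoding

for i in range(3):        # range() is lazy, like Python 2's xrange()
    print(i)

try:
    int("not a number")
except ValueError as e:   # Python 2 also accepted: except ValueError, e:
    print(f"Conversion failed: {e}")
```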
Recommendation: Always use Python 3 for new projects. All modern Linux distributions ship with Python 3 as the default or primary Python interpreter. The vast majority of libraries are Python 3 compatible, and many new libraries are Python 3 only. If you encounter legacy Python 2 code, prioritize migrating it to Python 3.
Checking Your Python Installation on Linux
Most modern Linux distributions come with Python 3 pre-installed. You can easily verify this and check the version by opening your terminal (e.g., GNOME Terminal, Konsole, xterm).
-   Check for Python 3: Run `python3 --version`. If installed, this command will output the version number (e.g., `Python 3.10.6`). This is the most reliable command to invoke Python 3.
-   Check the generic `python` command: Run `python --version`. On many modern distributions, `python` is now a symbolic link to `python3`. On older systems, or systems configured differently, it might link to Python 2 or might not exist at all. If it links to Python 2, avoid using `python` for new development.
Interpreting the Output:
- If `python3 --version` works, you have Python 3 installed. Note the version number (e.g., 3.x.y).
- If `python3 --version` fails (e.g., "command not found"), you need to install Python 3.
- Check the output of `python --version` to see if it points to Python 3 or Python 2, or if it's missing.
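If you want to see exactly which binary each command resolves to, the standard `which` and `readlink` utilities can help (paths below are just typical examples):

```bash
# Show the path of the python3 executable
which python3                    # e.g. /usr/bin/python3

# If a generic 'python' exists, see what it ultimately points to
readlink -f "$(which python)"    # e.g. /usr/bin/python3.10
```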
Installing Python 3 (if necessary):
Use your distribution's package manager. Open a terminal and run the appropriate command (you might need `sudo`):
- Debian/Ubuntu/Mint: `sudo apt update && sudo apt install python3`
- Fedora: `sudo dnf install python3`
- CentOS/RHEL (versions 7/8+): `sudo yum install python3` (or `sudo dnf install python3` on newer releases)
- Arch Linux: `sudo pacman -S python` (Arch typically keeps `python` linked to the latest Python 3 version).
You might also want to install `pip` (the package installer) and `venv` (for virtual environments) if they weren't included:
- Debian/Ubuntu: `sudo apt install python3-pip python3-venv`
- Fedora: `sudo dnf install python3-pip` (`venv` is usually included with python3)
- CentOS/RHEL: `sudo yum install python3-pip`
- Arch: `pip` and `venv` are typically included with the `python` package.
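A quick way to confirm both tools are usable with your interpreter is to run them as modules, which avoids any ambiguity about which installation is being used:

```bash
# Both commands should print a version string / usage text rather than an error
python3 -m pip --version
python3 -m venv --help
```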
Using the Python Interactive Interpreter (REPL)
The Python interpreter offers an interactive mode known as a Read-Eval-Print Loop (REPL). This is an invaluable tool for:
- Experimenting: Trying out small code snippets quickly.
- Learning: Testing syntax and exploring features without creating files.
- Debugging: Inspecting values and testing function calls.
- Quick Calculations: Using Python as a powerful calculator.
To start the REPL, open your terminal and type `python3`. You'll see a welcome message with the Python version and then the primary prompt `>>>`:
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Now you can type Python code directly at the prompt and press Enter. The interpreter will:
- Read your input.
- Evaluate the code (execute it).
- Print the result of the expression (if any).
- Loop back to the `>>>` prompt.
Examples in the REPL:
>>> print("Hello from the Python REPL!") # Using the print function
Hello from the Python REPL!
>>> 15 + 3 * 4 # Python follows order of operations (PEMDAS/BODMAS)
27
>>> (15 + 3) * 4 # Parentheses change the order
72
>>> my_distro = "Arch Linux" # Variable assignment
>>> my_distro # Typing a variable name prints its value
'Arch Linux'
>>> print(f"My favorite distribution is {my_distro}") # Using f-strings for formatting
My favorite distribution is Arch Linux
>>> len(my_distro) # Using the built-in len() function
10
>>> import os # Importing a module from the standard library
>>> os.getcwd() # Calling a function from the imported module
'/home/your_user' # Output will vary
>>> help(os.getcwd) # Get help on a specific function (press 'q' to exit help)
# ... help text displayed ...
>>> exit() # Command to leave the REPL
You can also leave the REPL by pressing `Ctrl+D`.
The REPL is your playground for learning and testing Python directly within your Linux terminal.
Workshop Introduction Exploring Python Interactively
This workshop guides you through verifying your Python installation and using the interactive REPL to perform basic operations.
Goal: Confirm Python 3 installation and become comfortable using the REPL for simple commands, calculations, variable assignment, and module exploration.
Steps:
-   Open Your Linux Terminal: Launch your preferred terminal application (e.g., Terminal, Konsole, xterm).

-   Check Python 3 Version:
    - Type the command: `python3 --version`
    - Press Enter.
    - Observe: Note the version number displayed (e.g., `Python 3.x.y`). If you get an error like "command not found", refer back to the installation instructions for your specific Linux distribution in the previous section and install Python 3.

-   Start the Python REPL:
    - Type the command: `python3`
    - Press Enter.
    - Observe: You should see the Python welcome message (including the version) and the `>>>` prompt, indicating you are now in the interactive interpreter.
-   Perform Basic Arithmetic:
    - At the `>>>` prompt, type: `100 / 5`
    - Press Enter. Observe the result (`20.0` - note it's a float).
    - Type: `100 // 5` (Floor division)
    - Press Enter. Observe the result (`20` - an integer).
    - Type: `2 ** 10` (Exponentiation: 2 to the power of 10)
    - Press Enter. Observe the result (`1024`).
    - Type: `(5 + 3) * 10 / 4`
    - Press Enter. Observe the result (`20.0`). Python respects parentheses and operator precedence.
-   Work with Strings:
    - Type: `"Hello" + " " + "Linux!"` (String concatenation)
    - Press Enter. Observe the combined string `Hello Linux!`.
    - Type: `'=' * 40` (String repetition)
    - Press Enter. Observe the string of 40 equals signs.
    - Type: `message = "Python on Linux is powerful"` (Assign a string to a variable)
    - Press Enter (no output for assignment).
    - Type: `message`
    - Press Enter. Observe the value of the `message` variable.
    - Type: `len(message)`
    - Press Enter. Observe the length of the string (`27`).
    - Type: `message.upper()` (Call a string method)
    - Press Enter. Observe the uppercase version (`PYTHON ON LINUX IS POWERFUL`).
    - Type: `message`
    - Press Enter. Observe that the original `message` variable is unchanged (strings are immutable).
-   Assign Variables of Different Types:
    - Type: `year = 2024` (Integer)
    - Type: `pi_approx = 3.14159` (Float)
    - Type: `is_learning = True` (Boolean - note capitalization)
    - Type: `result = None` (NoneType)
    - Press Enter after each assignment.
    - Verify types using the `type()` function:
        - Type: `type(year)` and press Enter. Observe `<class 'int'>`.
        - Type: `type(pi_approx)` and press Enter. Observe `<class 'float'>`.
        - Type: `type(is_learning)` and press Enter. Observe `<class 'bool'>`.
        - Type: `type(result)` and press Enter. Observe `<class 'NoneType'>`.
-   Import a Module and Use It:
    - Type: `import platform` (Imports the `platform` module for system info)
    - Press Enter.
    - Type: `platform.system()` (Call the `system()` function from the module)
    - Press Enter. Observe the output, which should be `'Linux'`.
    - Type: `platform.machine()`
    - Press Enter. Observe your system's architecture (e.g., `'x86_64'`).
    - Type: `platform.python_version()`
    - Press Enter. Observe the detailed Python version string.
-   Get Help:
    - Type: `help(platform)` (Get help on the entire module)
    - Press Enter. Skim the help text. Press `q` to exit the help pager and return to the `>>>` prompt.
    - Type: `help(len)` (Get help on a built-in function)
    - Press Enter. Read the description. Press `q` to exit.
-   Exit the REPL:
    - Type: `exit()`
    - Press Enter.
    - Alternatively, press `Ctrl+D`.
    - Observe: You are returned to your regular Linux shell prompt (e.g., `$` or `#`).
Conclusion: You have successfully verified your Python 3 installation and gained hands-on experience using the interactive REPL. You practiced performing calculations, manipulating strings, assigning variables, checking data types, importing modules, using module functions, and accessing built-in help. The REPL is a fundamental tool you'll use frequently as you learn and work with Python.
1. Setting Up the Python Environment
While the system's Python installation and the REPL are useful for simple tasks, real-world Python development requires managing dependencies and isolating project environments. Installing packages directly into the system's Python directories is strongly discouraged as it can lead to conflicts and break system tools. This section covers the standard tools and best practices: `pip` for package management and `venv` for creating isolated virtual environments.
The Python Package Installer pip
`pip` is the standard package manager for Python. It allows you to find, install, upgrade, and remove Python packages from the Python Package Index (PyPI) – a vast public repository of Python software – and other sources.
Checking `pip` Installation:
Most Python 3 installations include `pip`. You can check if the `pip` corresponding to your `python3` installation is available:
# Check pip linked to python3
pip3 --version
# Or sometimes just pip if python links to python3
pip --version
This command should output the `pip` version and the Python version it manages (e.g., `pip 23.0.1 from /usr/lib/python3.10/site-packages/pip (python 3.10)`). The path `/usr/lib/...` indicates it's likely the system-installed `pip`. If you are inside an activated virtual environment (covered next), the path will point inside the environment's directory.
Installing `pip` (if necessary):
If `pip3` (or `pip`) is not found, install it using your distribution's package manager. It's often in a package named `python3-pip`.
- Debian/Ubuntu: `sudo apt update && sudo apt install python3-pip`
- Fedora: `sudo dnf install python3-pip`
- CentOS/RHEL: `sudo yum install python3-pip`
- Arch Linux: Included with the `python` package.
Basic `pip` Usage:
(Note: When inside an activated virtual environment, you can usually just type `pip` instead of `pip3`. Typical invocations for each operation are shown in the summary just below this list.)
- Search for a package on PyPI.
- Install a package: installs the latest version from PyPI.
- Install a specific version.
- Install minimum/maximum versions.
- Upgrade a package: upgrades to the latest version allowed by constraints.
- List installed packages: shows packages installed in the current environment (system or virtualenv).
- Show package details: displays information about an installed package (version, dependencies, location).
- Check for outdated packages.
- Uninstall a package: removes the package.
- Install packages from a requirements file (very common). (More on `requirements.txt` later.)
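A minimal summary of the corresponding commands, using the third-party package `requests` purely as an example name (run inside an activated virtual environment, where plain `pip` is fine):

```bash
pip install requests                 # install the latest version from PyPI
pip install requests==2.28.1         # install a specific version
pip install "requests>=2.0,<3.0"     # constrain to a version range
pip install --upgrade requests       # upgrade within allowed constraints
pip list                             # list installed packages
pip show requests                    # details: version, dependencies, location
pip list --outdated                  # check for outdated packages
pip uninstall requests               # remove the package
pip install -r requirements.txt      # install everything from a requirements file
```

For searching, browsing https://pypi.org in a web browser is the practical route these days; the old `pip search` command no longer works against PyPI's API.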
System-wide vs. User vs. Virtual Environment Installation:
- System Installation (`sudo pip install ...`): Installs packages globally into the system's Python site-packages directory (e.g., `/usr/lib/pythonX.Y/site-packages/` or `/usr/local/lib/...`). This is strongly discouraged. It can interfere with packages managed by the Linux distribution's package manager (`apt`, `dnf`, etc.), potentially breaking system tools or creating version conflicts that are hard to resolve ("dependency hell"). Avoid `sudo pip install` unless you have a very specific reason and understand the implications.
- User Installation (`pip install --user ...`): Installs packages into a user-specific directory (e.g., `~/.local/lib/pythonX.Y/site-packages`). This avoids modifying system files but doesn't provide project isolation. Different projects might still require conflicting versions of the same package installed in the user site. Better than system-wide, but not ideal for project development.
- Virtual Environment Installation (`pip install ...` inside an activated venv): Installs packages into a dedicated directory for a specific project. This is the recommended approach for almost all Python development. It keeps dependencies isolated, prevents conflicts, and ensures projects have reproducible environments.
Virtual Environments venv
A virtual environment is a self-contained directory structure that includes a specific version of the Python interpreter and its own independent set of installed Python packages in its `site-packages` directory.
Why Use Virtual Environments?
- Dependency Isolation: The primary reason. Project A might need `Django==3.2`, while Project B requires `Django==4.1`. Without virtual environments, installing one would break the other. With virtual environments, each project has its own `Django` installation within its `.venv` directory, completely separate from the system or other projects.
- Clean System Environment: Prevents cluttering your global or user Python `site-packages` with project-specific dependencies. This keeps your base Python installation clean and less prone to conflicts.
- Reproducibility: You can easily list the exact packages and versions used by a project (via `pip freeze > requirements.txt`) within its virtual environment. This allows anyone else (or yourself on a different machine) to recreate the exact same environment using `pip install -r requirements.txt`, ensuring the code runs reliably.
- Permissions: Avoids the need for `sudo` when installing packages, as installations happen within a project directory owned by your user.
- Version Management: Allows you to easily test your project with different versions of Python (by creating environments pointing to different base Python installations, although `venv` primarily uses the Python it was created with) or different library versions without affecting other projects.
Using the `venv` Module:
Python 3.3+ includes the built-in `venv` module for creating virtual environments. It's the standard, recommended tool.
Steps to Create and Use a Virtual Environment:
-   Navigate to your project directory: Create a directory for your project first. It's common practice to create the virtual environment inside the project directory.

-   Create the virtual environment: Use the `python3 -m venv` command, followed by the name you want to give the environment directory (e.g., `python3 -m venv .venv`). Common names are `.venv` (the leading dot hides it in `ls` by default), `venv`, or `env`. Using `.venv` is increasingly standard. This command does the following:
    - Creates the `.venv` directory.
    - Creates subdirectories like `.venv/bin`, `.venv/lib`, `.venv/include`.
    - Copies or symlinks the Python interpreter (`python3`) into `.venv/bin/`.
    - Installs basic tools like `pip` and `setuptools` into the new environment's `site-packages` (`.venv/lib/pythonX.Y/site-packages/`).

-   Activate the virtual environment: Before you can use the environment, you need to activate it. Activation scripts modify your current shell session's environment variables (primarily `PATH`) so that commands like `python` and `pip` point to the versions inside the `.venv` directory instead of the system-wide ones. The activation command depends on your shell:
    - Bash/Zsh (most common on Linux): `source .venv/bin/activate`
    - Fish: `source .venv/bin/activate.fish`
    - Csh/Tcsh: `source .venv/bin/activate.csh`

    Observation: After successful activation, your shell prompt will usually be prefixed with the name of the environment directory, like `(.venv) user@hostname:~/my_cool_project$`. This indicates the virtual environment is active.

-   Install packages: Now that the environment is active, use `pip` (you can usually just type `pip`) to install packages. They will be installed only within this environment's `site-packages` directory.

-   Work on your project: Create your Python scripts (`.py` files) within the `my_cool_project` directory. When you run them using `python your_script.py` (while the environment is active), they will use the Python interpreter and packages from the `.venv`.

-   Deactivate the virtual environment: When you're finished working on the project in this shell session, run `deactivate` to revert `python`, `pip`, etc., back to the system defaults. The `(.venv)` prefix will disappear from your prompt.
Best Practices:
- Create a new virtual environment for each Python project.
- Activate the environment before installing project dependencies.
- Add the virtual environment directory name (e.g., `.venv/`) to your project's `.gitignore` file if using Git. The environment contains installed packages and can be large; it should be recreated from `requirements.txt`, not committed to version control.
- Use `pip freeze > requirements.txt` to record your project's dependencies.
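Putting the pieces together, a typical session for the `my_cool_project` directory used above might look like this (a sketch of the recommended workflow, with `requests` standing in for whatever your project actually needs):

```bash
mkdir my_cool_project && cd my_cool_project
python3 -m venv .venv                 # create the environment
source .venv/bin/activate             # activate it (Bash/Zsh)
pip install requests                  # install project dependencies
pip freeze > requirements.txt         # record exact versions for reproducibility
deactivate                            # leave the environment

# Later, on another machine or after deleting .venv:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt       # recreate the exact same environment
```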
Workshop Setting Up the Python Environment
This workshop guides you through creating a project directory, setting up a virtual environment using `venv`, activating it, installing an external package (`cowsay`) using `pip`, verifying the installation, creating a simple script that uses the package, and deactivating the environment.
Goal: Understand and practice the standard workflow for creating isolated Python project environments and managing dependencies with `pip` and `venv`.
Project: A simple command-line script that uses the `cowsay` package to make an ASCII cow deliver a message.
Steps:
-   Create a Project Directory:
    - Open your Linux terminal.
    - Choose a location for your projects (e.g., `~/projects`, `~/python_projects`).
    - Create a new directory specifically for this workshop project: `mkdir cowsay_project`
    - Navigate into the newly created directory: `cd cowsay_project`

-   Create the Virtual Environment:
    - Use the `venv` module to create a virtual environment. We'll name it `.venv` (the leading dot makes it hidden by default in simple `ls` output): `python3 -m venv .venv`
    - Verify (Optional): List the contents of the directory, including hidden files/directories, to see `.venv`: `ls -a`. You should see `.venv` listed. You can explore its structure (`ls .venv`, `ls .venv/bin`, `ls .venv/lib`) to get familiar, but it's not necessary.

-   Activate the Virtual Environment:
    - Activate the environment using the script appropriate for your shell (usually the Bash/Zsh one): `source .venv/bin/activate`
    - Observe: Your terminal prompt should now be prefixed with `(.venv)`, like `(.venv) your_user@hostname:~/cowsay_project$`. This confirms the environment is active. If it doesn't change, double-check the command and ensure you used `source`.

-   Check Python and Pip within the Environment:
    - Verify that the `python` and `pip` commands now point to the versions inside your virtual environment: `which python` and `which pip`
    - Observe: The output paths should start with `/home/your_user/cowsay_project/.venv/bin/`. This confirms that activating the environment correctly modified your `PATH`.
    - Check the versions again (optional, they should match the Python used to create the venv): `python --version` and `pip --version`

-   Install the `cowsay` Package:
    - With the virtual environment active, use `pip` to install the `cowsay` package from PyPI: `pip install cowsay`
    - Observe: `pip` will connect to PyPI, download the `cowsay` package (and any dependencies it might have, though `cowsay` is simple), and install them into your virtual environment's `site-packages` directory (`.venv/lib/pythonX.Y/site-packages/`). You'll see output indicating successful installation.

-   Verify Package Installation:
    - List the packages installed specifically within this active environment: `pip list`
    - Observe: You should see `cowsay` listed, along with standard packages like `pip` and `setuptools`.
    - Show details about the installed `cowsay` package: `pip show cowsay`
    - Observe: Note the version, installation location (should be inside `.venv`), and other metadata.

-   Create the Python Script (`moo.py`):
    - Use a text editor (like `nano`, `vim`, `gedit`, VS Code, etc.) to create a new file named `moo.py` within the `cowsay_project` directory.
    - Enter the following Python code into the file:

      # File: moo.py
      import cowsay
      import sys  # Import the sys module to access command-line arguments

      # Default message if no arguments are provided
      default_message = "Hello from Python in a venv!"

      # Check if any command-line arguments were given (sys.argv[0] is the script name)
      if len(sys.argv) > 1:
          # Join all arguments after the script name into a single message string
          message_to_say = ' '.join(sys.argv[1:])
      else:
          # Use the default message if no arguments were provided
          message_to_say = default_message

      # Use the cowsay module's 'cow' function to print the message
      # This function is provided by the package we installed
      print("\nThe cow says:")
      cowsay.cow(message_to_say)

      # Optional: Display the arguments received for clarity
      # print("\nArguments received by script:", sys.argv)

    - Save the file and exit the editor (e.g., in `nano`, press `Ctrl+X`, then `Y` to confirm saving, then Enter to confirm the filename).

-   Run the Python Script:
    - Make sure your virtual environment is still active (`(.venv)` prefix in prompt).
    - Execute the script using the `python` interpreter (which points to the one in `.venv`): `python moo.py`
    - Observe: Since no command-line arguments were provided, the script uses the `default_message`, and you should see an ASCII cow saying "Hello from Python in a venv!".
    - Run the script again, this time providing a custom message as command-line arguments after the script name: `python moo.py Linux environments are awesome!`
    - Observe: The ASCII cow should now be saying "Linux environments are awesome!". The script successfully imported and used the `cowsay` package installed in the virtual environment.

-   Deactivate the Virtual Environment:
    - When you're done working in this environment for now, deactivate it: `deactivate`
    - Observe: The `(.venv)` prefix disappears from your terminal prompt, indicating you are back to using the system's default Python environment.

-   Verify Isolation (Optional):
    - With the environment deactivated, try running the script again: `python3 moo.py`
    - Observe: You will likely get a `ModuleNotFoundError: No module named 'cowsay'`. This is expected! The `cowsay` package was installed only inside the `.venv` directory. Since the environment is no longer active, the system's Python interpreter cannot find the `cowsay` module. This clearly demonstrates the isolation provided by virtual environments.
Conclusion: You have successfully created an isolated Python project environment using `venv`, activated it, installed an external package (`cowsay`) using `pip` specifically into that environment, and executed a script that utilized the installed package. You also verified that the package is not available when the environment is deactivated. This complete workflow – create environment, activate, install dependencies, code, deactivate – is fundamental for managing Python projects effectively and avoiding dependency conflicts.
2. Python Syntax Fundamentals
Before diving into complex programs, mastering the basic building blocks of Python syntax is essential. This includes understanding how to store data in variables, recognizing Python's fundamental data types, using comments effectively, grasping the critical role of indentation for code structure, and performing basic input and output operations.
Variables and Assignment
Variables act as named containers for storing data values in your program. In Python, you don't need to declare the type of a variable beforehand (like in C++ or Java). The type is determined dynamically when you assign a value to it using the assignment operator (`=`).
- Assignment: The basic syntax is `variable_name = value`.
- Naming Rules:
    - Must start with a letter (a-z, A-Z) or an underscore (`_`).
    - Can contain letters, numbers (0-9), and underscores after the first character.
    - Are case-sensitive (`user_count` is different from `User_Count`).
    - Cannot be a Python keyword (reserved words like `if`, `else`, `for`, `while`, `class`, `def`, `import`, `True`, `False`, `None`, etc.). You can see the list by running `import keyword; print(keyword.kwlist)` in the REPL.
- Naming Conventions (PEP 8): The official Python style guide (PEP 8) recommends using `snake_case` for variable names (all lowercase words separated by underscores). This improves readability.
    - Good: `user_name`, `file_path`, `total_count`
    - Less Conventional: `userName`, `FilePath`, `TotalCount` (CamelCase is usually used for Class names)
    - Avoid: `__my_var__` (double underscores have special meaning), single letters like `l`, `O`, `I` (can be confused with numbers).
# Assigning an integer value
file_count = 5
# Assigning a string value (text)
system_hostname = "matrix-server"
# Assigning a floating-point number (decimal)
cpu_temperature = 45.7
# Assigning a boolean value (True or False)
is_service_running = True
# Variables can be reassigned, potentially changing their type (Dynamic Typing)
file_count = "approx. five" # file_count now holds a string, not an integer
print(file_count)
# Multiple assignment
x, y, z = 10, "hello", False
print(x) # Output: 10
print(y) # Output: hello
print(z) # Output: False
Basic Data Types
Python has several built-in data types. Here are the most fundamental ones you'll encounter constantly:
- Integer (`int`): Represents whole numbers (positive, negative, or zero) without a fractional component. Examples: `-10`, `0`, `42`, `1000000`. Python 3 integers have arbitrary precision, meaning they can grow as large as your system's memory allows.
- Float (`float`): Represents numbers with a decimal point or numbers expressed in exponential notation (scientific notation). Examples: `3.14159`, `-0.001`, `99.9`, `2.5e2` (which means 2.5 * 10^2 = 250.0). Floats are typically implemented as IEEE 754 double-precision numbers, which means they can sometimes have small precision limitations inherent in representing decimal fractions in binary.
- String (`str`): Represents sequences of Unicode characters, used for text. Strings are enclosed in either single quotes (`'...'`) or double quotes (`"..."`). They are immutable, meaning once a string object is created, its content cannot be changed in place (operations like concatenation create new string objects).
    - Triple quotes (`'''...'''` or `"""..."""`) can be used for multi-line strings or docstrings (documentation strings).
- Boolean (`bool`): Represents truth values. Can only hold one of two possible values: `True` or `False` (note the capitalization). Booleans are crucial for conditional logic (`if` statements) and comparisons.
- NoneType (`None`): A special type that has only one value: `None`. It represents the absence of a value or a null value. It's often used to initialize variables before they are assigned a meaningful value, or as a return value from functions that don't explicitly return anything else.
You can check the type of any variable or value using the built-in `type()` function:
>>> type(system_hostname)
<class 'str'>
>>> type(port_number)
<class 'int'>
>>> type(load_average)
<class 'float'>
>>> type(is_admin)
<class 'bool'>
>>> type(database_connection)
<class 'NoneType'>
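To see the float precision caveat mentioned above in practice, try this in the REPL (a small illustration only):

```python
>>> 0.1 + 0.2            # binary floating point cannot represent 0.1 exactly
0.30000000000000004
>>> round(0.1 + 0.2, 2)  # rounding (or the decimal module) gives the expected value
0.3
```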
Comments
Comments are non-executable lines in your code used to explain what the code does, why it does it, or to leave notes for yourself or others. Python ignores comments.
- Single-line comments: Start with a hash symbol (`#`) and continue to the end of the physical line.
/* ... */
in C or Java. The convention is to use multiple single-line#
comments.While multi-line strings (# This function performs a complex calculation. # It takes several inputs and needs careful handling # of edge cases. def complex_calculation(a, b, c): # ... function code ... pass # 'pass' is a null statement, a placeholder
"""..."""
or'''...'''
) not assigned to a variable are sometimes used as block comments, they are technically string literals. Their primary use is for docstrings (documentation strings), which are the first statement inside a module, class, or function definition and have special meaning. Use#
for regular comments.
Indentation The Cornerstone of Python Structure
Unlike many languages that use curly braces `{}` or keywords like `begin`/`end` to define blocks of code (e.g., the body of an `if` statement, `for` loop, function, or class), Python uses indentation.
- Significance: The level of indentation (whitespace at the beginning of a line) is syntactically significant. It groups statements together into blocks. All statements within the same block must have the same level of indentation.
- Standard: The universally accepted standard (defined in PEP 8) is to use 4 spaces per indentation level. Do not mix tabs and spaces within the same file, as this can lead to confusing errors (`TabError`). Configure your text editor to insert 4 spaces when you press the Tab key.
- Blocks: A code block starts after a statement ending with a colon (`:`) – such as `if`, `elif`, `else`, `for`, `while`, `def`, `class` – and consists of all subsequent lines indented more than the line with the colon. The block ends when the indentation level returns to that of the line containing the colon (or less).
# Example demonstrating indentation
threshold = 80.0
current_usage = 85.5
if current_usage > threshold:
# This block is indented 4 spaces relative to the 'if' line
print("Warning: Usage exceeds threshold!")
print(f"Current usage: {current_usage}%") # Still part of the 'if' block
# Nested block example
if current_usage > 95.0:
# This block is indented 8 spaces (4 more than the outer 'if' block)
print("CRITICAL: Usage is extremely high!")
# End of the nested 'if' block
print("Taking precautionary measures...") # Back to the outer 'if' block's indentation level
# End of the outer 'if' block (indentation returns to zero)
print("Usage check complete.") # This line is outside the 'if' block and always executes
Incorrect indentation will cause errors:
# Incorrect: Missing indentation
if current_usage > threshold:
print("Warning!") # -> IndentationError: expected an indented block
# Incorrect: Inconsistent indentation within a block
if current_usage > threshold:
print("Warning!")
print("Usage:", current_usage) # -> IndentationError: unexpected indent
Consistent, correct indentation is non-negotiable in Python and fundamental to writing working code.
Basic Input and Output
Programs often need to interact with the user or display results.
-   Output (`print()`): The built-in `print()` function displays output to the standard output stream (usually your terminal).
    - Can print strings, variables, or the results of expressions.
    - Takes one or more arguments, separated by commas. By default, it prints them separated by spaces and adds a newline character at the end.
    - f-strings (Formatted String Literals): Introduced in Python 3.6, f-strings (prefixed with `f` or `F`) are the modern and highly recommended way to embed expressions inside string literals for formatting.
print("System check starting...")
user = "alice"
login_count = 5
print("User:", user, "Login count:", login_count)  # Multiple arguments

# Using f-string (preferred)
print(f"User '{user}' has logged in {login_count} times today.")
print(f"CPU Temperature: {cpu_temperature}°C")
print(f"Pi * 2 = {pi_value * 2}")

# Controlling print ending and separator
print("Part 1", end="...")     # Don't add newline, add '...' instead
print("Part 2")                # Adds newline by default
print("A", "B", "C", sep="|")  # Use '|' as separator instead of space
-   Input (`input()`): The built-in `input()` function reads a line of text from the standard input stream (usually the keyboard in the terminal).
    - It optionally takes a string argument, which is displayed as a prompt to the user.
    - It always returns the user's input as a string (`str`), even if they type numbers.
    - You often need to explicitly convert the returned string to the desired type (e.g., `int()`, `float()`) if you want to perform numerical operations.
# Prompt the user for their name
user_name = input("Please enter your name: ")
print(f"Hello, {user_name}!")

# Get age input (returns a string)
age_str = input(f"Hi {user_name}, how old are you? ")

# Attempt to convert the age string to an integer
try:
    age_int = int(age_str)
    print(f"Okay, you are {age_int} years old.")
    print(f"Next year you will be {age_int + 1}.")  # Now we can do math
except ValueError:
    # Handle the case where the user didn't enter a valid number
    print(f"'{age_str}' doesn't look like a valid age number. Please enter digits.")

# Get numerical input and convert to float
try:
    temp_str = input("Enter current temperature in Celsius: ")
    temp_float = float(temp_str)
    print(f"The temperature is {temp_float}°C.")
except ValueError:
    print("Invalid temperature format. Please enter a number (e.g., 21.5).")

Using `try...except` (covered later in Error Handling) is crucial when converting user input to prevent crashes if the user enters unexpected text.
Workshop Python Syntax Fundamentals
This workshop focuses on practicing variable assignment using different data types, understanding indentation within a simple `if`/`else` structure, and interacting with the user using `input()` and `print()`, including basic type conversion.
Goal: Write a script that takes user input (name, favorite Linux command, year started using Linux), stores it in variables, performs a simple conditional check based on the command, calculates approximate years of Linux usage, and prints formatted output.
Project: Simple Linux User Info Script
Steps:
-   Navigate to Your Project Area:
    - Open your terminal.
    - `cd` into a directory where you keep your Python projects (e.g., the one used in the previous workshop, or create a new one).
    - If you created a new directory, remember to set up and activate a virtual environment within it (`python3 -m venv .venv` then `source .venv/bin/activate`). If reusing a previous project directory, ensure its virtual environment is active.
Create the Python Script:
- Create a new file named
user_info.py
using your text editor (e.g.,nano user_info.py
).
- Create a new file named
-   Write the Code: Enter the following Python code into `user_info.py`. Pay close attention to the comments explaining each part and especially to the 4-space indentation for the `if`/`elif`/`else` and `try`/`except` blocks.

# File: user_info.py

# Import the datetime module from the standard library to get the current year
import datetime

# --- Get User Input (Always comes as strings) ---
print("--- Welcome to Linux User Info ---")

# Ask for the user's name
user_name = input("What is your name? ")

# Ask for their favorite command
# Use an f-string in the prompt itself
fav_command = input(f"Nice to meet you, {user_name}! What's your favorite Linux command? ")

# Ask for the year they started using Linux
start_year_str = input("Approximately what year did you start using Linux? ")

# --- Process Input and Perform Calculations ---

# Get the current year dynamically
current_year = datetime.datetime.now().year

# Initialize years_using_linux to None; it will be calculated if possible
years_using_linux = None
start_year_int = None  # Also initialize the integer version

# Attempt to convert the start_year_str to an integer
# Use a try-except block to handle cases where the input is not a number
try:
    start_year_int = int(start_year_str)  # Conversion attempt
    # If conversion succeeds, calculate the years
    years_using_linux = current_year - start_year_int
except ValueError:
    # This block executes if int(start_year_str) fails
    print(f"\nWarning: '{start_year_str}' doesn't seem like a valid year. Cannot calculate usage duration.")

# --- Display Output ---
print("\n--- Generating Report ---")

# Print a personalized greeting
print(f"Okay, {user_name}, here's your info:")

# Use an if/elif/else structure based on the favorite command entered
# Use .strip() to remove accidental spaces and .lower() for case-insensitive comparison
cleaned_command = fav_command.strip().lower()

if cleaned_command == "ls":
    # Indented block for 'ls'
    print("  Favorite Command: ls (Classic choice for listing files!)")
elif cleaned_command in ["grep", "awk", "sed"]:
    # Indented block for text processing commands
    print(f"  Favorite Command: {fav_command} (Powerful text processing!)")
elif cleaned_command.startswith("git"):
    # Indented block for git commands
    print(f"  Favorite Command: {fav_command} (Essential for version control!)")
else:
    # Indented block for any other command
    print(f"  Favorite Command: {fav_command} (Interesting choice!)")
# End of if/elif/else block

# Print the calculated years using Linux, only if the calculation was successful
# Check if start_year_int and years_using_linux are not None AND the value is reasonable
if start_year_int is not None and years_using_linux is not None:
    # Indented block for valid year calculation
    print(f"  Started Linux around: {start_year_int}")
    if years_using_linux < 0:
        # Indented block for future year
        print("  Wow, starting in the future? Impressive!")
    elif years_using_linux == 0:
        # Indented block for current year
        print("  Welcome to the Linux world this year!")
    else:
        # Indented block for past year
        # Use 'year' or 'years' appropriately
        year_str = "year" if years_using_linux == 1 else "years"
        print(f"  Approximate time using Linux: {years_using_linux} {year_str}.")
    # End of inner if/elif/else
else:
    # Indented block if year calculation failed
    print("  Could not calculate Linux usage duration due to invalid year input.")
# End of outer if/else block

print("\n--- Report Complete ---")
-   Save and Exit: Save the file and exit your text editor.

-   Run the Script:
    - Make sure your virtual environment is still active (`(.venv)` prefix).
    - Execute the script from your terminal: `python user_info.py`
    - Interact: The script will prompt you for your name, favorite command, and the year you started using Linux. Provide answers.
    - Example Interaction:

      --- Welcome to Linux User Info ---
      What is your name? Biff
      Nice to meet you, Biff! What's your favorite Linux command? grep
      Approximately what year did you start using Linux? 2010

      --- Generating Report ---
      Okay, Biff, here's your info:
        Favorite Command: grep (Powerful text processing!)
        Started Linux around: 2010
        Approximate time using Linux: 14 years.

      --- Report Complete ---

    - Observe: Carefully analyze the output.
    - See how your input name is used in subsequent prompts and the final report.
    - Notice how the message about the favorite command changes depending on whether you entered `ls`, `grep`, `git status`, or something else. This tests the `if`/`elif`/`else` block.
    - Check if the calculation for years using Linux is correct based on the year you entered and the current year.

-   Test Edge Cases: Run the script a few more times to test different scenarios:
    - Different Commands: Enter `ls`, `git push`, `htop` as favorite commands and see the output change. Try entering a command with leading/trailing spaces (e.g., `" cd "`) to test `.strip()`. Try entering `Ls` (with capital L) to test `.lower()`.
    - Invalid Year: When prompted for the year, enter text like "last year" or "abc". Observe the warning message printed by the `except ValueError:` block and see how the final report indicates the duration couldn't be calculated.
    - Current Year: Enter the current year. See the specific message for starting this year.
    - Future Year: Enter a year in the future (e.g., 2077). Observe the "starting in the future" message.
    - One Year Ago: Enter the previous year. Ensure it correctly says "1 year".
Code Explanation Recap:
- `import datetime`: Brought in the tools needed to get the `current_year`.
- `input(...)`: Used to get string input from the user, with prompts.
- `f"..."`: F-strings used for easy formatting of output messages including variable values.
- `current_year = datetime.datetime.now().year`: Got the current year dynamically.
- `try...except ValueError`: Safely attempted to convert the year string to an integer using `int()`. If the conversion failed (because the input wasn't a number), the `except` block ran, printing a warning and preventing a crash.
- `if/elif/else`: Used conditional logic to print different messages based on the value of `fav_command`. `.strip().lower()` was used to make the comparison robust against extra spaces and case differences.
- `is not None`: Checked if the year conversion and calculation were successful before attempting to print the results.
- Indentation: All the code blocks under `if`, `elif`, `else`, `try`, and `except` were indented exactly 4 spaces, which is crucial for Python to understand the program structure.
Conclusion: You have successfully written and tested a Python script that demonstrates fundamental syntax. You practiced variable assignment with different data types (`str`, `int`, `None`), used `input()` and `print()` for user interaction, performed type conversion (`int()`) with error handling (`try`/`except`), and implemented conditional logic (`if`/`elif`/`else`). You also saw the critical importance of correct indentation.
3. Basic Data Structures Lists Tuples and Dictionaries
Python provides several powerful built-in data structures for organizing collections of data efficiently. The most fundamental and commonly used are lists, tuples, and dictionaries. Understanding their characteristics – particularly mutability (can they be changed?) and ordering – is key to choosing the right structure for your needs.
Lists (`list`)
A list is an ordered, mutable (changeable) sequence of items.
- Definition: Created using square brackets `[]`, with items separated by commas.
- Ordered: The order of items is preserved. The item at index 0 is always the first item added (unless modified).
- Mutable: You can change the list after its creation: add items, remove items, or change the value of existing items.
- Heterogeneous: Items in a list can be of different data types (though often they are homogeneous for clarity).
- Indexing: Access individual items using their zero-based index inside square brackets (e.g., `my_list[0]` for the first item, `my_list[-1]` for the last item).
- Slicing: Extract sub-lists using the slice notation `[start:stop:step]`. `start` is inclusive, `stop` is exclusive.
# Creating lists
empty_list = []
distros = ["Ubuntu", "Fedora", "Debian", "Arch", "Mint"]
mixed_data = [10, "kernel", 5.15, True, ["eth0", "wlan0"]] # List can contain other lists
print(f"Distros list: {distros}")
print(f"Number of distros: {len(distros)}") # len() gets the number of items
# --- Accessing Items (Indexing) ---
first_distro = distros[0]
third_distro = distros[2]
last_distro = distros[-1]
second_last = distros[-2]
print(f"First: {first_distro}, Third: {third_distro}, Last: {last_distro}")
# Accessing item within a nested list
network_interfaces = mixed_data[4]
first_interface = network_interfaces[0]
# Or directly:
first_interface_direct = mixed_data[4][0]
print(f"First network interface: {first_interface_direct}")
# --- Slicing ---
# Get items from index 1 up to (but not including) index 4
middle_distros = distros[1:4] # ['Fedora', 'Debian', 'Arch']
print(f"Middle slice: {middle_distros}")
# Get items from index 2 to the end
from_index_2 = distros[2:] # ['Debian', 'Arch', 'Mint']
print(f"From index 2: {from_index_2}")
# Get the first 3 items
first_three = distros[:3] # ['Ubuntu', 'Fedora', 'Debian']
print(f"First three: {first_three}")
# Get a copy of the entire list
list_copy = distros[:]
print(f"List copy: {list_copy}")
# Get every second item
every_other = distros[::2] # ['Ubuntu', 'Debian', 'Mint']
print(f"Every other: {every_other}")
# Reverse the list using slicing
reversed_list = distros[::-1] # ['Mint', 'Arch', 'Debian', 'Fedora', 'Ubuntu']
print(f"Reversed: {reversed_list}")
print(f"Original list unchanged: {distros}") # Slicing creates NEW lists
# --- Modifying Lists (Mutability) ---
# Change an item by index
distros[1] = "Fedora Linux 38"
print(f"After modification: {distros}")
# Add an item to the end
distros.append("openSUSE")
print(f"After append: {distros}")
# Insert an item at a specific index
distros.insert(2, "Linux Mint") # Insert before current index 2 ('Debian')
print(f"After insert: {distros}")
# Remove the last item and return it
last_item_removed = distros.pop()
print(f"Removed '{last_item_removed}' using pop(): {distros}")
# Remove item at a specific index and return it
item_at_index_3 = distros.pop(3) # Removes 'Arch'
print(f"Removed '{item_at_index_3}' using pop(3): {distros}")
# Remove the first occurrence of a specific value
try:
distros.remove("Debian")
print(f"After removing 'Debian': {distros}")
except ValueError:
print("Value 'Debian' not found in the list.")
# distros.remove("NonExistent") # Would raise ValueError if uncommented
# Extend list by appending elements from another iterable
more_distros = ["Gentoo", "Slackware"]
distros.extend(more_distros)
# Alternatively: distros += more_distros
print(f"After extend: {distros}")
# --- Other Common List Operations ---
# Check if an item exists in the list
if "Ubuntu" in distros:
print("'Ubuntu' is in the list.")
# Find the index of the first occurrence of a value
try:
ubuntu_index = distros.index("Ubuntu")
print(f"Index of 'Ubuntu': {ubuntu_index}")
except ValueError:
print("'Ubuntu' not found.")
# Count occurrences of a value
distros.append("Ubuntu") # Add another Ubuntu
ubuntu_count = distros.count("Ubuntu")
print(f"Count of 'Ubuntu': {ubuntu_count}")
# Sort the list in place (modifies the original list)
distros.sort() # Alphabetical sort for strings
print(f"Sorted list: {distros}")
# Sort in reverse order
distros.sort(reverse=True)
print(f"Reverse sorted list: {distros}")
# Reverse the list in place
distros.reverse()
print(f"Reversed in place: {distros}")
# Clear all items from the list
distros.clear()
print(f"Cleared list: {distros}")
Tuples (`tuple`)
A tuple is an ordered, immutable (unchangeable) sequence of items.
- Definition: Created using parentheses `()`, with items separated by commas. A comma is required for single-item tuples `(item,)`.
- Ordered: The order of items is preserved.
- Immutable: Once a tuple is created, you cannot change its contents (no adding, removing, or modifying items). This makes them suitable for representing fixed collections like coordinates, RGB colors, or records where data integrity is important.
- Heterogeneous: Like lists, tuples can contain items of different data types.
- Indexing and Slicing: Work exactly the same way as for lists, allowing you to access items or create new tuples from slices.
# Creating tuples
empty_tuple = ()
point_2d = (10, 20) # Represents (x, y) coordinates
server_config = ("webserver01", "192.168.1.50", 80)
rgb_color = (255, 0, 128) # Red, Green, Blue
# Single-item tuple REQUIRES a trailing comma
single_item = ("hello",)
not_a_tuple = ("hello") # This is just a string in parentheses
print(f"Type of single_item: {type(single_item)}") # <class 'tuple'>
print(f"Type of not_a_tuple: {type(not_a_tuple)}") # <class 'str'>
print(f"Point: {point_2d}")
print(f"Server Config: {server_config}")
print(f"Number of items in server_config: {len(server_config)}")
# --- Accessing Items (Indexing and Slicing) ---
x_coordinate = point_2d[0]
server_ip = server_config[1]
last_color_value = rgb_color[-1]
print(f"X: {x_coordinate}, IP: {server_ip}, Blue: {last_color_value}")
# Slicing creates NEW tuples
ip_and_port = server_config[1:] # ('192.168.1.50', 80)
print(f"IP and Port slice: {ip_and_port}")
# --- Immutability (These operations will FAIL) ---
# point_2d[0] = 15 # Raises TypeError: 'tuple' object does not support item assignment
# server_config.append(True) # Raises AttributeError: 'tuple' object has no attribute 'append'
# server_config.remove(80) # Raises AttributeError: 'tuple' object has no attribute 'remove'
# --- Use Cases ---
# Tuples are hashable (if they contain only hashable types like str, int, tuple),
# meaning they can be used as dictionary keys or elements in sets. Lists cannot.
location_lookup = {
(40.7128, -74.0060): "New York City",
(34.0522, -118.2437): "Los Angeles"
}
print(f"Location for (40.7128, -74.0060): {location_lookup.get((40.7128, -74.0060))}")
# Function returning multiple values implicitly returns a tuple
def get_coordinates():
return 50, 60 # Returns the tuple (50, 60)
coords = get_coordinates()
print(f"Coordinates from function: {coords}, type: {type(coords)}")
# --- Tuple Packing and Unpacking ---
# Packing: Assigning multiple values to a single tuple variable
packed_tuple = 1, 2, "three" # Parentheses are optional here
print(f"Packed tuple: {packed_tuple}")
# Unpacking: Assigning items from a tuple to multiple variables
# The number of variables must match the number of items in the tuple
a, b, c = packed_tuple
print(f"Unpacked: a={a}, b={b}, c='{c}'")
# Unpacking is useful with function return values
lat, lon = get_coordinates()
print(f"Latitude: {lat}, Longitude: {lon}")
# --- Limited Tuple Methods ---
# Count occurrences of a value
counts = (1, 2, 'a', 2, 'b', 2)
print(f"Count of 2 in counts: {counts.count(2)}") # Output: 3
# Find the index of the first occurrence
try:
index_a = counts.index('a')
print(f"Index of 'a': {index_a}") # Output: 2
except ValueError:
print("'a' not found.")
Dictionaries (`dict`)
A dictionary is a mutable collection of key-value pairs.
- Definition: Created using curly braces `{}` with pairs written as `key: value`, separated by commas.
- Keys: Must be unique within a dictionary. Must be of an immutable type (e.g., strings, numbers, tuples containing only immutable types). Keys are used to look up values.
- Values: Can be of any data type (including lists, other dictionaries) and can be duplicated.
- Order: As of Python 3.7, dictionaries are insertion ordered. They remember the order in which key-value pairs were added. In Python 3.6 and earlier, dictionaries were unordered.
- Access: Values are accessed using their corresponding key inside square brackets, `my_dict[key]`. Accessing a non-existent key raises a `KeyError`.
- Mutable: You can add new key-value pairs, change the value associated with an existing key, or remove pairs.
# Creating dictionaries
empty_dict = {}
user_permissions = {
"alice": "admin",
"bob": "editor",
"charlie": "viewer"
}
package_info = {
"name": "requests",
"version": "2.28.1",
"license": "Apache 2.0",
"dependencies": ["certifi", "charset-normalizer", "idna", "urllib3"]
}
print(f"User permissions: {user_permissions}")
print(f"Package info: {package_info}")
print(f"Number of users: {len(user_permissions)}")
# --- Accessing Values ---
alices_role = user_permissions["alice"] # Access using key
print(f"Alice's role: {alices_role}")
# Accessing a nested value
package_license = package_info["license"]
first_dependency = package_info["dependencies"][0]
print(f"Package license: {package_license}")
print(f"First dependency: {first_dependency}")
# Accessing a non-existent key raises KeyError
# print(user_permissions["david"]) # -> KeyError: 'david'
# Safer access using .get() method
# Returns None (or a specified default) if the key is not found
davids_role = user_permissions.get("david")
print(f"David's role (using get): {davids_role}") # Output: None
davids_role_default = user_permissions.get("david", "guest") # Provide default
print(f"David's role (using get with default): {davids_role_default}") # Output: guest
# --- Modifying Dictionaries ---
# Add a new key-value pair
user_permissions["david"] = "editor"
print(f"After adding David: {user_permissions}")
# Change the value for an existing key
user_permissions["charlie"] = "commenter"
print(f"After changing Charlie's role: {user_permissions}")
# Update with multiple pairs from another dictionary or iterable of pairs
new_users = {"eve": "admin", "frank": "viewer"}
user_permissions.update(new_users)
# Also works: user_permissions.update(bob="moderator", grace="tester")
print(f"After update: {user_permissions}")
# --- Removing Items ---
# Remove by key using pop(), returns the removed value
bobs_role = user_permissions.pop("bob")
print(f"Removed Bob's role ('{bobs_role}'), dict is now: {user_permissions}")
# user_permissions.pop("nonexistent") # -> KeyError
# Remove by key using pop() with a default (avoids KeyError)
removed_role = user_permissions.pop("george", "not found")
print(f"Result of popping 'george': {removed_role}")
# Remove the last inserted item (Python 3.7+) using popitem()
last_added_key, last_added_value = user_permissions.popitem()
print(f"Removed last item: key='{last_added_key}', value='{last_added_value}'")
print(f"Dict after popitem: {user_permissions}")
# Remove by key using the 'del' statement
del user_permissions["alice"]
print(f"After deleting Alice: {user_permissions}")
# del user_permissions["nonexistent"] # -> KeyError
# --- Other Common Dictionary Operations ---
# Check if a key exists
if "eve" in user_permissions: # Checks KEYS
print("'eve' is a key in the dictionary.")
if "admin" in user_permissions.values(): # Check VALUES
print("'admin' is a value in the dictionary.")
# Getting Views (dynamic views of keys, values, items)
keys = user_permissions.keys()
values = user_permissions.values()
items = user_permissions.items() # Pairs of (key, value)
print(f"Keys view: {keys}")
print(f"Values view: {values}")
print(f"Items view: {items}")
# These views reflect changes in the dictionary
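# A quick illustration of that dynamic behaviour (reusing the views created above):
# adding a key shows up in the existing views without calling .keys()/.items() again.
user_permissions["zoe"] = "viewer"
print(f"Keys view after adding 'zoe': {keys}")   # 'zoe' appears automatically
print(f"Items view after adding 'zoe': {items}")
del user_permissions["zoe"]  # Remove it again so the output below is unchanged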
# --- Iterating through Dictionaries ---
print("\nIterating through keys (default):")
for user in user_permissions:
print(f" User (key): {user}, Role (value): {user_permissions[user]}")
print("\nIterating through values:")
for role in user_permissions.values():
print(f" Role: {role}")
print("\nIterating through items (key-value pairs):")
for user, role in user_permissions.items(): # Unpacking the (key, value) tuple
print(f" User: {user}, Role: {role}")
# Clear all items
user_permissions.clear()
print(f"Cleared dictionary: {user_permissions}")
Choosing the Right Structure:
- Use a list when you need an ordered collection and might need to change, add, or remove items based on their position (index). Order matters.
- Use a tuple when you need an ordered collection that should not change after creation. Useful for fixed data, coordinates, or when you need hashable items (like dictionary keys). Order matters.
- Use a dictionary when you need to associate unique keys with values for fast lookups based on the key. The key-value relationship is primary; order is secondary (though preserved in modern Python).
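As a rough sketch of how those choices play out in practice (the data below is invented purely for illustration):
# List: ordered collection where position matters and entries come and go
recent_logins = ["alice", "bob", "alice"]
recent_logins.append("charlie")
# Tuple: a fixed record that should not change after creation
default_dns = ("1.1.1.1", "8.8.8.8")
# Dictionary: fast lookup of a value by a unique key
service_ports = {"ssh": 22, "http": 80, "https": 443}
print(service_ports["ssh"])  # -> 22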
Workshop Basic Data Structures
This workshop focuses on using lists and dictionaries to manage simple collections of data relevant to a Linux user: favorite shell aliases (list of tuples) and a simple software package inventory (dictionary).
Goal: Create a script that uses a list to store and display favorite aliases and a dictionary to manage (add, remove, view) a basic inventory of installed software packages with their versions.
Project: Linux Alias Viewer and Software Inventory Manager
Steps:
- Set Up:
  - Navigate to your project directory in the terminal.
  - Ensure your virtual environment is active (source .venv/bin/activate).
  - Create a new Python file named linux_manager.py (nano linux_manager.py).
- Write the Code: Enter the following Python code, paying attention to comments and indentation.
# File: linux_manager.py

# --- Initial Data Structures ---

# List of favorite aliases (using tuples: (alias_name, actual_command))
# Tuples are suitable here as aliases usually don't change frequently once defined.
favorite_aliases = [
    ('ll', 'ls -alF'),
    ('la', 'ls -A'),
    ('l', 'ls -CF'),
    ('update', 'sudo apt update && sudo apt upgrade -y'),  # Example Debian/Ubuntu
    ('ports', 'ss -tulnp')
]

# Dictionary representing a simple software inventory {package_name: version}
# Dictionary is good for looking up versions by package name.
software_inventory = {
    "python": "3.10.6",  # Example version, yours might differ
    "pip": "23.0.1",
    "bash": "5.1.16",
    "git": "2.34.1",
    "docker": "24.0.2"
}

# --- Functions for Operations ---

def display_aliases():
    """Prints the list of favorite aliases, nicely formatted."""
    print("\n--- Your Favorite Aliases ---")
    if not favorite_aliases:
        print("  No favorite aliases defined yet.")
        return
    print("  Alias       Command")
    print("  ----------- ------------------------------------")
    # Iterate through the list of tuples
    for alias_tuple in favorite_aliases:
        # Unpack the tuple
        alias_name, command = alias_tuple
        # Print formatted string, aligning columns
        # :<11 means left-align in a field of 11 characters
        print(f"  {alias_name:<11} '{command}'")
    print("-----------------------------")

def display_software_inventory():
    """Prints the software inventory dictionary, sorted by package name."""
    print("\n--- Simple Software Inventory ---")
    if not software_inventory:
        print("  Inventory is empty.")
        return
    print("  Package         Version")
    print("  --------------- ----------")
    # Iterate through sorted keys for consistent output order
    for package_name in sorted(software_inventory.keys()):
        version = software_inventory[package_name]
        print(f"  {package_name:<15} {version}")
    print("-----------------------------")

def add_software():
    """Prompts user to add or update a package and version in the inventory."""
    print("\n--- Add/Update Software ---")
    package_name = input("Enter package name: ").strip().lower()  # Use lowercase keys
    if not package_name:
        print("Package name cannot be empty.")
        return
    current_version = software_inventory.get(package_name)
    if current_version:
        print(f"'{package_name}' already exists with version '{current_version}'.")
        overwrite = input("Overwrite with new version? (y/N): ").strip().lower()
        if overwrite != 'y':
            print("Operation cancelled.")
            return
    package_version = input(f"Enter version for '{package_name}': ").strip()
    if not package_version:
        print("Package version cannot be empty.")
        return
    # Add or update the dictionary entry
    software_inventory[package_name] = package_version
    print(f"Inventory updated: '{package_name}': '{package_version}'")

def remove_software():
    """Prompts user to remove a package from the inventory."""
    print("\n--- Remove Software ---")
    package_name = input("Enter package name to remove: ").strip().lower()
    # Use pop() which handles KeyError safely if key doesn't exist
    # and returns the removed value (or None if not found)
    removed_version = software_inventory.pop(package_name, None)
    if removed_version is not None:
        print(f"Package '{package_name}' (version '{removed_version}') removed successfully.")
    else:
        print(f"Package '{package_name}' not found in inventory.")

# --- Main Application Loop ---

def display_menu():
    """Prints the main menu options."""
    print("\n===== Linux Manager Menu =====")
    print("  1. View Favorite Aliases")
    print("  2. View Software Inventory")
    print("  3. Add/Update Software")
    print("  4. Remove Software")
    print("  5. Exit")
    print("==============================")

print("Welcome to Linux Manager!")

while True:  # Loop indefinitely until user chooses to exit
    display_menu()
    choice = input("Enter your choice (1-5): ").strip()

    if choice == '1':
        display_aliases()
    elif choice == '2':
        display_software_inventory()
    elif choice == '3':
        add_software()
    elif choice == '4':
        remove_software()
    elif choice == '5':
        print("Exiting Linux Manager. Goodbye!")
        break  # Exit the while loop
    else:
        # Handle invalid input
        print("Invalid choice. Please enter a number between 1 and 5.")

    # Pause briefly before showing menu again (optional)
    # input("\nPress Enter to continue...")
- Save and Exit: Save the file and exit the editor.
- Run the Script:
  - Execute the script: python linux_manager.py
  - Interact with the Menu:
    - Choose option 1 to view the predefined list of aliases. Observe the formatting.
    - Choose option 2 to view the initial software inventory. Notice it's sorted alphabetically by package name.
    - Choose option 3 to add new software. Enter a package name (e.g., nano) and a version (e.g., 6.2).
    - Choose 2 again to see the updated inventory including nano.
    - Choose 3 again and enter python. Observe the prompt asking if you want to overwrite the existing version. Enter y and provide a new version (e.g., 3.10.7). View the inventory again.
    - Choose 4 to remove software. Enter a package name that exists (e.g., docker).
    - Choose 2 again to verify docker has been removed.
    - Choose 4 again and try removing a package that doesn't exist (e.g., firefox). Observe the "not found" message.
    - Choose option 5 to exit the script.
Code Explanation:
- favorite_aliases: A list where each element is a tuple. Tuples are used because an alias definition (name and command) is typically fixed. The list allows maintaining an ordered collection.
- software_inventory: A dict where keys are package names (strings, converted to lowercase for consistency) and values are version strings. Dictionaries provide efficient lookup by package name.
- display_aliases(): Iterates through the list of tuples. It unpacks each tuple into alias_name and command. F-strings with formatting specifiers (:<11) are used to align the output neatly.
- display_software_inventory(): Iterates through the dictionary's keys after sorting them (sorted(software_inventory.keys())) to ensure the output order is consistent. It then retrieves the version using the key and prints formatted output.
- add_software(): Gets input, uses .strip().lower() for the package name. Uses software_inventory.get(package_name) to check if the package exists before prompting for overwrite. Updates or adds the entry using software_inventory[package_name] = package_version.
- remove_software(): Gets input, uses .strip().lower(). Uses software_inventory.pop(package_name, None). pop() attempts to remove the key; if the key exists, it removes the item and returns the value; if the key doesn't exist, it returns the default value (None in this case) instead of raising a KeyError, making the code simpler.
- Main Loop (while True): Creates a command-line menu using print statements inside a while loop. It prompts for user choice, and if/elif/else statements call the appropriate function based on the input. The loop breaks when the user chooses the exit option.
Conclusion: You have effectively used Python's fundamental data structures: lists (of tuples) for ordered, relatively fixed data (aliases) and dictionaries for efficient key-based lookups and modifications (software inventory). You practiced iterating through these structures, accessing elements/values, and modifying them based on user input, building a simple but practical command-line management tool.
4. Control Flow Conditional Statements and Loops
Control flow statements are the building blocks that allow your programs to make decisions and repeat actions, moving beyond simple linear execution. Conditional statements (if, elif, else) execute code based on whether conditions are true or false, while loops (for, while) execute blocks of code multiple times.
Conditional Statements (if, elif, else)
These statements allow your program to choose different paths of execution based on the evaluation of boolean expressions (conditions).
- if: Executes a block of code only if its condition evaluates to True.
- elif (else if): Optional. Follows an if or another elif. Its condition is checked only if all preceding if/elif conditions were False. If its condition is True, its block is executed, and the rest of the elif/else chain is skipped. You can have zero or more elif statements.
- else: Optional. Must be the last clause in the structure. Its block is executed only if all preceding if and elif conditions evaluated to False.
Syntax:
# Basic structure
if condition1:
# Code block 1: Executes if condition1 is True
# Indentation (4 spaces) is mandatory
print("Condition 1 met")
elif condition2:
# Code block 2: Executes if condition1 is False AND condition2 is True
print("Condition 1 failed, but Condition 2 met")
elif condition3:
# Code block 3: Executes if 1 & 2 are False AND condition3 is True
print("Conditions 1 and 2 failed, but Condition 3 met")
else:
# Code block 4: Executes if conditions 1, 2, AND 3 are all False
print("None of the conditions were met")
# This line executes after the entire if/elif/else structure completes
print("Conditional check finished.")
Conditions and Operators:
Conditions are typically formed using:
- Comparison Operators:
  - ==: Equal to
  - !=: Not equal to
  - <: Less than
  - >: Greater than
  - <=: Less than or equal to
  - >=: Greater than or equal to
- Logical Operators:
  - and: True only if both sides are True. (condition1 and condition2)
  - or: True if at least one side is True. (condition1 or condition2)
  - not: Inverts the boolean value (True becomes False, False becomes True). (not condition1)
- Membership Operators:
  - in: True if an item exists within a sequence (list, tuple, string, dict keys). (item in sequence)
  - not in: True if an item does not exist within a sequence. (item not in sequence)
- Identity Operators:
  - is: True if two variables refer to the exact same object in memory.
  - is not: True if two variables refer to different objects. (Often used with None: if my_var is None:)
Truthiness: In Python, values other than explicit True/False also have a boolean context, known as "truthiness":
- Considered False: None, False, zero of any numeric type (0, 0.0), empty sequences ('', [], ()), and empty mappings ({}).
- Considered True: Almost everything else (non-zero numbers, non-empty sequences/mappings, True).
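A quick way to see how any value will behave in a condition is to pass it to bool(); a minimal sketch before the combined example below:
# bool() shows how a value is interpreted in a boolean context
print(bool(0), bool(0.0), bool(""), bool([]), bool({}), bool(None))  # all False
print(bool(42), bool("text"), bool([0]), bool({"k": "v"}))           # all True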
# Example combining operators and truthiness
cpu_load = 0.75
memory_percent = 65
critical_processes = ["database", "monitor"]
active_user = "root"
if (cpu_load > 0.9 or memory_percent > 85) and active_user != "root":
print("Warning: High load on system, and not root user!")
elif "database" in critical_processes and cpu_load > 0.5:
print("Warning: Database process might be under high load.")
elif not critical_processes: # Checks if the list is empty (evaluates to False if empty)
print("Info: No critical processes listed.")
else:
print("System status seems okay.")
# Check for None using 'is'
result = None
if result is None:
print("Result is currently None.")
for Loops
for loops are used for iterating over a sequence (like a list, tuple, dictionary, string) or any other iterable object (objects that can return their items one at a time). The loop executes its code block once for each item in the sequence.
Syntax:
for item_variable in iterable_object:
# Code block executed for each item
# 'item_variable' takes the value of the current item in each iteration
print(f"Processing item: {item_variable}")
# Perform actions with item_variable
# This code runs after the loop finishes all iterations
print("For loop complete.")
Common Iteration Targets:
- Lists/Tuples: Iterates through elements in order.
- Strings: Iterates through characters.
- range(): Generates a sequence of numbers. Very common for looping a specific number of times.
  - range(stop): Numbers from 0 up to (but not including) stop.
  - range(start, stop): Numbers from start up to (but not including) stop.
  - range(start, stop, step): Numbers from start up to stop, incrementing by step.
- Dictionaries:
  - Default iteration is over keys.
  - Iterate over values: for value in my_dict.values(): ...
  - Iterate over key-value pairs (items): for key, value in my_dict.items(): ... (Most common)
- enumerate(): Get both the index and the item during iteration, as shown in the sketch below.
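A compact sketch pulling these targets together (the list and dictionary below are invented for illustration):
# range(): loop a fixed number of times
for i in range(3):
    print(f"Attempt {i}")  # 0, 1, 2

# Dictionaries: iterate over key-value pairs
mount_points = {"/": "sda1", "/home": "sda2"}
for path, device in mount_points.items():
    print(f"{path} is on {device}")

# enumerate(): index and item together
services = ["sshd", "cron", "nginx"]
for index, name in enumerate(services, start=1):
    print(f"{index}. {name}")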
while Loops
while loops repeatedly execute a block of code as long as a given condition remains True. The condition is checked before each iteration.
Syntax:
while condition:
# Code block executed as long as 'condition' is True
print("Loop iteration running...")
# IMPORTANT: Something inside the loop must eventually make the condition False
# Otherwise, you'll have an infinite loop!
# Example: Increment a counter, change a flag, read input
print("While loop finished.")
Example:
# Countdown example
count = 5
print("Starting countdown:")
while count > 0:
print(f"{count}...")
count = count - 1 # Or count -= 1
# This decrement eventually makes 'count > 0' False
print("Blast off!")
# Loop until user enters 'quit'
print("\nEnter 'quit' to exit.")
command_input = ""
while command_input.lower() != "quit":
command_input = input("Enter command: ")
if command_input.lower() != "quit":
print(f"Executing mock command: {command_input}")
# This loop relies on user input to eventually become False
print("Exited command loop.")
Infinite Loops: Be careful! If the while condition never becomes False, the loop runs forever. You'll usually need to interrupt the program manually (e.g., Ctrl+C in the terminal).
# Example of an infinite loop (don't run without knowing how to stop!)
# while True:
# print("This will print forever!")
# # Needs a 'break' statement triggered by some condition to exit
Loop Control Statements (break and continue)
These statements alter the normal execution flow within loops (for or while).
- break: Immediately terminates the innermost loop it's contained within. Execution jumps to the first statement after the loop. Useful for stopping early when a condition is met or an item is found.
- continue: Skips the rest of the current iteration of the innermost loop and proceeds directly to the next iteration. Useful for skipping processing for certain items without exiting the entire loop.
# Example with break: Find the first negative number
numbers = [10, 25, 0, 18, -5, 30, -2]
print("\nSearching for first negative number:")
found_negative = None
for num in numbers:
print(f"Checking {num}...")
if num < 0:
print("Found a negative number!")
found_negative = num
break # Exit the loop immediately
# This is skipped once break is hit
print("Not negative, continuing search...")
if found_negative is not None:
print(f"The first negative number was: {found_negative}")
else:
print("No negative numbers found.")
# Example with continue: Process only even numbers
print("\nProcessing only even numbers:")
for num in numbers:
if num % 2 != 0: # Check if number is odd
print(f"Skipping odd number: {num}")
continue # Skip the rest of this iteration, go to the next number
# This code only runs for even numbers
print(f"Processing even number: {num}")
# Example in while loop
print("\nWhile loop with break/continue:")
attempts = 0
max_attempts = 5
while attempts < max_attempts:
attempts += 1
user_input = input(f"Attempt {attempts}/{max_attempts}: Enter 'process' or 'skip' or 'exit': ")
user_input = user_input.lower()
if user_input == "skip":
print("Skipping this attempt.")
continue # Go directly to the next iteration (next input prompt)
elif user_input == "exit":
print("Exiting loop early.")
break # Terminate the while loop
elif user_input == "process":
print("Processing data for this attempt...")
# ... do processing ...
else:
print("Invalid input.")
print(f"Loop finished after {attempts} attempt(s).")
Nested Loops
You can place one loop inside another. The inner loop will complete all of its iterations for each single iteration of the outer loop.
print("\nNested loop example:")
for i in range(1, 4): # Outer loop: 1, 2, 3
print(f"Outer loop (i = {i}):")
for char_code in range(65, 68): # Inner loop: 65, 66, 67 (ASCII for A, B, C)
char = chr(char_code) # Convert code to character
# This inner block executes 3 * 3 = 9 times
print(f" Inner loop (char = {char})")
Control flow statements (if, for, while, break, continue) are essential for creating programs that can react to different situations and handle repetitive tasks efficiently.
Workshop Control Flow
This workshop implements a classic number guessing game. It utilizes a while loop to control the number of attempts, if/elif/else statements to compare the guess with a secret number and provide feedback, input() for user interaction, type conversion with error handling (try/except), and break to exit the loop when the user guesses correctly.
Goal: Build an interactive game using core control flow structures (while, if/elif/else, break, continue) and input/output operations.
Project: Number Guessing Game
Steps:
- Set Up:
  - Navigate to your project directory in the terminal.
  - Ensure your virtual environment is active.
  - Create a new Python file named guessing_game.py (nano guessing_game.py).
- Write the Code: Enter the following Python code. Pay close attention to indentation and comments.
# File: guessing_game.py
import random  # Import the 'random' module to generate the secret number

# --- Game Configuration ---
LOWER_BOUND = 1
UPPER_BOUND = 100
MAX_ATTEMPTS = 7

# --- Game Setup ---
# Generate the secret random number (inclusive of bounds)
secret_number = random.randint(LOWER_BOUND, UPPER_BOUND)
attempts_made = 0
guessed_correctly = False  # Flag to track if the user won

print("--- Welcome to the Number Guessing Game! ---")
print(f"I've picked a secret number between {LOWER_BOUND} and {UPPER_BOUND}.")
print(f"You have {MAX_ATTEMPTS} attempts to guess it.")
print("--------------------------------------------")

# --- Main Game Loop (while loop) ---
# Loop continues as long as attempts are remaining AND the number hasn't been guessed
while attempts_made < MAX_ATTEMPTS:
    remaining_attempts = MAX_ATTEMPTS - attempts_made
    print(f"\nAttempt {attempts_made + 1} of {MAX_ATTEMPTS} (Remaining: {remaining_attempts})")

    # --- Get and Validate User Input ---
    guess_str = input(f"Enter your guess: ")

    # Use try-except to handle non-numeric input gracefully
    try:
        guess_int = int(guess_str)  # Convert input string to integer
    except ValueError:
        print("  Invalid input! Please enter a whole number.")
        # Use 'continue' to skip the rest of this loop iteration and ask for input again
        continue

    # Check if the guess is within the allowed bounds
    if guess_int < LOWER_BOUND or guess_int > UPPER_BOUND:
        print(f"  Your guess ({guess_int}) is outside the allowed range ({LOWER_BOUND}-{UPPER_BOUND}). Try again.")
        # Use 'continue' to skip the rest of this iteration
        continue

    # If input is valid (numeric and within bounds), increment the attempt counter
    attempts_made += 1

    # --- Compare Guess with Secret Number (if/elif/else) ---
    if guess_int == secret_number:
        print(f"  Congratulations! You guessed the secret number ({secret_number}) correctly!")
        guessed_correctly = True  # Set the flag indicating a win
        # Use 'break' to exit the 'while' loop immediately since the game is won
        break
    elif guess_int < secret_number:
        print("  Too low! Try guessing higher.")
    else:  # guess_int > secret_number
        print("  Too high! Try guessing lower.")
    # End of if/elif/else comparison block

# --- After the Loop (Game Over Check) ---
# This code runs after the while loop finishes, either because 'break' was hit (win)
# or because the condition 'attempts_made < MAX_ATTEMPTS' became False (ran out of attempts).

# Check if the player ran out of attempts *and* didn't guess correctly
if not guessed_correctly:  # Equivalent to 'if guessed_correctly == False:'
    print("\n--- Game Over! ---")
    print(f"Sorry, you ran out of attempts. The secret number was {secret_number}.")
    print("------------------")

print("\nThanks for playing!")
- Save and Exit: Save the file and exit your text editor.
- Run the Script:
  - Make sure your virtual environment is active.
  - Execute the script from your terminal: python guessing_game.py
- Play the Game:
  - The script will display the welcome message, range, and number of attempts.
  - Enter your guesses when prompted.
  - Pay attention to the feedback ("Too low!", "Too high!", "Congratulations!").
  - Use the feedback to refine your next guess.
  - Continue playing until you either guess the number correctly or you use all 7 attempts.
- Test Edge Cases and Error Handling:
  - Invalid Input (Non-numeric): Run the game again. When prompted for a guess, enter text like "fifty" or "hello". Observe the "Invalid input! Please enter a whole number." message. Notice that this attempt doesn't count against your MAX_ATTEMPTS because of the continue statement.
  - Invalid Input (Out of Range): Enter a number outside the 1-100 range (e.g., 0, 150). Observe the "guess is outside the allowed range" message. This attempt also shouldn't count due to continue.
  - Winning: Play until you guess the correct number. Observe the congratulations message and how the break statement ends the game immediately, even if you had attempts remaining.
  - Losing: Play deliberately making poor guesses so that you use all MAX_ATTEMPTS without finding the number. Observe the "Game Over!" message revealing the secret number after the loop finishes naturally.
Code Explanation Recap:
- import random: Used random.randint() to pick the secret number.
- while attempts_made < MAX_ATTEMPTS: The main game loop condition. Ensures the loop runs at most MAX_ATTEMPTS times.
- input(): Got the user's guess as a string.
- try...except ValueError: Handled potential errors when converting the input string to an integer using int(). If the conversion failed, it printed an error and used continue to restart the current loop iteration without processing the invalid guess or incrementing attempts_made.
- if guess_int < LOWER_BOUND or guess_int > UPPER_BOUND: Validated if the guess was within the expected range. Used continue to skip processing invalid range inputs.
- attempts_made += 1: Only incremented after validating the input was a number within the correct range.
- if/elif/else: Compared the valid guess (guess_int) to the secret_number and provided the core game feedback.
- guessed_correctly = True and break: When the guess was correct, a flag was set, and break was used to exit the while loop immediately.
- if not guessed_correctly: After the loop ended, this condition checked if the loop finished because of break (meaning guessed_correctly is True) or because the attempt limit was reached (meaning guessed_correctly is still False). This determined whether to show the "Game Over" message.
Conclusion: You have successfully implemented an interactive number guessing game, applying fundamental Python control flow concepts. You used a while loop for repetition based on a condition (attempts remaining), if/elif/else for decision-making (comparing guesses), try/except for robust input validation, continue to handle invalid input gracefully within the loop, and break to exit the loop upon successful completion.
5. Functions Defining and Using Reusable Code
As programs become more complex than simple scripts, writing all the code in one long sequence becomes unmanageable, repetitive, and difficult to maintain. Functions are a fundamental concept in programming that allow you to group a block of related code, give it a name, and execute that block whenever needed by "calling" its name. This promotes code reuse, organization (modularity), and readability.
Defining Functions (def)
You define a function using the def keyword, followed by the function name, parentheses () containing any parameters the function accepts, and a colon (:). The code block that forms the function's body must be indented.
Syntax:
def function_name(parameter1, parameter2, ...):
"""
Optional: Docstring (Documentation String)
This triple-quoted string is the first statement in the function.
It explains what the function does, its parameters, and what it returns.
It's used by help() and documentation tools.
"""
# --- Function Body ---
# Indented code block containing the function's logic
statement1
statement2
# Access parameters (parameter1, parameter2) here
result = parameter1 + parameter2 # Example operation
# Optional: Return a value using the 'return' statement
# If 'return' is omitted, or 'return None' is used,
# the function implicitly returns None.
return result
# Code after the return statement in a given path will not be executed
Components:
- def: Keyword indicating the start of a function definition.
- function_name: A descriptive name following the same naming rules and conventions as variables (snake_case recommended).
- (): Parentheses containing the list of parameters. Empty if the function takes no input.
- parameter1, parameter2, ...: Local variables within the function that receive values (arguments) when the function is called.
- : (colon): Marks the end of the function header.
- Docstring ("""..."""): A string literal explaining the function's purpose. Highly recommended for good practice.
- Function Body: The indented block of code that performs the function's task.
- return result_value: Optional statement to send a value back to the part of the program that called the function. A function can have multiple return statements (e.g., in different if branches).
Examples:
# Function with no parameters and no explicit return (returns None)
def print_separator():
"""Prints a separator line to the console."""
print("-" * 40)
# Function with parameters and a return value
def calculate_area(length, width):
"""Calculates the area of a rectangle."""
if length < 0 or width < 0:
print("Warning: Length and width should be non-negative.")
return None # Return None for invalid input
area = length * width
return area # Return the calculated area
# Function with a boolean return value
def is_file_accessible(filepath):
"""Checks if a file exists and is readable."""
import os # Import inside function if only needed here (or at top level)
return os.path.exists(filepath) and os.access(filepath, os.R_OK)
Calling Functions
Once a function is defined, you can execute it by "calling" it using its name followed by parentheses (). If the function expects parameters, you must provide corresponding values, called arguments, inside the parentheses.
# Calling the functions defined above
print("Starting program...")
print_separator() # Call function with no arguments
# Call function with arguments, store the returned value
rect_area = calculate_area(10, 5)
if rect_area is not None:
print(f"The calculated area is: {rect_area}") # Output: 50
invalid_area = calculate_area(10, -5) # Calls with invalid width
print(f"Result for invalid area: {invalid_area}") # Output: None (after warning)
print_separator()
file_to_check = "/etc/hosts" # A common readable file on Linux
if is_file_accessible(file_to_check):
print(f"File '{file_to_check}' exists and is readable.")
else:
print(f"File '{file_to_check}' does not exist or is not readable.")
file_to_check_bad = "/root/secret.key" # Usually not readable by normal users
if is_file_accessible(file_to_check_bad):
print(f"File '{file_to_check_bad}' exists and is readable.")
else:
print(f"File '{file_to_check_bad}' does not exist or is not readable.")
print_separator()
print("Program finished.")
Parameters vs. Arguments
- Parameters: Variables listed in the function definition (length, width in calculate_area). They are placeholders within the function.
- Arguments: The actual values passed to the function when it is called (10, 5 in calculate_area(10, 5)).
Types of Arguments:
- Positional Arguments: Matched to parameters based on their order. calculate_area(10, 5) assigns 10 to length and 5 to width. Order matters.
- Keyword Arguments: Specified using parameter_name=value in the function call. Order doesn't matter, and they improve readability, especially for functions with many parameters.
# Using keyword arguments - order doesn't matter
rect_area_kw = calculate_area(width=5, length=10)
print(f"Area using keyword args: {rect_area_kw}")
# Can mix positional and keyword, but positional must come first
rect_area_mix = calculate_area(10, width=5)
# calculate_area(length=10, 5) # SyntaxError: positional argument follows keyword argument
- Default Parameter Values: You can assign default values to parameters in the function definition. If an argument for that parameter is not provided during the call, the default value is used. Default parameters must come after non-default parameters in the function definition.
def greet(name, greeting="Hello"):  # greeting has a default value
    """Greets a person with an optional greeting."""
    print(f"{greeting}, {name}!")

greet("Alice")                        # Uses default greeting: Hello, Alice!
greet("Bob", "Good morning")          # Overrides default: Good morning, Bob!
greet(greeting="Hi", name="Charlie")  # Keyword args work too: Hi, Charlie!
Important Note on Default Values: Default values are evaluated once when the function is defined, not each time it's called. This is particularly important for mutable defaults (like lists or dictionaries). Avoid using mutable default arguments unless you specifically understand and intend the behavior (sharing the same object across calls). Use None as the default and create the mutable object inside the function if needed.
# Problematic example: Mutable default
# def add_item(item, my_list=[]):
#     my_list.append(item)
#     return my_list
# print(add_item(1)) # Output: [1]
# print(add_item(2)) # Output: [1, 2] (List is shared across calls!)

# Corrected version:
def add_item_safe(item, my_list=None):
    """Safely adds an item to a list, creating a new list if none provided."""
    if my_list is None:
        my_list = []  # Create a new list inside the function call
    my_list.append(item)
    return my_list

print(add_item_safe(1))          # Output: [1]
print(add_item_safe(2))          # Output: [2] (Each call gets a new list by default)
list_a = ['a']
print(add_item_safe(3, list_a))  # Output: ['a', 3] (Works correctly when list is passed)
print(list_a)                    # Output: ['a', 3] (Original list is modified as expected)
The return Statement
The return statement exits a function and optionally sends a value (or object) back to the caller.
- return value: Exits the function and returns value.
- return: Exits the function and returns None.
- If the end of the function body is reached without encountering a return statement, the function implicitly returns None.
- A function can return any type of Python object (int, float, str, list, dict, tuple, functions, classes, instances, None).
- To return multiple values, simply list them separated by commas; Python automatically packs them into a tuple.
def get_system_status():
"""Returns multiple status values as a tuple."""
load = 1.5 # Simulate getting load
mem_free_gb = 4.2
status_ok = True
# Python automatically creates a tuple here
return status_ok, load, mem_free_gb
# Call and unpack the returned tuple
is_ok, current_load, free_mem = get_system_status()
print(f"System OK: {is_ok}, Load: {current_load}, Free Mem: {free_mem} GB")
# Or receive the whole tuple
status_tuple = get_system_status()
print(f"Status tuple: {status_tuple}")
Variable Scope (LEGB Rule)
Scope defines the region of a program where a particular variable name can be accessed. Python uses the LEGB rule to resolve names (find the value associated with a variable name):
- L (Local): Names assigned within the current function (def or lambda). This includes function parameters. These names are only accessible inside that function.
- E (Enclosing function locals): Names in the local scopes of any enclosing functions (if the current function is nested inside another function), searched from the innermost scope outwards.
- G (Global): Names assigned at the top level of a module file, or declared global within a function. Accessible throughout the module.
- B (Built-in): Pre-assigned names in Python's built-in modules (e.g., print, len, str, list, int, True, False, None, Exception). Always available.
Python searches for a name in this order: L -> E -> G -> B. The first match found is used.
x = "I am global" # Global variable
def outer_func():
y = "I am outer local (enclosing for inner)" # Enclosing function local
# Can access global 'x'
print(f"[outer_func] Accessing x: {x}")
print(f"[outer_func] Accessing y: {y}")
def inner_func():
z = "I am inner local" # Local variable
# Can access enclosing 'y' and global 'x'
print(f"[inner_func] Accessing x: {x}")
print(f"[inner_func] Accessing y: {y}")
print(f"[inner_func] Accessing z: {z}")
inner_func() # Call the nested function
# Cannot access 'z' here:
# print(f"[outer_func] Accessing z: {z}") # NameError: name 'z' is not defined
outer_func()
# Cannot access 'y' or 'z' here:
# print(f"[Global scope] Accessing y: {y}") # NameError
# print(f"[Global scope] Accessing z: {z}") # NameError
# --- Modifying Global Variables ---
# Generally, functions should avoid modifying global variables directly.
# It's better to pass values in as arguments and return results.
# However, if necessary, use the 'global' keyword.
counter = 0 # Global counter
def increment_global_counter():
global counter # Declare that we intend to modify the global 'counter'
counter += 1
print(f"[increment] Counter is now: {counter}")
increment_global_counter()
increment_global_counter()
print(f"[Global scope] Final counter: {counter}") # Shows the modified global value
# The 'nonlocal' keyword (less common) is used in nested functions
# to modify variables in the *enclosing* (but not global) scope.
Understanding scope is crucial for avoiding NameError exceptions and managing variable lifetimes correctly.
Docstrings
As mentioned, docstrings ("""Docstring goes here""") are string literals appearing as the first statement in a module, function, class, or method. They are Python's standard way to document code.
- Purpose: Explain what the code does (not how), its parameters (Args:), what it returns (Returns:), and any exceptions it might raise (Raises:).
- Access: Automatically attached to the object's __doc__ attribute. Tools like the built-in help() function and documentation generators (Sphinx, MkDocs with extensions) rely on docstrings.
- Convention: First line is a short summary. Followed by a blank line, then a more detailed explanation. Use standard sections like Args:, Returns:, Raises:.
def process_data(data, *, threshold=0.5, raise_on_error=False):
"""Processes numerical data, applying a threshold.
This function takes a list of numbers, filters them based on a
threshold, and calculates their average.
Args:
data (list): A list of numbers (int or float) to process.
threshold (float, optional): The minimum value for a number to be
included. Defaults to 0.5. Must be keyword-only.
raise_on_error (bool, optional): If True, raises ValueError on invalid data.
If False, prints warning and skips invalid data.
Defaults to False. Must be keyword-only.
Returns:
float or None: The average of the numbers in 'data' that are greater
than or equal to 'threshold'. Returns None if no valid
numbers meet the threshold or if input 'data' is empty/invalid.
Raises:
TypeError: If 'data' is not a list.
ValueError: If 'raise_on_error' is True and 'data' contains non-numeric items.
"""
if not isinstance(data, list):
raise TypeError("Input 'data' must be a list.")
if not data: # Handle empty list
return None
valid_numbers = []
for item in data:
try:
num = float(item) # Attempt conversion
if num >= threshold:
valid_numbers.append(num)
except (ValueError, TypeError) as e:
if raise_on_error:
raise ValueError(f"Invalid item '{item}' in data: {e}") from e
else:
print(f"Warning: Skipping invalid item '{item}' in data.")
if not valid_numbers:
return None
else:
return sum(valid_numbers) / len(valid_numbers)
# Accessing the docstring
print("\n--- Docstring for process_data ---")
print(process_data.__doc__)
# Using help()
# help(process_data) # Run this in REPL or uncomment to see formatted help
(Note: the *, in the function signature forces threshold and raise_on_error to be passed as keyword arguments, improving clarity.)
Lambda Functions (Anonymous Functions)
Lambda functions provide a concise syntax for creating small, anonymous functions (functions without a formal def name). They are defined using the lambda keyword.
- Syntax: lambda arguments: expression
- Characteristics:
  - Can take any number of arguments.
  - Can only have one expression (the value that is calculated and returned).
  - Cannot contain complex statements like assignments (inside the lambda body), if/else blocks (though conditional expressions are allowed), loops, or try/except.
  - Often used where a simple function object is needed briefly, like arguments to functions like sorted(), map(), filter(), or in GUI callbacks.
# Equivalent regular function:
# def square(x):
# return x * x
# Lambda function for squaring a number
square_lambda = lambda x: x * x
print(f"Square of 6 (lambda): {square_lambda(6)}") # Output: 36
# Lambda function for adding two numbers
add_lambda = lambda a, b: a + b
print(f"Sum of 10 and 7 (lambda): {add_lambda(10, 7)}") # Output: 17
# Lambda with conditional expression (value_if_true if condition else value_if_false)
max_lambda = lambda x, y: x if x > y else y
print(f"Max of 5, 9 (lambda): {max_lambda(5, 9)}") # Output: 9
# Using lambda with sorted() for custom sorting keys
# Sort a list of tuples based on the second element (index 1)
points = [(1, 9), (5, 3), (-2, 8)]
points_sorted_by_y = sorted(points, key=lambda point: point[1])
print(f"Points sorted by y-coordinate: {points_sorted_by_y}")
# Output: [(5, 3), (-2, 8), (1, 9)]
# Using lambda with filter() to select elements meeting a condition
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = list(filter(lambda n: n % 2 == 0, numbers)) # Get only even numbers
print(f"Even numbers (filter + lambda): {even_numbers}") # Output: [2, 4, 6, 8, 10]
# Using lambda with map() to apply a function to each element
squared_numbers = list(map(lambda n: n * n, numbers))
print(f"Squared numbers (map + lambda): {squared_numbers}")
# Output: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
While powerful for concise operations, overly complex lambdas can harm readability. For anything more than a simple expression, a regular def function is usually clearer.
Functions are a cornerstone of writing organized, reusable, and maintainable Python code.
Workshop Functions
This workshop involves refactoring the Number Guessing Game from the previous workshop (guessing_game.py) to use functions. This will improve its structure, make it easier to read, and demonstrate the benefits of modularity.
Goal: Practice defining and calling functions, passing arguments, returning values, using docstrings, and organizing existing procedural code into logical functional units.
Project: Refactored Number Guessing Game
Steps:
- Set Up:
  - Navigate to the directory containing your guessing_game.py file.
  - Ensure your virtual environment is active.
  - Create a copy of the game file to work on the refactored version without losing the original: cp guessing_game.py guessing_game_refactored.py
  - Open the new file, guessing_game_refactored.py, in your text editor (nano guessing_game_refactored.py).
- Identify Code Blocks for Functions: Examine the existing code in guessing_game_refactored.py. Identify logical sections that perform distinct tasks and could be encapsulated within functions. Good candidates include:
  - Displaying the welcome message and rules.
  - Getting and validating the user's guess input.
  - Comparing the guess to the secret number and providing feedback.
  - Displaying the final win/loss message.
  - The main game loop logic itself.
- Refactor the Code Using Functions: Modify guessing_game_refactored.py according to the structure below. Add docstrings to explain each function.
# File: guessing_game_refactored.py
import random

# --- Constants ---
LOWER_BOUND = 1
UPPER_BOUND = 100
MAX_ATTEMPTS = 7

# --- Function Definitions ---

def display_welcome_message():
    """Prints the welcome message and game rules."""
    print("--- Welcome to the Number Guessing Game! ---")
    print(f"I've picked a secret number between {LOWER_BOUND} and {UPPER_BOUND}.")
    print(f"You have {MAX_ATTEMPTS} attempts to guess it.")
    print("--------------------------------------------")

def get_validated_guess(attempt_num, max_attempts):
    """
    Prompts the user for a guess, validates it (numeric, within bounds).

    Args:
        attempt_num (int): The current attempt number (e.g., 1, 2, ...).
        max_attempts (int): The total allowed attempts.

    Returns:
        int or None: The validated integer guess if valid, otherwise None.
                     Returning None signals invalid input occurred.
    """
    remaining = max_attempts - (attempt_num - 1)
    prompt = f"\nAttempt {attempt_num} of {max_attempts} (Remaining: {remaining})\nEnter your guess: "
    guess_str = input(prompt)
    try:
        guess_int = int(guess_str)
        if LOWER_BOUND <= guess_int <= UPPER_BOUND:
            return guess_int  # Valid guess, return the integer
        else:
            print(f"  Warning: Your guess ({guess_int}) is outside the allowed range ({LOWER_BOUND}-{UPPER_BOUND}).")
            return None  # Signal invalid range
    except ValueError:
        print("  Warning: Invalid input! Please enter a whole number.")
        return None  # Signal invalid type

def check_guess_and_give_feedback(guess, secret):
    """
    Compares the guess with the secret number and prints feedback.

    Args:
        guess (int): The user's validated integer guess.
        secret (int): The secret number.

    Returns:
        bool: True if the guess is correct, False otherwise.
    """
    if guess == secret:
        print(f"  Congratulations! You guessed the secret number ({secret}) correctly!")
        return True
    elif guess < secret:
        print("  Too low! Try guessing higher.")
        return False
    else:  # guess > secret
        print("  Too high! Try guessing lower.")
        return False

def display_loss_message(secret):
    """Prints the game over message when the user runs out of attempts."""
    print("\n--- Game Over! ---")
    print(f"Sorry, you ran out of attempts. The secret number was {secret}.")
    print("------------------")

def play_game():
    """Runs the main logic for a single round of the guessing game."""
    display_welcome_message()
    secret_number = random.randint(LOWER_BOUND, UPPER_BOUND)
    attempts_made = 0
    guessed_correctly = False

    # Main game loop - now uses function calls inside
    while attempts_made < MAX_ATTEMPTS:
        current_attempt_num = attempts_made + 1
        guess = get_validated_guess(current_attempt_num, MAX_ATTEMPTS)

        # If input was invalid, get_validated_guess returns None, skip turn
        if guess is None:
            continue  # Skips incrementing attempts_made for invalid input

        attempts_made += 1  # Increment only for valid attempts

        guessed_correctly = check_guess_and_give_feedback(guess, secret_number)
        if guessed_correctly:
            break  # Exit loop if guess was correct

    # After the loop, check if the game was lost
    if not guessed_correctly:
        display_loss_message(secret_number)

    print("\nThanks for playing!")

# --- Main Execution Guard ---
# This standard Python construct ensures that play_game() is called only
# when this script is executed directly (not when imported as a module).
if __name__ == "__main__":
    play_game()
- Save and Exit: Save the changes to guessing_game_refactored.py.
- Run the Refactored Script:
  - Execute the refactored script: python guessing_game_refactored.py
- Play the Game: The game should behave exactly the same as the original version from the user's perspective. Play through it once or twice to confirm.
- Test Edge Cases: Try entering invalid input (text, out-of-range numbers) again. Verify that the validation messages appear and that invalid attempts do not consume one of your allowed tries. Test winning and losing scenarios.
Code Explanation of Changes:
- Constants: Defined LOWER_BOUND, UPPER_BOUND, MAX_ATTEMPTS at the top (module level) for clarity and easy modification.
- display_welcome_message(): Encapsulates the initial print statements. Clear, single purpose.
- get_validated_guess(): Handles prompting the user, getting input, and performing both type validation (try-except) and range validation. It returns the valid integer guess or None if the input was invalid in any way. This centralizes input handling logic.
- check_guess_and_give_feedback(): Takes the validated guess and the secret number. Performs the comparison, prints the feedback ("Too low!", etc.), and returns a boolean (True for correct, False otherwise) to signal the result to the main loop.
- display_loss_message(): Handles printing the specific message when the user runs out of attempts.
- play_game(): Contains the main game flow. It calls the other functions in the correct order. Notice how the while loop is now cleaner, primarily coordinating calls to get_validated_guess and check_guess_and_give_feedback.
- if __name__ == "__main__": This idiom ensures that the play_game() function is called only when the script is run directly (e.g., python guessing_game_refactored.py). If you were to import guessing_game_refactored into another Python script, the game wouldn't start automatically, but you could still potentially import and use its functions if desired.
Conclusion: You have successfully refactored the guessing game script using functions. Compare the play_game() function in this version to the main while loop in the original script. The refactored version reads more like a high-level description of the game's steps, with the details of each step hidden within the dedicated functions. This demonstrates how functions improve code organization, readability, and maintainability. If you needed to change how input is validated or how feedback is displayed, you would only need to modify the corresponding function.
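For example, a small sketch (assuming the refactored file is in the current directory, so it is on the module search path) of reusing it from another script or the REPL without the game auto-starting:
# Hypothetical reuse of the refactored module
import guessing_game_refactored as game

# Importing did NOT start the game, thanks to the __main__ guard.
game.play_game()  # Start a round explicitly when you want one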
6. Modules and Packages Organizing Your Code
As projects grow beyond a single file, keeping all your functions and classes in one place becomes impractical. Python's module and package system provides a robust way to:
- Organize Code: Split your code into logical units (modules).
- Reuse Code: Import functions, classes, and variables from one module into another.
- Namespace Management: Avoid naming conflicts between different parts of your project.
- Leverage External Code: Utilize modules from Python's extensive standard library and third-party packages installed via pip.
Modules
Conceptually, a module is just a single Python file (.py) containing Python definitions (functions, classes, variables) and executable statements. The filename itself (without the .py extension) serves as the module name.
Creating a Module:
Let's create a simple module containing utility functions for working with Linux file paths. Create a file named linux_path_utils.py:
# File: linux_path_utils.py
"""
A simple module for Linux path manipulation utilities.
"""
import os # Using the standard 'os' module internally
def get_filename(path_string):
"""Extracts the filename (basename) from a full path."""
if not isinstance(path_string, str):
return None
return os.path.basename(path_string.strip())
def get_directory(path_string):
"""Extracts the directory part from a full path."""
if not isinstance(path_string, str):
return None
return os.path.dirname(path_string.strip())
def path_exists(path_string):
"""Checks if a file or directory exists at the given path."""
if not isinstance(path_string, str):
return False
return os.path.exists(path_string.strip())
# Example variable defined in the module
DEFAULT_LOG_DIR = "/var/log/"
# Code here runs when the module is imported (usually avoid complex logic here)
print(f"[linux_path_utils] Module loaded. Default log dir: {DEFAULT_LOG_DIR}")
# Standard idiom to allow running test code only when script is executed directly
if __name__ == "__main__":
print("\n--- Testing linux_path_utils functions ---")
test_path = "/home/user/my_script.py"
print(f"Testing with path: {test_path}")
print(f"Filename: {get_filename(test_path)}")
print(f"Directory: {get_directory(test_path)}")
print(f"Exists? {path_exists(test_path)}")
print(f"Exists (/etc/passwd)? {path_exists('/etc/passwd')}")
print("--- End of tests ---")
Importing Modules
To use the code defined in one module (like linux_path_utils.py) from another script or the REPL, you need to import it. Python provides several ways to do this:
- import module_name: Imports the entire module object. You access its members (functions, variables, classes) using the dot notation: module_name.member_name. This is generally the most recommended method because it keeps namespaces separate and makes it explicit where names are coming from.
# File: main_script.py (in the same directory as linux_path_utils.py)
import linux_path_utils  # Import the whole module

file_path = "/usr/local/bin/my_app"

# Access functions and variables using module_name.member_name
filename = linux_path_utils.get_filename(file_path)
directory = linux_path_utils.get_directory(file_path)
exists = linux_path_utils.path_exists(file_path)
log_dir = linux_path_utils.DEFAULT_LOG_DIR

print(f"--- Using imported module ---")
print(f"Path: {file_path}")
print(f"Filename: {filename}")
print(f"Directory: {directory}")
print(f"Exists: {exists}")
print(f"Default Log Dir: {log_dir}")

# Import standard library modules the same way
import sys
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
(Observe that the [linux_path_utils] Module loaded... message prints only once, when the module is first imported.)
- from module_name import specific_member1, specific_member2, ...: Imports only the specified names directly into the current script's namespace. You can then use specific_member1 etc., without the module_name. prefix. Use this when you need only a few specific items and want shorter names, but be mindful of potential name collisions if your script defines a variable or function with the same name.
# File: main_script_from.py
# Import specific functions directly
from linux_path_utils import get_filename, path_exists
# DEFAULT_LOG_DIR is NOT imported here

file_path = "/tmp/data.csv"

# Call the imported functions directly
fname = get_filename(file_path)
does_exist = path_exists(file_path)
print(f"Filename (direct import): {fname}")
print(f"Exists (direct import): {does_exist}")

# Cannot access non-imported members or the module itself
# print(get_directory(file_path)) # NameError: name 'get_directory' is not defined
# print(DEFAULT_LOG_DIR) # NameError: name 'DEFAULT_LOG_DIR' is not defined
# print(linux_path_utils.DEFAULT_LOG_DIR) # NameError: name 'linux_path_utils' is not defined
- from module_name import *: Imports all names from the module (except those starting with an underscore _, by convention) directly into the current namespace. This is strongly discouraged in most production code because:
  - It pollutes the current namespace, making it unclear where names originated.
  - It significantly increases the risk of name collisions.
  - It makes code harder for static analysis tools (linters) to check.
  Avoid import *.
- import module_name as alias: Imports the module but gives it a shorter or more convenient alias (nickname). This is very common for modules with long names or standard abbreviations (like numpy as np, pandas as pd, matplotlib.pyplot as plt).
# File: main_script_alias.py
import linux_path_utils as lpu  # Use 'lpu' as an alias
import datetime as dt           # Common alias for datetime

path = "/var/spool/mail/user"
fname = lpu.get_filename(path)  # Use the alias
print(f"Filename via alias: {fname}")
print(f"Default log dir via alias: {lpu.DEFAULT_LOG_DIR}")
print(f"Current time: {dt.datetime.now()}")
- from module_name import specific_member as alias: Imports a specific member and gives that member an alias.
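A one-line sketch of that last form, reusing the example module from above:
# Import a single function under a different (shorter) name
from linux_path_utils import get_filename as basename
print(basename("/var/log/syslog"))  # -> syslog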
Module Search Path (sys.path):
How does Python find the linux_path_utils.py file when you write import linux_path_utils? It searches a list of directories known as the module search path. You can inspect this path:
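A minimal way to look at it, from any script or the REPL (the exact entries will differ per system and virtual environment):
import sys

# Print each directory Python will search for modules, in order
for directory in sys.path:
    print(directory)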
Python searches these locations in order:
- The directory containing the input script (or the current directory if running interactively). This is why
main_script.py
can findlinux_path_utils.py
when they are in the same directory. - Directories listed in the
PYTHONPATH
environment variable (if set). This is an OS environment variable you can configure to add custom library locations. - Installation-dependent default paths, including the Python standard library location and the
site-packages
directory (wherepip
installs packages, often inside your active virtual environment).
Packages
As projects grow even larger, you might want to organize related modules into directories. A package is simply a directory containing Python modules and a special (often empty) file named __init__.py
. The presence of __init__.py
tells Python to treat the directory as a package, allowing you to use dotted module names.
Example Package Structure:
Imagine organizing system utilities:
sysadmin_project/
├── main_runner.py # Main script using the package
└── sysutils/ # Top-level package directory ('sysutils')
├── __init__.py # Makes 'sysutils' a package
├── network.py # Module for network utilities
├── disk.py # Module for disk utilities
└── processes/ # Sub-package directory ('processes')
├── __init__.py # Makes 'processes' a sub-package
└── management.py # Module for process management
sysutils/__init__.py:
This file is executed when the package (sysutils) or any of its modules are imported for the first time. It can be empty, or it can contain initialization code or define package-level attributes. It can also control what from sysutils import * does using __all__.
# sysutils/__init__.py
print("Initializing the 'sysutils' package...")
# Optionally make specific functions available directly from the package
# Using relative imports within the package (the '.' means current directory)
from .network import get_ip_address
from .disk import get_disk_usage
# Define __all__ to specify what 'from sysutils import *' should import
__all__ = ['get_ip_address', 'get_disk_usage', 'network', 'disk', 'processes']
sysutils/network.py
(Example content):
# sysutils/network.py
def get_ip_address(interface='eth0'):
# Placeholder implementation
print(f"Fetching IP for {interface}...")
return "192.168.1.100" # Dummy value
sysutils/disk.py
(Example content):
# sysutils/disk.py
def get_disk_usage(path='/'):
# Placeholder implementation
print(f"Fetching disk usage for {path}...")
return {'total': 100, 'used': 40, 'free': 60} # Dummy values (GB)
sysutils/processes/__init__.py
: (Can be empty)
sysutils/processes/management.py
(Example content):
# sysutils/processes/management.py
def list_processes_by_user(user='root'):
# Placeholder implementation
print(f"Listing processes for user {user}...")
return [{'pid': 1, 'name': 'systemd'}, {'pid': 500, 'name': 'sshd'}] # Dummy list
Importing from Packages (main_runner.py):
# main_runner.py (located in sysadmin_project directory)
# Assuming sysadmin_project or its parent is effectively in sys.path
# --- Different ways to import ---
# 1. Import specific modules using dotted path
import sysutils.network
import sysutils.disk
import sysutils.processes.management as proc_mgmt # Import submodule with alias
print("\n--- Importing specific modules ---")
ip = sysutils.network.get_ip_address('eno1')
usage = sysutils.disk.get_disk_usage('/home')
procs = proc_mgmt.list_processes_by_user('student')
print(f"IP: {ip}")
print(f"Disk Usage: {usage}")
print(f"Processes: {procs}")
# 2. Import specific functions/classes from modules
from sysutils.network import get_ip_address
from sysutils.disk import get_disk_usage as disk_check # Alias specific function
from sysutils.processes.management import list_processes_by_user
print("\n--- Importing specific functions ---")
ip2 = get_ip_address('wlan0') # Direct call
usage2 = disk_check('/') # Call using alias
procs2 = list_processes_by_user() # Direct call (uses default 'root')
print(f"IP 2: {ip2}")
print(f"Disk Usage 2: {usage2}")
print(f"Processes 2: {procs2}")
# 3. Import functions exposed in package's __init__.py
import sysutils
print("\n--- Importing from package __init__ ---")
# These work because __init__.py imported them: from .network import get_ip_address
ip3 = sysutils.get_ip_address()
usage3 = sysutils.get_disk_usage()
print(f"IP 3: {ip3}")
print(f"Disk Usage 3: {usage3}")
# Accessing submodules still requires full path or separate import
# procs3 = sysutils.management.list_processes_by_user() # AttributeError unless __init__ imports management too
# Need: import sysutils.processes.management or from sysutils.processes import management
# 4. Using 'from package import *' (controlled by __all__ in __init__.py)
# from sysutils import *
# print(get_ip_address()) # Would work
# print(get_disk_usage()) # Would work
# print(list_processes_by_user()) # Would NOT work unless 'processes.management' or the function itself was in __all__
Relative Imports (within packages)
When code inside a package needs to import another module within the same package (or a sub-package), use relative imports starting with a single dot (.) for the current package or two dots (..) for the parent package.
- from . import sibling_module: Imports sibling_module from the same directory.
- from .sibling_module import member: Imports member from sibling_module in the same directory.
- from .. import parent_package_module: Imports parent_package_module from the parent directory (one level up).
- from ..parent_package_module import member: Imports member from a module in the parent package.
Example: If sysutils/disk.py needed to use something from sysutils/network.py:
# Inside sysutils/disk.py
# from sysutils import network # Avoid absolute import if possible within package
from . import network # Correct relative import
def check_network_before_disk_op():
ip = network.get_ip_address()
print(f"Network check OK (IP: {ip}), proceeding with disk operation...")
# ... disk operation logic ...
Relative imports make your package more self-contained and less dependent on how the project is structured externally or whether the top-level package is in PYTHONPATH. Use relative imports for intra-package imports.
The Standard Library
Python's power comes significantly from its extensive standard library – a vast collection of modules included with every Python installation, providing tools for a wide range of common tasks without needing pip install. You should always check if the standard library solves your problem before reaching for external packages.
Key Standard Library Areas:
- Text Processing: string, re (regular expressions)
- Data Types: datetime, collections (specialized dictionaries, lists, etc.), enum
- Numeric/Math: math, decimal, random, statistics
- File System: os, os.path, shutil (high-level file ops), pathlib (modern object-oriented paths), glob (filename pattern matching), tempfile
- Data Persistence/Formats: pickle (serializing Python objects), json, csv, sqlite3 (built-in database), configparser, xml parsers
- Operating System Services: subprocess, sys, argparse (command-line arguments), logging, getpass, platform
- Networking: socket, ssl, http (client, server), urllib (request, parse), ftplib, smtplib, email
- Concurrency: threading, multiprocessing, asyncio (asynchronous I/O)
- Testing: unittest, doctest
- Debugging/Profiling: pdb (debugger), profile, timeit
- And many more...
Explore the official Python Standard Library documentation to see the breadth of available tools.
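As a small, illustrative taste of the "batteries included" idea, the following sketch touches a few of the areas listed above using only standard library calls (no pip install needed); the output is of course system dependent:

import platform
import shutil
import datetime
import getpass

print(f"User:    {getpass.getuser()}")
print(f"System:  {platform.system()} {platform.release()}")
usage = shutil.disk_usage("/")  # named tuple: total, used, free (in bytes)
print(f"Root FS: {usage.free // (1024**3)} GiB free of {usage.total // (1024**3)} GiB")
print(f"Now:     {datetime.datetime.now().isoformat(timespec='seconds')}")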
Managing Dependencies with pip and requirements.txt
As soon as your project uses external packages installed via pip (within your virtual environment), you must track these dependencies to ensure your project is reproducible. The standard convention is the requirements.txt file.
Workflow:
- Activate your project's virtual environment. (source .venv/bin/activate)
- Install needed packages: pip install requests flask SQLAlchemy "pandas>=1.5"
- Generate (or update) requirements.txt:
  pip freeze > requirements.txt
  - pip freeze lists all packages installed in the current environment (including dependencies of your direct installs) in a format pip install -r can read.
  - > redirects this output into the requirements.txt file, overwriting it.
  - The file will contain lines like:
    certifi==2023.7.22
    charset-normalizer==3.3.2
    click==8.1.7
    Flask==3.0.0
    idna==3.6
    itsdangerous==2.1.2
    Jinja2==3.1.2
    MarkupSafe==2.1.3
    numpy==1.26.2
    pandas==2.1.4
    python-dateutil==2.8.2
    pytz==2023.3.post1
    requests==2.31.0
    six==1.16.0
    SQLAlchemy==2.0.23
    tzdata==2023.3
    urllib3==2.1.0
    Werkzeug==3.0.1
    (Exact versions and dependencies may vary.) Notice the == pinning exact versions.
- Add requirements.txt to version control (e.g., git add requirements.txt && git commit ...).
- Exclude your virtual environment directory (e.g., add .venv/ to .gitignore).
- Install dependencies on another machine (or clean environment):
  - Clone the repository.
  - Create a new virtual environment: python3 -m venv .venv
  - Activate it: source .venv/bin/activate
  - Install all dependencies from the file: pip install -r requirements.txt
  - pip will read the file and install the exact versions specified, recreating the necessary environment.
This process ensures that anyone working on the project uses the same set of dependencies, preventing "works on my machine" problems. Regularly update your requirements.txt file (pip freeze > requirements.txt) after installing or updating packages.
Workshop Modules and Packages
This workshop involves creating a small multi-file project that simulates checking different aspects of a Linux system (dummy checks for simplicity). We'll organize the checking logic into separate modules within a package and use a main script to import and run the checks.
Goal: Practice creating a Python package with multiple modules, using relative imports within the package, using absolute imports in a main script to access package contents, and managing dependencies (even if minimal) with requirements.txt
.
Project: Simple System Status Checker
Desired Structure:
system_checker_project/
├── .venv/ # Virtual environment (created by you)
├── main_checker.py # Main script to run checks
├── requirements.txt # Dependency file
└── checker_pkg/ # Our package ('checker_pkg')
├── __init__.py # Makes 'checker_pkg' a package
├── cpu_checker.py # Module for CPU checks
├── mem_checker.py # Module for Memory checks
└── network_checker.py # Module for Network checks
Steps:
-
Set Up Project Directory and Environment:
- Open your terminal.
- Create the main project directory:
mkdir system_checker_project
- Navigate into it:
cd system_checker_project
- Create and activate a virtual environment: python3 -m venv .venv, then source .venv/bin/activate
- (Optional) Install a simple external library just to have something in requirements.txt, e.g., pip install psutil (though we won't use its full potential here, it's relevant). If you don't want external dependencies, skip the install, and requirements.txt will be minimal.
-
Create Package Structure and Files:
- Create the package directory: mkdir checker_pkg
- Create the __init__.py file to mark it as a package: touch checker_pkg/__init__.py
- Create the module files within the package: touch checker_pkg/cpu_checker.py checker_pkg/mem_checker.py checker_pkg/network_checker.py
- Create the main script outside the package: touch main_checker.py
-
Populate
checker_pkg/__init__.py
:- Open
checker_pkg/__init__.py
(nano checker_pkg/__init__.py
). - Add code to make functions from modules easily accessible and define
__all__
.
# File: checker_pkg/__init__.py
"""
System Checker Package Initialization.
Exposes core checking functions directly.
"""
print("Initializing checker_pkg...")

# Use relative imports to bring functions into the package's namespace
from .cpu_checker import check_cpu_load
from .mem_checker import check_memory_usage
from .network_checker import check_network_latency

# Define what 'from checker_pkg import *' will import
__all__ = [
    'check_cpu_load',
    'check_memory_usage',
    'check_network_latency'
]
- Save and exit.
-
Populate
checker_pkg/cpu_checker.py
:- Open
checker_pkg/cpu_checker.py
(nano checker_pkg/cpu_checker.py
). - Add a simple function (we simulate the check).
# File: checker_pkg/cpu_checker.py
"""Module for checking CPU status (simulated)."""
import random
import time

def check_cpu_load(threshold=75.0):
    """
    Simulates checking CPU load against a threshold.

    Args:
        threshold (float): Load percentage threshold.

    Returns:
        tuple: (status_ok: bool, current_load: float)
    """
    print("  Checking CPU Load...")
    time.sleep(0.2)  # Simulate work
    current_load = round(random.uniform(5.0, 95.0), 1)  # Simulate load %
    status_ok = current_load < threshold
    print(f"  [CPU Check] Current Load: {current_load}%, Threshold: {threshold}% -> OK: {status_ok}")
    return status_ok, current_load

# Internal helper function (convention: starts with _)
def _get_cpu_temp():
    return random.randint(40, 80)
- Save and exit.
-
Populate
checker_pkg/mem_checker.py
:- Open
checker_pkg/mem_checker.py
(nano checker_pkg/mem_checker.py
). - Add a simple function.
# File: checker_pkg/mem_checker.py
"""Module for checking Memory status (simulated)."""
import random
import time

# Example of relative import within the package IF needed:
# from .cpu_checker import _get_cpu_temp  # Accessing internal helper (usually not done)

def check_memory_usage(threshold_percent=85.0):
    """
    Simulates checking memory usage percentage against a threshold.

    Returns:
        tuple: (status_ok: bool, usage_percent: float)
    """
    print("  Checking Memory Usage...")
    time.sleep(0.3)  # Simulate work
    usage_percent = round(random.uniform(20.0, 98.0), 1)
    status_ok = usage_percent < threshold_percent
    # temp = _get_cpu_temp()  # Example: Call internal function from sibling module
    print(f"  [Memory Check] Usage: {usage_percent}%, Threshold: {threshold_percent}% -> OK: {status_ok}")
    return status_ok, usage_percent
- Save and exit.
-
Populate
checker_pkg/network_checker.py
:- Open
checker_pkg/network_checker.py
(nano checker_pkg/network_checker.py
). - Add a simple function.
# File: checker_pkg/network_checker.py
"""Module for checking Network status (simulated)."""
import random
import time

def check_network_latency(target_host="8.8.8.8", max_latency_ms=100):
    """
    Simulates checking network latency to a target.

    Returns:
        tuple: (status_ok: bool, latency_ms: int)
    """
    print(f"  Checking Network Latency to {target_host}...")
    time.sleep(0.5)  # Simulate ping
    latency_ms = random.randint(5, 250)
    status_ok = latency_ms < max_latency_ms
    print(f"  [Network Check] Latency: {latency_ms}ms, Max Allowed: {max_latency_ms}ms -> OK: {status_ok}")
    return status_ok, latency_ms
- Save and exit.
-
Populate the Main Script (
main_checker.py
):- Open
main_checker.py
(nano main_checker.py
). - Add code to import and use the functions from the package.
# File: main_checker.py
"""
Main script to run system checks using the checker_pkg package.
"""
# Import the entire package
import checker_pkg
import os  # Example using another module alongside the package

# Alternatively, import specific functions made available by __init__.py:
# from checker_pkg import check_cpu_load, check_memory_usage, check_network_latency

print("===== Starting System Checks =====")
all_ok = True  # Flag to track overall status

# --- Run Checks using functions from the package ---
print("\nRunning CPU Check:")
cpu_ok, cpu_load = checker_pkg.check_cpu_load(threshold=90.0)  # Call function via package
if not cpu_ok:
    all_ok = False
    print("  -> CPU Warning: Load exceeds threshold!")

print("\nRunning Memory Check:")
mem_ok, mem_usage = checker_pkg.check_memory_usage()  # Uses default threshold
if not mem_ok:
    all_ok = False
    print("  -> Memory Warning: Usage exceeds threshold!")

print("\nRunning Network Check:")
net_ok, net_latency = checker_pkg.check_network_latency(max_latency_ms=150)
if not net_ok:
    all_ok = False
    print("  -> Network Warning: Latency exceeds threshold!")

# --- Final Report ---
print("\n===== Check Summary =====")
if all_ok:
    print("✅ All system checks passed.")
else:
    print("⚠️ Some system checks reported warnings.")
print(f"Checks run under user: {os.getlogin() if hasattr(os, 'getlogin') else 'N/A'}")  # Example using 'os' module
print("========================")
- Save and exit.
-
Generate
requirements.txt
:- Make sure your virtual environment is active.
- Run the pip freeze command: pip freeze > requirements.txt
- Examine requirements.txt. It should contain psutil (if you installed it) and its dependencies, or be nearly empty if you didn't install anything extra.
-
Run the Main Checker Script:
- Execute the script: python main_checker.py
- Observe:
- You should first see the "Initializing checker_pkg..." message (printed from
checker_pkg/__init__.py
when it's first imported). - The script will then print messages as it executes each (simulated) check from the different modules (
cpu_checker
,mem_checker
,network_checker
). - The output for each check will show the simulated value and whether it passed the threshold.
- A final summary indicates if all checks passed or if warnings occurred.
- Since the checks use
random
, the results will vary each time you run it. Run it multiple times to see different outcomes.
- You should first see the "Initializing checker_pkg..." message (printed from
Code Explanation Recap:
- Package Structure: We created a directory
checker_pkg/
containing modules and an__init__.py
file, defining it as a Python package. - Modules: Each
_checker.py
file acts as a module, encapsulating functions related to a specific check (CPU, Memory, Network). __init__.py
: This file controlled the package's initialization (printing a message) and used relative imports (from .cpu_checker import ...
) to make the main checking functions directly available under thechecker_pkg
namespace. It also defined__all__
.main_checker.py
: This script, outside the package, used an absolute import (import checker_pkg
) to access the package. It then called the checking functions usingchecker_pkg.function_name(...)
.requirements.txt
: Captured the project's dependencies (even if justpsutil
or minimal base packages) for reproducibility.
Conclusion: You have successfully created a Python package (checker_pkg
) containing multiple modules, each responsible for a specific task. You used __init__.py
to structure the package and control imports. The main script demonstrated how to import and use the functionality provided by your package. This modular approach is crucial for organizing larger Python projects, making them easier to understand, maintain, and extend.
7. File Input Output Working with Files on Linux
Interacting with files is a fundamental task in almost any programming language, and it's especially pertinent in Linux where the "everything is a file" philosophy often applies (including devices and system information via pseudo-filesystems like /proc
). Python provides robust, built-in mechanisms for reading from and writing to files.
Opening Files (open())
The primary function for working with files is the built-in open(). It takes a file path and a mode (and optionally encoding) and returns a file object (also called a file handle), which provides methods for interaction (read, write, close, etc.).
Syntax:
file_object = open(file_path, mode='r', encoding=None)
file_path
(str): The path to the file. This can be:- Absolute: Starting from the root directory (e.g.,
/home/user/data.txt
,/var/log/syslog
). - Relative: Relative to the current working directory where the script is run (e.g.,
my_data.txt
,../config/settings.ini
). Linux uses/
as the path separator. Theos.path
orpathlib
modules can help construct paths reliably.
- Absolute: Starting from the root directory (e.g.,
mode
(str, optional): A string indicating how the file should be opened. Defaults to'r'
(read text). Common modes:- Read Modes:
'r'
: Read Text (Default). Opens for reading text. Fails (FileNotFoundError
) if the file doesn't exist. Decodes bytes usingencoding
.'rb'
: Read Binary. Opens for reading binary data (bytes). Fails if file doesn't exist. Use for images, executables, raw data.
- Write Modes (Caution: Overwrites existing files!):
'w'
: Write Text. Opens for writing text. Truncates (empties) the file if it exists. Creates the file if it doesn't exist. Encodes strings usingencoding
.'wb'
: Write Binary. Opens for writing binary data (bytes). Truncates if exists, creates if not.
- Append Modes:
'a'
: Append Text. Opens for writing text. Appends to the end of the file if it exists. Creates if not. Doesn't truncate. Cursor starts at the end.'ab'
: Append Binary. Opens for writing binary data (bytes). Appends if exists, creates if not.
- Exclusive Creation Modes:
'x'
: Exclusive Text Creation. Creates a new text file and opens for writing. Fails (FileExistsError
) if the file already exists.'xb'
: Exclusive Binary Creation. Creates a new binary file for writing. Fails if file already exists.
- Update Modes (
+
): Append+
to other modes ('r+'
,'w+'
,'a+'
,'rb+'
,'wb+'
,'ab+'
) to allow both reading and writing on the same file object.'r+'
: Read/Write Text. File must exist. Initial position at start.'w+'
: Write/Read Text. Truncates file. Creates if not exists.'a+'
: Append/Read Text. Appends writes to end. Reading position starts at beginning. Creates if not exists.- Binary versions (
rb+
,wb+
,ab+
) work similarly with bytes.
- Read Modes:
encoding
(str, optional): Specifies the encoding to use for decoding/encoding data in text modes (r
,w
,a
,x
,r+
, etc.).- Crucial for portability and correctness. Different systems might have different default encodings.
'utf-8'
: The most common and recommended encoding. Handles a vast range of characters and is standard on Linux and the web. Explicitly specifyencoding='utf-8'
whenever working with text files unless you know you need something different.- Other common ones:
'ascii'
,'latin-1'
(ISO-8859-1). - If
None
(default), Python uses the system's default locale encoding (locale.getpreferredencoding(False)
), which can lead to unexpected behavior or errors if the file was created with a different encoding.
The with Statement (Context Manager) - Recommended Practice
It is essential to close files after you are finished working with them. Closing ensures that:
- Any buffered data in memory is written (flushed) to the actual file on disk.
- The file descriptor (a limited system resource) associated with the open file is released back to the operating system.
While you can manually call file_object.close(), this is error-prone. If an exception occurs before close() is called, the file might remain open.
The with statement provides a clean and robust way to work with resources that need cleanup (like files). It guarantees that the resource's close() method (or equivalent cleanup) is called automatically, even if errors occur within the with block.
Syntax:
try:
with open('my_file.txt', 'w', encoding='utf-8') as file_handle:
# 'file_handle' is the file object returned by open()
# It is only available *inside* this 'with' block
file_handle.write("This line will be written.\n")
# Perform other operations on file_handle...
# result = 10 / 0 # If an error happens here...
# ...the 'with' statement ensures file_handle.close() is called *before*
# the error propagates out or the block finishes normally.
# 'file_handle' is no longer accessible here (usually) and the file is closed.
print("File writing complete (file automatically closed).")
except FileNotFoundError:
print("Error: Could not find the file for reading (if mode was 'r').")
except PermissionError:
print("Error: Permission denied to access the file.")
except IOError as e: # Catch other general I/O errors
print(f"An I/O error occurred: {e}")
except Exception as e:
print(f"An unexpected error occurred: {e}")
Always prefer the with open(...) syntax when working with files in Python.
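Putting the mode, encoding, and with ideas together, here is a small sketch using exclusive creation ('x') so an existing file is never overwritten; the flag filename is just an example:

try:
    with open("first_run.flag", "x", encoding="utf-8") as flag_file:
        flag_file.write("initialised\n")
    print("Flag file created for the first time.")
except FileExistsError:
    print("Flag file already exists -- leaving it untouched.")
except OSError as e:
    print(f"Could not create flag file: {e}")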
Reading from Files
Once a file is opened in a read mode (e.g., 'r', 'rb', 'r+'), you can use these common methods on the file object:
read(size=-1)
: Reads and returns up tosize
bytes (in binary mode) or characters (in text mode). Ifsize
is negative or omitted, reads and returns the entire content of the file. Warning: Avoid reading huge files entirely into memory withread()
as it can cause performance issues or crashes.readline()
: Reads and returns a single line from the file, up to and including the newline character (\n
). Returns an empty string (''
) when the end of the file (EOF) is reached.readlines()
: Reads all remaining lines from the file and returns them as a list of strings, where each string represents a line and includes the trailing newline character (\n
). Can consume a lot of memory for large files.
Iterating over the File Object (Most Pythonic and Memory-Efficient):
The best way to read a text file line by line is to iterate directly over the file object within a with
block. Python handles buffering efficiently.
filepath = '/etc/hostname' # Example readable file
lines_read = 0
try:
with open(filepath, 'r', encoding='utf-8') as f:
print(f"Reading contents of '{filepath}' line by line:")
for line in f: # Iterates line by line efficiently
lines_read += 1
# Process each line (strip() removes leading/trailing whitespace, including '\n')
print(f" Line {lines_read}: {line.strip()}")
except (FileNotFoundError, PermissionError, IOError) as e:
print(f"Error reading {filepath}: {e}")
# --- Other reading methods ---
try:
with open(filepath, 'r', encoding='utf-8') as f:
# Read first line
first_line = f.readline()
print(f"\nFirst line using readline(): {first_line.strip()}")
# Read the rest of the content
remaining_content = f.read()
print(f"Remaining content using read():\n---\n{remaining_content.strip()}\n---")
# Read all lines into a list (use with caution on large files)
with open(filepath, 'r', encoding='utf-8') as f:
all_lines_list = f.readlines()
print(f"\nAll lines read with readlines(): {all_lines_list}")
# Note: lines in the list still contain '\n'
except (FileNotFoundError, PermissionError, IOError) as e:
print(f"Error reading {filepath} again: {e}")
Writing to Files
When a file is opened in a write or append mode (e.g., 'w'
, 'wb'
, 'a'
), use these methods:
write(string_or_bytes)
: Writes the given string (in text mode) orbytes
object (in binary mode) to the file. Returns the number of characters/bytes written. Remember that strings in text mode will be encoded using the specifiedencoding
. You must include newline characters (\n
) explicitly in your strings if you want line breaks in the file.writelines(list_of_strings_or_bytes)
: Writes each item from the iterable (e.g., a list) to the file sequentially. Important: Likewrite()
,writelines()
does not add newline characters automatically between items. You need to ensure the strings in your list already include them if desired.
output_filename = "mylog.txt"
lines_to_log = [
"INFO: Application started.\n", # Include newline
"DEBUG: Configuration loaded.\n",
"WARNING: Disk space low.\n"
]
try:
# Using 'w' mode - will overwrite if file exists
with open(output_filename, 'w', encoding='utf-8') as outfile:
outfile.write("Log File Header\n")
outfile.write("="*20 + "\n") # Write separator
# Write lines from the list
outfile.writelines(lines_to_log)
# Write another line individually
import datetime
timestamp = datetime.datetime.now().isoformat()
outfile.write(f"ERROR: Critical failure at {timestamp}\n")
print(f"Log data written to '{output_filename}' (overwrite mode).")
# Using 'a' mode - append to the file
with open(output_filename, 'a', encoding='utf-8') as outfile:
outfile.write("INFO: Application shutting down.\n")
print(f"Appended shutdown message to '{output_filename}'.")
except (PermissionError, IOError) as e:
print(f"Error writing to {output_filename}: {e}")
Working with File Paths (os.path and pathlib)
Python provides modules to help manipulate file paths in a way that's compatible across different operating systems (though we focus on Linux here).
os.path
(Traditional): Provides functions for path manipulation.pathlib
(Modern, Recommended): Introduced in Python 3.4, offers an object-oriented approach to file paths. It's generally considered more intuitive and powerful.
import os.path
from pathlib import Path # Import the Path class
# --- Using os.path ---
print("\n--- Using os.path ---")
path_str = "/home/user/documents/report.txt"
print(f"Path String: {path_str}")
print(f"Basename: {os.path.basename(path_str)}") # report.txt
print(f"Dirname: {os.path.dirname(path_str)}") # /home/user/documents
print(f"Exists? {os.path.exists(path_str)}") # False (likely)
print(f"Is File? {os.path.isfile(path_str)}")
print(f"Is Dir? {os.path.isdir(os.path.dirname(path_str))}") # Check if dir part exists
# Join path components safely
new_path_str = os.path.join(os.path.dirname(path_str), "images", "logo.png")
print(f"Joined Path: {new_path_str}") # /home/user/documents/images/logo.png
# Get absolute path
print(f"Absolute path of '.': {os.path.abspath('.')}")
# Get file size
try:
size = os.path.getsize("/etc/hosts")
print(f"Size of /etc/hosts: {size} bytes")
except OSError as e:
print(f"Could not get size of /etc/hosts: {e}")
# --- Using pathlib (Recommended) ---
print("\n--- Using pathlib ---")
# Create a Path object
p = Path("/home/user/documents/report.txt")
print(f"Path Object: {p}")
print(f"Name (basename): {p.name}") # report.txt
print(f"Parent (dirname): {p.parent}") # /home/user/documents
print(f"Exists? {p.exists()}") # False (likely)
print(f"Is File? {p.is_file()}")
print(f"Is Dir? {p.parent.is_dir()}")
# Join path components using the / operator
new_p = p.parent / "images" / "logo.png"
print(f"Joined Path Object: {new_p}") # /home/user/documents/images/logo.png
# Get absolute path
print(f"Absolute path of '.': {Path('.').resolve()}") # .resolve() makes it absolute
# Get file size
try:
hosts_path = Path("/etc/hosts")
if hosts_path.is_file():
size = hosts_path.stat().st_size
print(f"Size of {hosts_path}: {size} bytes")
except OSError as e:
print(f"Could not get stat of {hosts_path}: {e}")
# Other useful pathlib methods:
# p.suffix -> '.txt'
# p.stem -> 'report'
# p.home() -> Path object for user's home directory
# list(Path('/etc').glob('*.conf')) -> Find all .conf files in /etc
# new_p.parent.mkdir(parents=True, exist_ok=True) # Create directories
pathlib often leads to cleaner and more readable code for file system interactions.
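Two pathlib conveniences not shown above, Path.write_text() and Path.read_text(), cover the common "small text file" case in one call each; the filename below is purely illustrative:

from pathlib import Path

note = Path("notes.txt")                          # relative to the current directory
note.write_text("Remember to rotate the logs.\n", encoding="utf-8")  # creates or overwrites
print(note.read_text(encoding="utf-8").strip())   # -> Remember to rotate the logs.
note.unlink()                                     # remove the demo file again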
Working with JSON and CSV Files
Python's standard library excels at handling common structured data formats.
JSON (json
module): JavaScript Object Notation. Human-readable text format for data interchange, widely used in APIs and configuration. Maps closely to Python dictionaries and lists.
json.dump(python_obj, file_handle, indent=None)
: Writes (serializes) a Python object (dict, list, etc.) to a file handle in JSON format.indent
(e.g.,4
) makes it pretty-printed.json.dumps(python_obj, indent=None)
: Serializes a Python object to a JSON formatted string.json.load(file_handle)
: Reads (deserializes) JSON data from a file handle into a Python object.json.loads(json_string)
: Deserializes a JSON formatted string into a Python object.
import json
# Python data
system_config = {
"hostname": "matrix-server",
"ip_address": "10.0.1.5",
"services": ["nginx", "postgresql", "redis"],
"monitoring": {
"enabled": True,
"interval_seconds": 300
}
}
# Write to JSON file
json_filepath = "config.json"
try:
with open(json_filepath, 'w', encoding='utf-8') as f_json:
json.dump(system_config, f_json, indent=4) # Pretty print with 4 spaces
print(f"Data written to {json_filepath}")
except IOError as e:
print(f"Error writing JSON: {e}")
# Read from JSON file
try:
with open(json_filepath, 'r', encoding='utf-8') as f_json:
loaded_config = json.load(f_json)
print("\nData loaded from JSON:")
print(loaded_config)
print(f"Monitoring enabled? {loaded_config['monitoring']['enabled']}")
except (FileNotFoundError, json.JSONDecodeError, IOError) as e:
print(f"Error reading or parsing JSON: {e}")
CSV (csv
module): Comma-Separated Values. Common format for tabular data.
csv.writer(file_handle)
: Creates a writer object to write rows to a CSV file.writer.writerow(list_or_tuple)
: Writes a single row.writer.writerows(list_of_lists)
: Writes multiple rows.
csv.reader(file_handle)
: Creates a reader object to iterate over rows in a CSV file. Each row is returned as a list of strings.csv.DictWriter(file_handle, fieldnames=list_of_headers)
: Writes rows from dictionaries. Requires header names (fieldnames
).csv.DictReader(file_handle)
: Reads rows into dictionaries, using the first row as keys (headers).
import csv
# Data for CSV
user_data = [
['UserID', 'Username', 'Department', 'LastLogin'],
[101, 'alice', 'Engineering', '2023-10-26T10:00:00Z'],
[102, 'bob', 'Marketing', '2023-10-25T15:30:00Z'],
[103, 'charlie', 'Engineering', '2023-10-26T11:15:00Z']
]
csv_filepath = "users.csv"
# Write to CSV file
try:
# Use newline='' to prevent extra blank rows on some systems
with open(csv_filepath, 'w', newline='', encoding='utf-8') as f_csv:
writer = csv.writer(f_csv)
writer.writerows(user_data) # Write all rows
print(f"\nData written to {csv_filepath}")
except IOError as e:
print(f"Error writing CSV: {e}")
# Read from CSV file using reader
try:
with open(csv_filepath, 'r', newline='', encoding='utf-8') as f_csv:
reader = csv.reader(f_csv)
header = next(reader) # Read the header row separately
print(f"\nCSV Header: {header}")
print("CSV Data:")
for row in reader: # Iterate over remaining rows
# row is a list of strings, e.g., ['101', 'alice', 'Engineering', ...]
print(f" - User: {row[1]}, Dept: {row[2]}")
except (FileNotFoundError, StopIteration, IOError) as e: # StopIteration if file is empty
print(f"Error reading CSV: {e}")
# Read from CSV using DictReader
try:
with open(csv_filepath, 'r', newline='', encoding='utf-8') as f_csv:
reader = csv.DictReader(f_csv) # Uses first row as keys
print("\nCSV Data (as Dicts):")
for row_dict in reader:
# row_dict is a dict, e.g., {'UserID': '101', 'Username': 'alice', ...}
print(f" - ID: {row_dict['UserID']}, User: {row_dict['Username']}, Login: {row_dict['LastLogin']}")
except (FileNotFoundError, IOError) as e:
print(f"Error reading CSV with DictReader: {e}")
File I/O is essential for saving program state, logging, reading configuration, processing data, and interacting with the Linux environment. Master the with open(...)
pattern and choose the right mode and encoding for your needs.
Workshop File Input Output
This workshop involves reading data from specific Linux system files found in the /proc
filesystem (a pseudo-filesystem providing kernel/process information), parsing relevant information, and writing a summary report to both a formatted text file and a JSON file.
Goal: Practice reading system-specific files, parsing text data using string methods, handling potential file access errors, and writing structured output to different file formats (text, JSON) using file I/O operations and relevant modules.
Project: Linux System Information Reporter
Target Files: /proc/cpuinfo
(CPU details) and /proc/meminfo
(Memory details). Note: Accessing /proc
is specific to Linux and similar Unix-like systems.
Steps:
-
Set Up:
- Navigate to your project directory.
- Ensure your virtual environment is active.
- Create a new Python file named
sys_reporter.py
(nano sys_reporter.py
).
-
Write the Code: Enter the following Python code. This script defines functions to parse the
/proc
files and functions to write the output.
# File: sys_reporter.py
"""
Reads system information from /proc/cpuinfo and /proc/meminfo,
parses key details, and writes a summary report to text and JSON files.
"""
import json
from datetime import datetime  # To timestamp the report
import os  # For basic path checks if needed, though /proc usually exists

# --- Constants ---
PROC_CPUINFO_PATH = '/proc/cpuinfo'
PROC_MEMINFO_PATH = '/proc/meminfo'
REPORT_TEXT_FILE = 'system_report.txt'
REPORT_JSON_FILE = 'system_report.json'

# --- Parsing Functions ---
def parse_cpu_info():
    """
    Parses /proc/cpuinfo to extract CPU model name and count of logical cores.

    Returns:
        dict: {'model_name': str, 'core_count': int} or default values on error.
    """
    cpu_data = {'model_name': 'N/A', 'core_count': 0}
    try:
        processors = set()  # Use set to count unique processor entries (logical cores)
        with open(PROC_CPUINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                if line.startswith('model name'):
                    # Take the first model name found
                    if cpu_data['model_name'] == 'N/A':
                        try:
                            cpu_data['model_name'] = line.split(':', 1)[1].strip()
                        except IndexError:
                            pass  # Ignore malformed lines
                elif line.startswith('processor'):
                    try:
                        processors.add(line.split(':', 1)[1].strip())
                    except IndexError:
                        pass  # Ignore malformed lines
        cpu_data['core_count'] = len(processors) if processors else 1  # Assume 1 if parse fails
    except FileNotFoundError:
        print(f"Error: Cannot find {PROC_CPUINFO_PATH}. CPU info unavailable.")
    except PermissionError:
        print(f"Error: Permission denied reading {PROC_CPUINFO_PATH}.")
    except Exception as e:
        print(f"Warning: Unexpected error parsing {PROC_CPUINFO_PATH}: {e}")
    return cpu_data

def parse_memory_info():
    """
    Parses /proc/meminfo for Total and Available memory (returns KiB).

    Returns:
        dict: {'mem_total_kib': int or 'N/A', 'mem_available_kib': int or 'N/A'}
    """
    mem_data = {'mem_total_kib': 'N/A', 'mem_available_kib': 'N/A'}
    keys_map = {'MemTotal': 'mem_total_kib', 'MemAvailable': 'mem_available_kib'}
    keys_to_find = set(keys_map.keys())
    keys_found = set()
    try:
        with open(PROC_MEMINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                parts = line.split(':', 1)
                if len(parts) == 2:
                    key = parts[0]
                    if key in keys_map:
                        try:
                            # Value is like '16384 kB', split and take number
                            value_kib = int(parts[1].strip().split()[0])
                            mem_data[keys_map[key]] = value_kib
                            keys_found.add(key)
                        except (ValueError, IndexError):
                            print(f"Warning: Could not parse value for {key} in {PROC_MEMINFO_PATH}")
                # Optimization: Stop if we found all needed keys
                if keys_found == keys_to_find:
                    break
        # Handle case where MemAvailable might not be present (try MemFree)
        if 'MemAvailable' not in keys_found:
            print("Warning: 'MemAvailable' not found. Checking 'MemFree' (less accurate for available).")
            with open(PROC_MEMINFO_PATH, 'r', encoding='utf-8') as f:  # Re-open needed
                for line in f:
                    if line.strip().startswith('MemFree'):
                        try:
                            mem_data['mem_available_kib'] = int(line.split(':')[1].strip().split()[0])
                            print("Using MemFree value.")
                            break
                        except (ValueError, IndexError):
                            pass
    except FileNotFoundError:
        print(f"Error: Cannot find {PROC_MEMINFO_PATH}. Memory info unavailable.")
    except PermissionError:
        print(f"Error: Permission denied reading {PROC_MEMINFO_PATH}.")
    except Exception as e:
        print(f"Warning: Unexpected error parsing {PROC_MEMINFO_PATH}: {e}")
    return mem_data

# --- Output Writing Functions ---
def write_text_report(report_data):
    """Writes the collected system information to a formatted text file."""
    try:
        with open(REPORT_TEXT_FILE, 'w', encoding='utf-8') as f:
            f.write("--- Linux System Report ---\n")
            f.write(f"Generated: {report_data['timestamp']}\n")
            f.write("="*30 + "\n\n")
            f.write("[CPU Information]\n")
            f.write(f"  Model Name: {report_data['cpu']['model_name']}\n")
            f.write(f"  Core Count: {report_data['cpu']['core_count']}\n\n")
            f.write("[Memory Information]\n")
            total_kib = report_data['memory']['mem_total_kib']
            avail_kib = report_data['memory']['mem_available_kib']
            f.write(f"  Total Memory: {total_kib} KiB\n")
            f.write(f"  Available Memory: {avail_kib} KiB\n")
            # Add MiB/GiB conversion for readability
            try:
                if isinstance(total_kib, int):
                    total_gib = round(total_kib / (1024**2), 2)
                    f.write(f"  Total Memory (GiB): ~{total_gib} GiB\n")
                if isinstance(avail_kib, int):
                    avail_gib = round(avail_kib / (1024**2), 2)
                    f.write(f"  Available Memory (GiB): ~{avail_gib} GiB\n")
            except Exception:
                pass  # Ignore conversion errors
            f.write("\n" + "="*30 + "\n")
        print(f"Text report successfully written to {REPORT_TEXT_FILE}")
    except (PermissionError, IOError) as e:
        print(f"Error writing text report file {REPORT_TEXT_FILE}: {e}")
    except Exception as e:
        print(f"Unexpected error generating text report: {e}")

def write_json_report(report_data):
    """Writes the collected system information to a JSON file."""
    try:
        with open(REPORT_JSON_FILE, 'w', encoding='utf-8') as f:
            # Use indent for pretty printing
            json.dump(report_data, f, indent=4)
        print(f"JSON report successfully written to {REPORT_JSON_FILE}")
    except (PermissionError, IOError) as e:
        print(f"Error writing JSON report file {REPORT_JSON_FILE}: {e}")
    except TypeError as e:  # Handle data that cannot be serialized
        print(f"Error: Data cannot be serialized to JSON: {e}")
    except Exception as e:
        print(f"Unexpected error generating JSON report: {e}")

# --- Main Execution ---
if __name__ == "__main__":
    print("Gathering system information from /proc...")
    # Gather data using parsing functions
    cpu_details = parse_cpu_info()
    memory_details = parse_memory_info()
    # Structure the final report data
    report = {
        "timestamp": datetime.now().isoformat(),
        "cpu": cpu_details,
        "memory": memory_details
        # Future: Could add hostname, uptime, disk info etc. here
    }
    print("Information gathered. Writing reports...")
    # Write reports to files
    write_text_report(report)
    write_json_report(report)
    print("\nScript finished.")
-
Save and Exit: Save the
sys_reporter.py
file. -
Run the Script:
- Execute the script from your terminal (ensure virtual env is active): python sys_reporter.py
- Observe:
- The script will print status messages as it parses the files and writes the reports.
- You might see warnings if parsing fails for some lines, but the script should attempt to continue. Critical errors like
FileNotFoundError
orPermissionError
will be printed as errors. - Check your project directory. Two new files should appear:
system_report.txt
andsystem_report.json
.
-
Examine the Output Files:
- Text Report (
system_report.txt
): Open this file using a text editor orcat
/less
. Verify that it contains a timestamp, CPU model, core count, and total/available memory information in a human-readable format. Check if the GiB conversions look reasonable. - JSON Report (
system_report.json
): Examine this file. Verify that it contains the same information structured as a nested JSON object, including the timestamp. This format is ideal for machine processing or interacting with APIs.
- Text Report (
Code Explanation Recap:
- Parsing Functions (
parse_cpu_info
,parse_memory_info
):- Opened specific
/proc
files usingwith open(...)
. - Iterated line by line.
- Used string methods (
.strip()
,.startswith()
,.split()
) to extract relevant data. - Included
try...except
blocks to handle potentialFileNotFoundError
,PermissionError
,IndexError
(from.split()
),ValueError
(fromint()
), and genericException
during file reading and parsing, making the functions more robust. - Returned dictionaries containing the extracted data or default 'N/A' values on error.
- Opened specific
- Writing Functions (
write_text_report
,write_json_report
):- Took the combined report dictionary as input.
- Opened output files using
with open(...)
in write mode ('w'
) with UTF-8 encoding. write_text_report
used f-strings and.write()
to create a formatted text layout.write_json_report
usedjson.dump(..., indent=4)
to serialize the Python dictionary into a pretty-printed JSON file.- Included
try...except
blocks to handleIOError
,PermissionError
, and potentialTypeError
(for JSON) during file writing.
- Main Block: Orchestrated the process: called parsing functions, structured the results into a final
report
dictionary (including a timestamp), and called the writing functions to generate the output files.
Conclusion: You have built a practical Linux utility that reads and parses data directly from the /proc
filesystem – a common technique for system introspection. You reinforced core file I/O skills (with open
, reading lines, writing text), practiced string manipulation for data extraction, implemented error handling for file operations and parsing, and generated structured output in both human-readable text and machine-readable JSON formats.
8. Error Handling Exceptions
Errors and unexpected situations are a normal part of software development. A program might encounter issues like trying to open a non-existent file, dividing by zero, receiving invalid user input, or losing a network connection. Python uses an exception handling mechanism to manage these events gracefully, preventing the program from crashing and allowing you to control how errors are dealt with.
Understanding Exceptions
When an error occurs during the execution of Python code, Python creates an exception object. This object contains information about the error (type, location, message). If this exception isn't "caught" and handled by your code, it propagates up the call stack. If it reaches the top level without being handled, the program terminates and prints a traceback, showing where the error occurred.
Common Built-in Exceptions:
Python has many built-in exception types, each representing a different kind of error. Some you'll frequently encounter include:
SyntaxError
: Raised by the parser when it finds code that violates Python's grammar rules (e.g., missing colon, invalid keyword). This usually prevents the script from running at all.IndentationError
/TabError
: Subclasses ofSyntaxError
, raised for incorrect indentation or mixing tabs and spaces.NameError
: Trying to use a variable or function name that hasn't been defined in the current scope.TypeError
: Applying an operation or function to an object of an inappropriate type (e.g.,len(123)
or"hello" + 5
).ValueError
: An operation receives an argument of the correct type but an inappropriate value (e.g.,int('abc')
).IndexError
: Trying to access an element in a sequence (list, tuple) using an index that is outside the valid range.KeyError
: Trying to access a dictionary key that does not exist.FileNotFoundError
: Trying to open a file in read mode ('r'
) that doesn't exist. Subclass ofOSError
.PermissionError
: Trying to perform an operation (like reading/writing a file) without sufficient operating system permissions. Subclass ofOSError
.ZeroDivisionError
: Attempting to divide a number by zero.AttributeError
: Trying to access an attribute or method that an object doesn't possess.ImportError
/ModuleNotFoundError
: Python cannot find the module you're trying to import.ModuleNotFoundError
is a subclass ofImportError
.KeyboardInterrupt
: Raised when the user hits the interrupt key (usuallyCtrl+C
) during execution.MemoryError
: The program runs out of available memory.RecursionError
: Exceeding the maximum recursion depth (often due to infinite recursion in function calls).OSError
: Base class for various OS-related errors (includingFileNotFoundError
,PermissionError
,ConnectionError
, etc.).
Handling Exceptions The try...except Block
To prevent exceptions from crashing your program, you use the try...except block.
Syntax:
try:
# --- Code that might raise an exception ---
# This is the "guarded" block.
risky_operation()
print("Operation in 'try' block succeeded.")
except SpecificExceptionType1:
# --- Handler for SpecificExceptionType1 ---
# This block executes ONLY if SpecificExceptionType1 (or a subclass of it)
# was raised in the 'try' block.
print("Caught SpecificExceptionType1!")
# Perform recovery actions, log the error, etc.
except SpecificExceptionType2 as error_variable:
# --- Handler for SpecificExceptionType2 ---
# Catches SpecificExceptionType2 (or its subclasses).
# The 'as error_variable' assigns the actual exception object
# to 'error_variable' so you can inspect it.
print(f"Caught SpecificExceptionType2: {error_variable}")
# You can access details: print(type(error_variable), error_variable.args)
except (ExceptionType3, ExceptionType4) as ex:
# --- Handler for Multiple Exception Types ---
# Catches either ExceptionType3 or ExceptionType4 (or their subclasses).
print(f"Caught either ExceptionType3 or 4: {ex}")
except Exception as general_error:
# --- Handler for any other standard exception ---
# Catching the base 'Exception' class catches almost all built-in
# exceptions (that aren't system-exiting ones like SystemExit or KeyboardInterrupt).
# Use this sparingly, often at a high level for logging unexpected errors.
# WARNING: Avoid overly broad 'except:' without specifying Exception,
# as it can mask serious problems.
print(f"An unexpected error occurred: {general_error}")
# Code here executes after the 'try' block completes successfully,
# OR after one of the 'except' blocks finishes handling an exception.
print("Continuing execution after the try-except block.")
Execution Flow:
- Python executes the code inside the
try
block. - If no exception occurs: The entire
try
block finishes, allexcept
blocks are skipped, and execution continues after thetry...except
structure. - If an exception does occur in the
try
block:- The rest of the code within the
try
block is immediately skipped. - Python looks for an
except
block that matches the type of the raised exception. It checks theexcept
clauses sequentially from top to bottom. - An
except
block matches if the raised exception is of the same type or a subclass of the type specified in theexcept
clause. - The first matching
except
block is executed. - After the matching
except
block finishes, execution continues after the entiretry...except
structure (it does not go back into thetry
block or check furtherexcept
blocks). - If no matching
except
block is found, the exception is considered "unhandled" at this level and propagates up the call stack. If it remains unhandled all the way up, the program terminates with a traceback.
- The rest of the code within the
Example: Safely converting user input to a number.
input_str = input("Enter a number: ")
number = None # Initialize to None
try:
number = float(input_str) # Attempt conversion
print(f"Successfully converted to float: {number}")
except ValueError:
# Handle the case where input_str cannot be converted to a float
print(f"Error: '{input_str}' is not a valid number.")
except Exception as e:
# Catch any other unexpected error during conversion
print(f"An unexpected error occurred: {e}")
# Now you can check if 'number' was successfully assigned
if number is not None:
print(f"The number multiplied by 2 is: {number * 2}")
else:
print("Cannot perform calculation due to invalid input.")
The else Block in try...except
You can add an optional else block after all the except blocks. The code inside the else block is executed if and only if the try block completes without raising any exceptions.
Syntax:
try:
# Operation that might fail
result = risky_calculation()
except ZeroDivisionError:
print("Calculation failed: Division by zero.")
else:
# This runs ONLY if the 'try' block succeeded
print(f"Calculation successful. Result: {result}")
# It's often cleaner to put code that depends on the 'try' block's
# success here, rather than at the end of the 'try' block itself.
Using else can improve clarity by separating the core "try this" logic from the "if that succeeded, then do this" logic.
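A concrete, runnable variation of the same pattern, loading a counter from a hypothetical state file only when both the read and the conversion succeed:

counter_file = "visit_count.txt"   # hypothetical state file
try:
    with open(counter_file, "r", encoding="utf-8") as f:
        count = int(f.read().strip())
except FileNotFoundError:
    print("No counter file yet; starting from zero.")
    count = 0
except ValueError:
    print("Counter file is corrupt; resetting to zero.")
    count = 0
else:
    print(f"Loaded existing count: {count}")

print(f"New count would be: {count + 1}")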
The finally Block
An optional finally block can be placed at the very end of the try...except...else structure. The code inside the finally block is guaranteed to execute, regardless of what happened in the try, except, or else blocks. It executes:
- If the
try
block completes successfully. - If an exception occurs in
try
and is handled by anexcept
block. - If an exception occurs in
try
and is not handled by anyexcept
block (thefinally
block runs just before the exception propagates). - Even if a
return
,break
, orcontinue
statement is encountered within thetry
orexcept
blocks.
Purpose: Primarily used for cleanup actions that must happen, such as releasing resources (closing files, closing network connections, releasing locks) to prevent leaks.
Syntax:
lock = None # Placeholder for a resource lock
try:
lock = acquire_resource_lock() # Might fail
print("Resource locked. Performing critical operations...")
# Perform operations with the resource...
# data = read_data() # Might fail
# result = 10 / 0 # Might fail
print("Operations completed.")
except Exception as e:
print(f"An error occurred during operations: {e}")
# Handle or log the error
finally:
# This block ALWAYS executes
print("Executing 'finally' block...")
if lock is not None: # Check if lock was successfully acquired
release_resource_lock(lock) # Must release the lock
print("Resource lock released.")
else:
print("No lock to release.")
print("After the try-finally structure.")
Note: For file handling, the with statement is preferred over try...finally because it's more concise and achieves the same goal of guaranteed cleanup (file.close()).
Order of Execution: try -> [except blocks if exception matches] -> [else block if no exception occurred] -> finally block (always).
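A tiny self-contained demonstration of that order; note that the finally line prints even though both branches end with a return:

def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("  except: division by zero")
        return None
    else:
        print("  else: division succeeded")
        return result
    finally:
        print("  finally: always runs, even after 'return'")

print("divide(10, 2):", divide(10, 2))
print("divide(10, 0):", divide(10, 0))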
Raising Exceptions (raise)
You aren't limited to just handling exceptions raised by Python; you can also raise exceptions deliberately in your own code using the raise statement. This is useful for:
- Signaling Errors: Indicate that an error condition specific to your program's logic has occurred (e.g., invalid input that passes type checks but violates business rules).
- Translating Exceptions: Catch a specific low-level exception and raise a more informative, application-specific exception instead.
- Re-raising Exceptions: Catch an exception, perform some action (like logging), and then re-raise the same exception to let higher-level code handle it further.
Syntax:
# 1. Raise a specific built-in exception with a message
user_age = int(input("Enter age: "))
if user_age < 0:
raise ValueError("Age cannot be negative.")
# 2. Raise an instance of an exception
err = TypeError("Incompatible data types provided for processing.")
# raise err
# 3. Re-raise the current exception (only valid inside an 'except' block)
try:
risky_operation()
except OSError as os_err:
log_os_error(os_err) # Log the specific OS error
raise # Re-raises the caught 'os_err'
# 4. Define and raise custom exceptions (Good practice for libraries/large apps)
class NetworkConfigError(Exception): # Inherit from Exception or a more specific base
"""Custom exception for network configuration issues."""
pass # Often just need the custom type, can add methods later
def apply_network_settings(settings):
if not settings.get("ip_address"):
raise NetworkConfigError("Missing required 'ip_address' in settings.")
if not settings.get("subnet_mask"):
raise NetworkConfigError("Missing required 'subnet_mask'.")
print("Applying network settings...")
# ... apply settings ...
config1 = {"ip_address": "192.168.1.100"} # Missing subnet
config2 = {"ip_address": "192.168.1.101", "subnet_mask": "255.255.255.0"}
try:
apply_network_settings(config2) # Should succeed
apply_network_settings(config1) # Should raise NetworkConfigError
except NetworkConfigError as net_err:
print(f"Configuration Error: {net_err}")
except Exception as e:
print(f"An unexpected error: {e}")
Best Practices for Error Handling
- Be Specific in
except
: Catch the most specific exception types you anticipate rather than genericException
. Avoidexcept:
without any type. - Don't Silence Errors: Avoid empty
except:
blocks orexcept: pass
. If you catch an exception, handle it meaningfully (log it, provide a default, clean up, inform the user, or re-raise it). Masking errors makes debugging incredibly hard. - Handle Expected Errors: Use
try...except
for situations you know might reasonably fail (file not found, invalid user input, network timeout). - Let Unexpected Errors Propagate (Often): Don't wrap every line in
try...except
. Let programming errors (NameError
,TypeError
due to bugs, etc.) propagate during development so they crash and provide a traceback, making them easier to find and fix. High-level handlers can catch these for logging in production if needed. - Clean Up Resources: Use
finally
or (preferably)with
statements to ensure resources like files, network sockets, or locks are always released. - Use Custom Exceptions: Define your own exception classes for application-specific errors to make high-level handling cleaner and more meaningful.
- Provide Context: When raising exceptions, include informative error messages. When handling, log relevant context information.
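To illustrate the last two points, here is a hedged sketch that translates a low-level OSError into an application-specific exception while preserving the original via exception chaining (raise ... from ...); the class name and config path are purely illustrative:

class ConfigLoadError(Exception):
    """Raised when an application configuration file cannot be loaded."""

def load_config(path):
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except OSError as os_err:
        # Translate, but keep the original exception attached for debugging
        raise ConfigLoadError(f"Could not load configuration from '{path}'") from os_err

try:
    load_config("/etc/myapp/app.conf")   # hypothetical path
except ConfigLoadError as err:
    print(f"Startup aborted: {err}")
    print(f"Underlying cause: {err.__cause__}")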
Robust exception handling is crucial for writing reliable Python applications that can recover from or gracefully report errors.
Workshop Error Handling
This workshop focuses on enhancing the "Linux System Information Reporter" script (sys_reporter.py
) from the previous workshop by adding more specific and robust error handling using try...except
blocks for both file parsing and file writing operations.
Goal: Make the script more resilient to potential issues like missing /proc
files, permission errors when reading /proc
or writing reports, or unexpected file content formats.
Project: Robust Linux System Information Reporter
Steps:
-
Set Up:
- Navigate to the project directory containing your
sys_reporter.py
file. - Ensure your virtual environment is active.
- Open
sys_reporter.py
in your text editor (nano sys_reporter.py
).
- Navigate to the project directory containing your
-
Review Existing Error Handling: The previous version already had some basic
try...except
blocks. We will now make them more specific and add handling for more potential issues. -
Enhance
parse_cpu_info
:- Add explicit
PermissionError
handling. - Catch
IndexError
specifically duringsplit()
operations for potentially malformed lines. - Refine the general
Exception
catch message.
# Inside parse_cpu_info function:
def parse_cpu_info():
    """
    Parses /proc/cpuinfo to extract CPU model name and count of logical cores.

    Returns:
        dict: {'model_name': str, 'core_count': int} or default values on error.
    """
    cpu_data = {'model_name': 'N/A', 'core_count': 0}
    try:
        processors = set()  # Use set to count unique processor entries (logical cores)
        with open(PROC_CPUINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:  # Inner try for parsing individual lines
                    if line.startswith('model name'):
                        # Take the first model name found
                        if cpu_data['model_name'] == 'N/A':
                            cpu_data['model_name'] = line.split(':', 1)[1].strip()
                    elif line.startswith('processor'):
                        processors.add(line.split(':', 1)[1].strip())
                # Catch error if line doesn't have ':' or value part is missing
                except IndexError:
                    print(f"Warning: Malformed line in {PROC_CPUINFO_PATH}: '{line}'")
                    # Continue to next line if one line is bad
        cpu_data['core_count'] = len(processors) if processors else 1  # Assume 1 if parse fails
    except FileNotFoundError:
        print(f"Error: Critical file {PROC_CPUINFO_PATH} not found. Cannot get CPU info.")
        # Potentially return or raise a more specific error if needed downstream
    except PermissionError:
        print(f"Error: Permission denied reading {PROC_CPUINFO_PATH}. Check script permissions.")
    except IOError as e:  # Catch other OS-level I/O errors
        print(f"Error: An I/O error occurred reading {PROC_CPUINFO_PATH}: {e}")
    except Exception as e:
        # Catch any other unexpected non-I/O error during processing
        # This is more for unexpected logic errors in the parsing itself
        print(f"Warning: An unexpected error occurred processing {PROC_CPUINFO_PATH}: {e} ({type(e).__name__})")
    return cpu_data
- Add explicit
-
Enhance
parse_memory_info
:- Add explicit
PermissionError
. - Add specific
ValueError
andIndexError
handling when converting the memory value string toint
. - Improve error reporting and fallback logic messaging.
# Inside parse_memory_info function:
def parse_memory_info():
    """
    Parses /proc/meminfo for Total and Available memory (returns KiB).

    Returns:
        dict: {'mem_total_kib': int or 'N/A', 'mem_available_kib': int or 'N/A'}
    """
    mem_data = {'mem_total_kib': 'N/A', 'mem_available_kib': 'N/A'}
    keys_map = {'MemTotal': 'mem_total_kib', 'MemAvailable': 'mem_available_kib'}
    keys_to_find = set(keys_map.keys())
    keys_found = set()
    try:
        with open(PROC_MEMINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                parts = line.split(':', 1)
                if len(parts) == 2:
                    key = parts[0]
                    if key in keys_map:
                        try:
                            # Value is like '16384 kB', split and take number
                            value_str_parts = parts[1].strip().split()
                            value_kib = int(value_str_parts[0])  # Convert first part to int
                            mem_data[keys_map[key]] = value_kib
                            keys_found.add(key)
                        # More specific error handling for value parsing
                        except (ValueError, IndexError) as parse_err:
                            print(f"Warning: Could not parse value for {key} in {PROC_MEMINFO_PATH}. Line: '{line}', Error: {parse_err}")
                            # Keep default 'N/A' for this key
                # Optimization: Stop if we found all needed keys
                if keys_found == keys_to_find:
                    break
        # --- Fallback logic for MemAvailable ---
        if 'MemAvailable' not in keys_found:
            print("Warning: 'MemAvailable' not found. Checking 'MemFree' (less accurate for available).")
            try:  # Add try-except around the fallback read too
                with open(PROC_MEMINFO_PATH, 'r', encoding='utf-8') as f_fallback:
                    for line_fallback in f_fallback:
                        if line_fallback.strip().startswith('MemFree'):
                            try:
                                mem_data['mem_available_kib'] = int(line_fallback.split(':')[1].strip().split()[0])
                                print("  -> Using MemFree value as fallback for available memory.")
                                break
                            except (ValueError, IndexError) as fallback_parse_err:
                                print(f"Warning: Could not parse MemFree value. Error: {fallback_parse_err}")
                                break  # Stop trying if MemFree line is malformed
            except Exception as fallback_read_err:  # Catch errors during fallback file read
                print(f"Warning: Error trying to read {PROC_MEMINFO_PATH} for MemFree fallback: {fallback_read_err}")
    except FileNotFoundError:
        print(f"Error: Critical file {PROC_MEMINFO_PATH} not found. Cannot get Memory info.")
    except PermissionError:
        print(f"Error: Permission denied reading {PROC_MEMINFO_PATH}. Check script permissions.")
    except IOError as e:
        print(f"Error: An I/O error occurred reading {PROC_MEMINFO_PATH}: {e}")
    except Exception as e:
        print(f"Warning: An unexpected error occurred processing {PROC_MEMINFO_PATH}: {e} ({type(e).__name__})")
    return mem_data
- Enhance the Writing Functions:
    - Add explicit `PermissionError` handling for cases where the script might not have write access to the current directory.
    - Keep the `IOError` handling for other file system write issues.
    - Keep the `TypeError` handling for JSON serialization issues.
# Inside write_text_report function: def write_text_report(report_data): """Writes the collected system information to a formatted text file.""" try: with open(REPORT_TEXT_FILE, 'w', encoding='utf-8') as f: # ... (writing logic remains the same) ... f.write("--- Linux System Report ---\n") f.write(f"Generated: {report_data['timestamp']}\n") f.write("="*30 + "\n\n") f.write("[CPU Information]\n") f.write(f" Model Name: {report_data['cpu']['model_name']}\n") f.write(f" Core Count: {report_data['cpu']['core_count']}\n\n") f.write("[Memory Information]\n") total_kib = report_data['memory']['mem_total_kib'] avail_kib = report_data['memory']['mem_available_kib'] f.write(f" Total Memory: {total_kib} KiB\n") f.write(f" Available Memory: {avail_kib} KiB\n") try: if isinstance(total_kib, int): total_gib = round(total_kib / (1024**2), 2) f.write(f" Total Memory (GiB): ~{total_gib} GiB\n") if isinstance(avail_kib, int): avail_gib = round(avail_kib / (1024**2), 2) f.write(f" Available Memory (GiB): ~{avail_gib} GiB\n") except Exception: pass f.write("\n" + "="*30 + "\n") print(f"Text report successfully written to {REPORT_TEXT_FILE}") except PermissionError: print(f"Error: Permission denied to write text report to {REPORT_TEXT_FILE}. Check directory permissions.") except IOError as e: print(f"Error: An I/O error occurred writing text report file {REPORT_TEXT_FILE}: {e}") except Exception as e: print(f"Unexpected error generating text report: {e} ({type(e).__name__})") # Inside write_json_report function: def write_json_report(report_data): """Writes the collected system information to a JSON file.""" try: with open(REPORT_JSON_FILE, 'w', encoding='utf-8') as f: json.dump(report_data, f, indent=4) print(f"JSON report successfully written to {REPORT_JSON_FILE}") except PermissionError: print(f"Error: Permission denied to write JSON report to {REPORT_JSON_FILE}. Check directory permissions.") except IOError as e: print(f"Error: An I/O error occurred writing JSON report file {REPORT_JSON_FILE}: {e}") except TypeError as e: print(f"Error: Data cannot be serialized to JSON: {e}") except Exception as e: print(f"Unexpected error generating JSON report: {e} ({type(e).__name__})") # --- Main Execution --- # (Main block remains the same as in the previous workshop) if __name__ == "__main__": print("Gathering system information from /proc...") cpu_details = parse_cpu_info() memory_details = parse_memory_info() report = { "timestamp": datetime.now().isoformat(), "cpu": cpu_details, "memory": memory_details } print("Information gathered. Writing reports...") write_text_report(report) write_json_report(report) print("\nScript finished.")
- Save and Exit: Save the modified `sys_reporter.py`.
- Test the Enhanced Script:
    - Normal Run: Execute `python sys_reporter.py`. It should function correctly, potentially showing more specific warnings if `/proc` files have unusual lines, but likely succeeding overall.
    - Simulate Read Permission Error:
        - (Requires care) Temporarily change permissions on a proc file (choose one carefully, e.g., `/proc/uptime` is usually safe to make unreadable temporarily): `sudo chmod 000 /proc/uptime`
        - Modify the script temporarily to try and parse `/proc/uptime` instead of `/proc/cpuinfo` or `/proc/meminfo`.
        - Run `python sys_reporter.py`. Observe the "Permission denied" error specifically for reading that file.
        - Crucially, restore permissions: `sudo chmod 644 /proc/uptime` and revert the script changes.
    - Simulate Write Permission Error:
        - Create dummy output files owned by root in your project directory: `sudo touch dummy_report.txt dummy_report.json && sudo chown root:root dummy_report.* && sudo chmod 600 dummy_report.*`
        - Modify the `REPORT_TEXT_FILE` and `REPORT_JSON_FILE` constants in the script temporarily to point to these dummy files.
        - Run `python sys_reporter.py`. Observe the "Permission denied" errors specifically for writing the report files.
        - Clean up: `sudo rm dummy_report.txt dummy_report.json` and revert the constants in the script.
    - Simulate Malformed Data (If possible): If you can safely copy and edit a `/proc` file (like in the previous workshop), introduce errors (e.g., remove values after colons, put text where numbers are expected) and observe the more specific `IndexError` or `ValueError` warnings during parsing.
Code Explanation of Changes:
- Specific Exception Handling: Replaced or augmented generic `except Exception:` with more specific types like `PermissionError`, `IOError`, `IndexError`, `ValueError`, and `TypeError` where appropriate. This allows the program to potentially react differently based on the kind of error.
- Granular `try...except`: Added inner `try...except` blocks around specific parsing operations (like `split()` and `int()`) within the file reading loops. This makes the parsing more fault-tolerant: a single malformed line won't necessarily stop the entire file from being processed.
- Clearer Error Messages: Included the type of exception (`type(e).__name__`) in some generic catch blocks to aid debugging. Added more context to permission errors.
- Robust Fallback: Added basic `try...except` around the fallback logic for `MemAvailable` -> `MemFree` to handle potential errors during the fallback attempt itself.
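To see the granular pattern in isolation, here is a minimal, self-contained sketch (separate from the workshop script) of an inner `try...except` that skips a single malformed line while the loop keeps processing the rest:
# Minimal sketch: an inner try/except keeps one bad line from aborting the whole parse
lines = ["MemTotal: 16384 kB", "BadLine without value", "MemFree: abc kB", "MemAvailable: 8192 kB"]
parsed = {}
for line in lines:
    try:
        key, rest = line.split(':', 1)               # raises ValueError if ':' is missing
        parsed[key.strip()] = int(rest.split()[0])   # raises ValueError/IndexError on bad values
    except (ValueError, IndexError) as exc:
        print(f"Warning: skipping malformed line '{line}': {exc}")
print(parsed)  # {'MemTotal': 16384, 'MemAvailable': 8192}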
Conclusion: You have significantly improved the robustness of the system information reporter by implementing more specific and granular error handling. The script can now better anticipate and report issues related to file access, permissions, and unexpected data formats within the `/proc` files or during output writing. This demonstrates how `try...except` blocks are used to build more resilient and user-friendly applications that don't crash easily when encountering non-ideal conditions.
9. Object Oriented Programming OOP Concepts in Python
Object-Oriented Programming (OOP) is a fundamental programming paradigm used extensively in Python and many other modern languages. Instead of just writing sequences of instructions (procedural programming) or focusing solely on functions, OOP structures code around objects. Objects bundle together data (attributes) and behavior (methods) that operate on that data. This approach helps manage complexity, promotes code reuse, and models real-world entities more intuitively.
Classes (class) The Blueprint
A class acts as a blueprint or template for creating objects. It defines the common structure (attributes) and behaviors (methods) that all objects of that type will share.
- Definition: Uses the `class` keyword, followed by the class name (conventionally `CamelCase`), and a colon. The class body is indented.
- Attributes: Variables associated with the class or its instances.
    - Class Attributes: Defined directly within the class body, outside any methods. They are shared by all instances of the class. Changes to a class attribute affect all instances (unless an instance has shadowed it).
    - Instance Attributes: Specific to each individual object (instance) created from the class. They are typically defined within the `__init__` method using `self.attribute_name = value`.
- Methods: Functions defined inside a class. They define the behavior of objects.
    - Instance Methods: The most common type. They operate on the data (instance attributes) of a specific instance. Their first parameter is always `self` (by convention), which refers to the instance calling the method.
    - Class Methods: Operate on the class itself, rather than specific instances. They are defined using the `@classmethod` decorator and their first parameter is `cls` (by convention), referring to the class. Useful for factory methods or accessing class attributes.
    - Static Methods: Behave like regular functions but are defined within the class's namespace for organizational purposes. They don't operate on instance (`self`) or class (`cls`) state. Defined using the `@staticmethod` decorator.
- `__init__` Method (Constructor/Initializer): A special instance method called automatically whenever you create a new instance of the class. Its primary role is to initialize the instance attributes.
- `self` Parameter: Represents the instance of the class itself within instance methods. Python passes the instance automatically as the first argument when you call `my_object.method()`.
Syntax Example:
class LinuxServer:
"""Represents a Linux server with basic attributes and actions."""
# Class attribute (shared by all LinuxServer instances)
operating_system = "Linux"
default_ping_timeout = 1 # seconds
def __init__(self, hostname, ip_address, role="generic"):
"""Initializes a new LinuxServer instance."""
print(f"Initializing server: {hostname}")
# Instance attributes (specific to this object)
self.hostname = hostname # Required attribute
self.ip_address = ip_address # Required attribute
self.role = role # Optional attribute with default
self.is_online = False # Initial state
self._cpu_load = 0.0 # Convention: '_' suggests internal use
# Instance method (operates on 'self')
def check_status(self):
"""Simulates checking the server's online status."""
print(f"Checking status for {self.hostname} ({self.ip_address})...")
# In reality, this would involve network calls (e.g., ping)
# Simulate based on some logic or random chance for demo
import random
if random.random() > 0.2: # 80% chance of being online
self.is_online = True
print(f" -> {self.hostname} is Online.")
else:
self.is_online = False
print(f" -> {self.hostname} is Offline.")
return self.is_online
# Another instance method
def display_info(self):
"""Prints the server's information."""
status = "Online" if self.is_online else "Offline"
print(f"\n--- Server Info ---")
print(f" Hostname: {self.hostname}")
print(f" IP Address: {self.ip_address}")
print(f" Role: {self.role}")
print(f" OS: {self.operating_system}") # Accessing class attribute via self
print(f" Status: {status}")
print(f"--------------------")
# Instance method modifying state
def set_role(self, new_role):
"""Updates the server's role."""
print(f"Changing role of {self.hostname} from '{self.role}' to '{new_role}'.")
self.role = new_role
# Class method (operates on 'cls')
@classmethod
def get_default_os(cls):
"""Returns the default operating system defined for the class."""
return cls.operating_system
@classmethod
def create_web_server(cls, hostname, ip_address):
"""Factory method to create a server instance with role 'web'."""
print(f"Using factory to create web server: {hostname}")
# Calls the __init__ method of the class (cls)
return cls(hostname=hostname, ip_address=ip_address, role="web")
# Static method (no 'self' or 'cls')
@staticmethod
def validate_ip_address(ip_str):
"""Simple validation for an IP address format (basic example)."""
parts = ip_str.split('.')
if len(parts) == 4:
try:
# Check if all parts are integers between 0 and 255
return all(0 <= int(part) <= 255 for part in parts)
except ValueError:
return False
return False
# --- Special Methods (Dunder Methods) ---
def __str__(self):
"""User-friendly string representation (used by print())."""
status = "Online" if self.is_online else "Offline"
return f"LinuxServer(hostname='{self.hostname}', ip='{self.ip_address}', status='{status}')"
def __repr__(self):
"""Official, unambiguous string representation (for debugging)."""
# Aims to show how the object could be created
return f"LinuxServer(hostname='{self.hostname}', ip_address='{self.ip_address}', role='{self.role}')"
Objects (Instances)
An object (or instance) is a concrete realization created from a class blueprint. Each object holds its own values for the instance attributes defined in `__init__`, but shares the methods and class attributes defined by the class.
Creating Instances: Call the class name as if it were a function, passing the arguments required by `__init__`.
# Create instances (objects) of the LinuxServer class
server1 = LinuxServer("db01.example.com", "192.168.1.10", role="database")
server2 = LinuxServer(hostname="app01.internal", ip_address="10.0.5.20") # Uses default role 'generic'
web_server = LinuxServer.create_web_server("web01.prod", "172.16.0.5") # Using class method factory
# Accessing instance attributes
print(f"\nServer 1 Hostname: {server1.hostname}") # db01.example.com
print(f"Server 2 Role: {server2.role}") # generic
print(f"Web Server Role: {web_server.role}") # web
# Accessing class attributes (via instance or class)
print(f"Server 1 OS: {server1.operating_system}") # Linux
print(f"Server 2 OS: {server2.operating_system}") # Linux
print(f"Default OS via Class: {LinuxServer.operating_system}") # Linux
# Calling instance methods
print("\nChecking server statuses:")
server1.check_status() # Modifies server1.is_online
server2.check_status()
web_server.check_status()
# Displaying info uses the instance's current state
server1.display_info()
server2.display_info()
# Modifying state via method
server2.set_role("application")
server2.display_info() # Role is now 'application'
# Calling class method
print(f"\nDefault OS from class method: {LinuxServer.get_default_os()}")
# Calling static method
ip_to_check = "192.168.1.256"
is_valid = LinuxServer.validate_ip_address(ip_to_check)
print(f"Is IP '{ip_to_check}' valid? {is_valid}") # False
is_valid = LinuxServer.validate_ip_address("10.0.0.1")
print(f"Is IP '10.0.0.1' valid? {is_valid}") # True
# Using special methods implicitly
print("\nUsing special methods:")
print(server1) # Calls server1.__str__()
print([server1, server2]) # Calls __repr__() for items in the list representation
Key OOP Principles
- Encapsulation: Bundling data (attributes) and the methods that operate on that data within a single unit (the object/class). This hides the internal implementation details from the outside world. Access to the internal state is typically controlled via methods (getters/setters) or property decorators (more advanced), though Python relies heavily on convention (like the `_` prefix for internal attributes). This prevents accidental modification of internal state and makes the object easier to use correctly.
- Inheritance: A mechanism where a new class (called a subclass or derived class) can inherit attributes and methods from an existing class (called a superclass or base class). This establishes an "is-a" relationship (e.g., a `WebServer` is a `LinuxServer`) and promotes code reuse. The subclass can extend the superclass by adding new attributes/methods, or override inherited methods to provide specialized behavior. The `super()` function is used in the subclass's `__init__` to call the parent class's `__init__`.
class WebServer(LinuxServer):  # Inherits from LinuxServer
    """A specialized LinuxServer for web hosting."""
    def __init__(self, hostname, ip_address, web_software="nginx"):
        # Call the parent class's initializer FIRST
        super().__init__(hostname, ip_address, role="web")  # Pass relevant args up
        # Add attributes specific to WebServer
        self.web_software = web_software
        print(f" -> Configured with web software: {self.web_software}")
    # Override the display_info method
    def display_info(self):
        # Call the parent's method first to print basic info
        super().display_info()
        # Add webserver-specific info
        print(f"  Web Software: {self.web_software}")
        print(f"--------------------")  # Closing line for this method's output
    # Add a new method specific to WebServer
    def restart_web_service(self):
        """Simulates restarting the web server software."""
        print(f"Restarting {self.web_software} service on {self.hostname}...")
        # ... add actual subprocess call here in a real scenario ...
        self.check_status()  # Check status after restart
# --- Usage ---
web02 = WebServer("web02.prod", "172.16.0.6", web_software="apache")
web02.display_info()           # Calls the OVERRIDDEN method in WebServer
web02.restart_web_service()    # Calls method specific to WebServer
- Polymorphism: Literally means "many forms". It allows objects of different classes (often related by inheritance) to respond to the same method call in their own unique way. In the example above, both a `LinuxServer` object and a `WebServer` object can respond to `.display_info()`, but the output will differ because `WebServer` overrides the method. This enables writing generic code that operates on objects based on a shared interface (common method names) without needing to know the object's exact class.
def print_server_details(server_obj):
    # This function works with ANY object that has a display_info method
    print("\n--- Details via Polymorphic Function ---")
    server_obj.display_info()  # Calls the appropriate version based on object type
    print("--------------------------------------")
generic_server = LinuxServer("generic01", "10.1.1.1")
web_server_instance = WebServer("web_app", "10.2.2.2")
print_server_details(generic_server)       # Calls LinuxServer.display_info
print_server_details(web_server_instance)  # Calls WebServer.display_info
- Abstraction: Hiding the complex implementation details of an object and exposing only a simplified interface (public methods and attributes) for interaction. Users of an object don't need to know how it works internally, only what it can do through its interface. Inheritance and Encapsulation work together to achieve abstraction. Abstract Base Classes (ABCs) from the `abc` module provide a more formal way to define abstract interfaces in Python (more advanced topic).
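As a brief illustration of that last point (not part of the LinuxServer example), a minimal sketch using the standard `abc` module; the `Service` class and its method names are invented purely for this demonstration:
from abc import ABC, abstractmethod

class Service(ABC):
    """Abstract interface: subclasses must implement start() and stop()."""
    @abstractmethod
    def start(self):
        ...
    @abstractmethod
    def stop(self):
        ...

class NginxService(Service):
    def start(self):
        print("Starting nginx (simulated)...")
    def stop(self):
        print("Stopping nginx (simulated)...")

svc = NginxService()   # OK: all abstract methods are implemented
svc.start()
# Service() would raise TypeError: can't instantiate an abstract class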
Magic Methods (__*__)
Methods with double underscores at the beginning and end (e.g., `__init__`, `__str__`, `__repr__`, `__len__`, `__add__`, `__eq__`) are called special methods or "dunder" (double underscore) methods. They allow your custom objects to integrate seamlessly with Python's built-in syntax and functions.
- `__init__(self, ...)`: Object initialization (constructor).
- `__str__(self)`: Returns a user-friendly string representation (used by `print()`, `str()`).
- `__repr__(self)`: Returns an unambiguous, official string representation, often code to recreate the object (used by the REPL, `repr()`, debugging).
- `__len__(self)`: Returns the "length" of the object (used by `len()`).
- `__eq__(self, other)`: Implements equality comparison (`==`).
- `__add__(self, other)`: Implements addition (`+`).
- And many others for arithmetic, comparison, attribute access, iteration, etc.
You usually don't call these methods directly (e.g., `my_obj.__str__()`). Instead, you use the corresponding Python syntax or function (e.g., `print(my_obj)`), and Python calls the appropriate special method behind the scenes.
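For instance, here is a small, self-contained sketch (separate from the LinuxServer class) showing how implementing `__eq__` and `__len__` lets a custom object work with `==` and `len()`; the `ServerPool` class is invented for illustration only:
class ServerPool:
    """Hypothetical container used only to illustrate dunder methods."""
    def __init__(self, hostnames):
        self.hostnames = list(hostnames)
    def __len__(self):
        # Called by len(pool)
        return len(self.hostnames)
    def __eq__(self, other):
        # Called by pool_a == pool_b; compare contents, not object identity
        if not isinstance(other, ServerPool):
            return NotImplemented
        return self.hostnames == other.hostnames

pool_a = ServerPool(["web01", "db01"])
pool_b = ServerPool(["web01", "db01"])
print(len(pool_a))         # 2     (uses __len__)
print(pool_a == pool_b)    # True  (uses __eq__)
print(pool_a is pool_b)    # False (different objects)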
OOP is a powerful paradigm for building structured, maintainable, and reusable code, especially as applications grow in complexity.
Workshop Object Oriented Programming
This workshop applies OOP concepts to model configuration files often found on Linux systems (like simplified INI files). We'll create a `ConfigFile` class to represent a configuration file, allowing us to load data from a file, get/set configuration values, and save changes back.
Goal: Practice defining a class with `__init__`, instance attributes, instance methods for loading, saving, getting, and setting data, and using special methods like `__str__` and `__len__`.
Project: Simple INI-like Configuration File Manager
Simulated INI Format: We'll handle a very simple format like:
# This is a comment
key1 = value1
another_key = some other value
spaced key = value with spaces # Inline comments ignored
# Empty lines ignored
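For the workshop below, a `sample.conf` in this format might look like the following; the keys are chosen to match what the finished script reads (`hostname`, `ip_address`, `port`, `debug_mode`, `admin_users`):
# Sample configuration for the workshop
hostname = server1.example.com
ip_address = 192.168.1.50
port = 8080
debug_mode = false   # boolean-like value, read via get_bool()
admin_users = alice, bob, charlie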
Steps:
- Set Up:
    - Navigate to your project directory.
    - Ensure your virtual environment is active.
    - Create a new Python file named `config_manager.py` (`nano config_manager.py`).
    - Create a sample configuration file named `sample.conf` (`nano sample.conf`) with content in the format shown above (see the example just before these steps).
    - Save `sample.conf`.
- Define the `ConfigFile` Class in `config_manager.py`:
    - Add the following code to define the class structure.
# File: config_manager.py """ Object-Oriented approach to managing simple 'key = value' config files. Handles basic loading, getting, setting, and saving. Ignores comments (#) and empty lines. """ import os # For file path operations import shutil # For file copying (backup) class ConfigFile: """Manages reading and writing simple key-value configuration files.""" def __init__(self, filepath): """ Initializes the ConfigFile object. Doesn't load automatically. Args: filepath (str): The path to the configuration file. """ if not isinstance(filepath, str) or not filepath: raise ValueError("Filepath must be a non-empty string.") self.filepath = filepath # Use a dictionary to store the configuration key-value pairs self._config_data = {} # Convention: '_' suggests internal use self._loaded = False # Flag to track if data has been loaded print(f"ConfigFile object created for '{self.filepath}' (not loaded yet).") def load(self): """ Loads configuration data from the file specified by self.filepath. Overwrites any currently held configuration data. Returns: bool: True if loading was successful (or file not found but handled), False if a critical error occurred (e.g., PermissionError). """ print(f"Attempting to load configuration from '{self.filepath}'...") self._config_data = {} # Clear existing data before loading try: with open(self.filepath, 'r', encoding='utf-8') as f: line_num = 0 for line in f: line_num += 1 line = line.strip() # Ignore empty lines and comments if not line or line.startswith('#'): continue # Ignore lines without an '=' sign if '=' not in line: print(f" Warning: Skipping malformed line {line_num} (no '='): '{line}'") continue # Split at the first '=' sign key, value = line.split('=', 1) key = key.strip() value = value.strip() # Optional: Strip inline comments from value (if needed) if '#' in value: value = value.split('#', 1)[0].strip() if not key: # Check for empty key after stripping print(f" Warning: Skipping line {line_num} (empty key): '{line}'") continue # print(f" Loaded: '{key}' = '{value}'") # Verbose logging self._config_data[key] = value # Store key-value pair self._loaded = True print(f"Successfully loaded {len(self._config_data)} key-value pairs.") return True # Indicate success except FileNotFoundError: print(f"Info: File not found: '{self.filepath}'. Starting with empty configuration.") self._loaded = True # Consider it "loaded" but empty return True # Not a critical error for loading logic here except PermissionError: print(f"Error: Permission denied reading file: '{self.filepath}'. Cannot load.") self._loaded = False return False # Indicate critical failure except IOError as e: print(f"Error: An I/O error occurred reading file: {e}") self._loaded = False return False except Exception as e: print(f"An unexpected error occurred during loading: {e}") self._loaded = False return False def get(self, key, default=None): """ Retrieves the value for a given key. Args: key (str): The configuration key to retrieve. default: The value to return if the key is not found. Defaults to None. Returns: The value associated with the key (as str), or the default value. """ if not self._loaded: print("Warning: Configuration not loaded. 
Call load() first or set values.") # Still allow getting from potentially set (but not loaded) data return self._config_data.get(str(key).strip(), default) def get_bool(self, key, default=False): """Retrieves a value and attempts to interpret it as a boolean.""" value_str = self.get(key) if value_str is None: return default # Handle common boolean string representations return value_str.lower() in ['true', '1', 't', 'yes', 'y', 'on'] def get_int(self, key, default=0): """Retrieves a value and attempts to interpret it as an integer.""" value_str = self.get(key) if value_str is None: return default try: return int(value_str) except (ValueError, TypeError): print(f"Warning: Value for key '{key}' ('{value_str}') is not a valid integer. Returning default {default}.") return default def get_list(self, key, default=None, separator=','): """Retrieves a value and splits it into a list.""" value_str = self.get(key) if value_str is None or value_str == '': return default if default is not None else [] # Return list of stripped strings return [item.strip() for item in value_str.split(separator)] def set(self, key, value): """ Sets or updates a configuration key-value pair. Value is stored as string. Args: key (str): The configuration key. value: The value to set (will be converted to string). """ if not isinstance(key, str) or not key.strip(): print("Error: Key must be a non-empty string.") return # Convert value to string to ensure consistency when saving value_str = str(value).strip() key_str = key.strip() print(f"Setting '{key_str}' = '{value_str}'") self._config_data[key_str] = value_str # Mark as potentially modified if needed (not implemented here) if not self._loaded: print("Info: Setting value, but configuration was not loaded from file initially.") self._loaded = True # Assume we now have data def save(self, backup=False): """ Saves the current configuration data back to the file. Args: backup (bool): If True, create a backup (.bak) of the original file before saving. Returns: bool: True on successful save, False otherwise. """ if not self._config_data and not self._loaded: print("Warning: No configuration data loaded or set. Nothing to save.") return False print(f"Attempting to save configuration to '{self.filepath}'...") target_dir = os.path.dirname(self.filepath) or '.' if not os.path.isdir(target_dir): print(f"Error: Target directory '{target_dir}' does not exist. Cannot save.") return False # --- Optional Backup --- original_exists = os.path.exists(self.filepath) if backup and original_exists: backup_filepath = self.filepath + ".bak" try: shutil.copy2(self.filepath, backup_filepath) # copy2 preserves metadata print(f" Backup created: '{backup_filepath}'") except Exception as backup_err: print(f" Warning: Could not create backup file '{backup_filepath}': {backup_err}") cont = input(" Continue saving without backup? (y/N): ").strip().lower() if cont != 'y': print("Save operation cancelled.") return False # --- Write Data --- try: with open(self.filepath, 'w', encoding='utf-8') as f: # Write key-value pairs, one per line for key, value in self._config_data.items(): f.write(f"{key} = {value}\n") print(f"Successfully saved {len(self._config_data)} key-value pairs.") return True except PermissionError: print(f"Error: Permission denied writing file: '{self.filepath}'.") # Optional: Restore backup if write fails? More complex. 
return False except IOError as e: print(f"Error: An I/O error occurred writing file: {e}") return False except Exception as e: print(f"An unexpected error occurred during saving: {e}") return False # --- Special Methods --- def __len__(self): """Returns the number of key-value pairs loaded/set.""" return len(self._config_data) def __str__(self): """User-friendly summary string.""" status = "Loaded" if self._loaded else "Not Loaded" return f"ConfigFile(path='{self.filepath}', items={len(self)}, status='{status}')" def __repr__(self): """Official representation string.""" return f"ConfigFile(filepath='{self.filepath}')" def __contains__(self, key): """Allows using 'key in config_object'.""" # Checks loaded/set data regardless of _loaded status return str(key).strip() in self._config_data def __getitem__(self, key): """Allows using config_object[key] for getting values (raises KeyError if not found).""" key_str = str(key).strip() if key_str in self._config_data: return self._config_data[key_str] else: # Raise KeyError only if config was loaded or values were set if not self._loaded: print("Warning: Configuration not loaded. Call load() first or set values.") raise KeyError(f"Configuration key '{key_str}' not found.") def __setitem__(self, key, value): """Allows using config_object[key] = value for setting values.""" # Reuse the existing 'set' method for logic and validation self.set(key, value) def items(self): """Return a view of the items like a dictionary.""" return self._config_data.items() # --- End of Class Definition --- # --- Workshop Usage Example --- if __name__ == "__main__": print("--- Config File Manager ---") # Create an instance for our sample file config_file_path = "sample.conf" config = ConfigFile(config_file_path) print(config) # Uses __str__ # Load the data load_success = config.load() if load_success: print(f"\nNumber of config items loaded: {len(config)}") # Uses __len__ # Get values using various methods hostname = config.get("hostname", "default_host") port = config.get_int("port", 80) # Use get_int is_debug = config.get_bool("debug_mode") # Use get_bool admin_list = config.get_list("admin_users") # Use get_list non_existent = config.get("api_key") ip_addr = config["ip_address"] # Use __getitem__ print(f"\nHostname: {hostname}") print(f"Port: {port} (type: {type(port)})") print(f"Is Debug: {is_debug} (type: {type(is_debug)})") print(f"Admin Users: {admin_list} (type: {type(admin_list)})") print(f"API Key: {non_existent}") print(f"IP Address (via []): {ip_addr}") # Check for key existence (__contains__) if "database_url" not in config: print("'database_url' setting not found.") # Set/Update some values config["port"] = 9090 # Use __setitem__ -> set() -> stores "9090" config.set("debug_mode", "yes") # Use set() -> stores "yes" config["new_feature_flag"] = True # Use __setitem__ -> set() -> stores "True" config["allowed_hosts"] = "host1.com, host2.net, localhost" print("\nConfiguration after updates:") for k, v in config.items(): # Use items() method print(f" '{k}' = '{v}'") # Save the changes back to the file, creating a backup print("\nSaving changes...") save_success = config.save(backup=True) if save_success: print("\nVerify 'sample.conf' and 'sample.conf.bak' in your directory.") else: print("\nFailed to save configuration.") else: print("\nCould not load configuration to perform operations.") # Example: Handling a non-existent file on load print("\n--- Testing Non-Existent File ---") non_existent_config = ConfigFile("no_such_file.conf") non_existent_config.load() # 
Should print info message, return True print(f"Items in non-existent config: {len(non_existent_config)}") # 0 non_existent_config["test"] = "value" # Set a value print(f"Item count after set: {len(non_existent_config)}") # 1 non_existent_config.save() # Should create the file print("\n--- Config Manager Finished ---")
- Save and Exit: Save the `config_manager.py` file.
- Run the Script:
    - Execute the script from your terminal (e.g., `python config_manager.py`).
    - Observe:
        - The script creates the `ConfigFile` object.
        - It loads `sample.conf`.
        - It prints the number of items loaded.
        - It retrieves values using different methods (`get`, `get_bool`, `get_int`, `get_list`, `[]`). Note the types printed for `port` and `is_debug`.
        - It modifies the `port` and `debug_mode`, and adds `new_feature_flag` and `allowed_hosts`.
        - It saves the configuration, attempting a backup.
        - It then tests loading a non-existent file, setting a value, and saving (which should create the file).
- Examine Files:
    - `cat sample.conf`: Check the updated content. Note that all values are stored as strings in the file.
    - `ls -l sample.conf*`: Verify that `sample.conf.bak` was created.
    - `cat sample.conf.bak`: View the original content.
    - `cat no_such_file.conf`: Check that this file was created by the save operation in the non-existent file test.
Code Explanation Recap:
- `ConfigFile` Class: Encapsulated the file path and configuration data (`_config_data`).
- `__init__`: Stored the filepath, initialized `_config_data` as empty, set the `_loaded` flag.
- `load()`: Handled file opening, line iteration, comment/blank skipping, parsing key-value pairs, and error handling (`FileNotFoundError`, `PermissionError`, etc.). Updated `_config_data` and `_loaded`.
- `get()` variations (`get`, `get_bool`, `get_int`, `get_list`): Provided convenient ways to retrieve values, safely handle missing keys using defaults, and perform common type conversions with basic error checking.
- `set()`: Provided the primary mechanism to add/update key-value pairs, ensuring values were stored as strings.
- `save()`: Handled writing the current `_config_data` back to the file, including optional backup logic using `shutil`. Added a check for target directory existence.
- Special Methods (`__len__`, `__str__`, `__repr__`, `__contains__`, `__getitem__`, `__setitem__`): Made the `ConfigFile` object behave more intuitively, allowing operations like `len(config)`, `print(config)`, `key in config`, and `config[key]`.
- `items()`: Provided a dictionary-like way to iterate over key-value pairs.
- Main Block: Demonstrated the full lifecycle: creating an object, loading, getting/setting values through various methods/operators, saving, and handling potential load failures. Included a test for loading a non-existent file.
Conclusion: You've built a reusable `ConfigFile` class using OOP. It encapsulates the logic for handling simple configuration files, providing a cleaner interface than standalone functions would offer for managing this stateful data. You practiced implementing core methods (`load`, `save`, `get`, `set`) and several special methods to enhance usability.
10. Working with Linux Processes and System Calls
A common requirement in Linux scripting and automation is the ability to run external commands, manage system processes, and interact with the operating system at a lower level. Python provides robust modules, primarily `subprocess` and `os`, to achieve this. While direct system call invocation is rare and complex in pure Python, these modules provide high-level abstractions over the underlying Linux system calls.
The subprocess Module The Standard for Running Commands
The `subprocess` module is the modern, flexible, and secure way to spawn new processes (run external commands), connect to their input/output/error streams (pipes), and retrieve their exit codes. It replaces older, less secure, and less flexible functions like `os.system`.
Key Function: subprocess.run()
This is the recommended function for most common cases where you need to run a command and wait for it to complete.
- Syntax: `subprocess.run(args, *, capture_output=False, check=False, text=False, shell=False, timeout=None, input=None, ...)`
- Arguments (`args`): The command to execute.
    - Recommended: Pass as a list of strings, where the first element is the command and subsequent elements are its arguments (e.g., `['ls', '-l', '/home']`). This is used when `shell=False`.
    - If `shell=True`, pass the entire command line as a single string (e.g., `"ls -l /home | grep python"`).
- `shell=False` (Default & Recommended): Executes the command directly without involving an intermediate system shell (like `/bin/sh`). Arguments are passed safely. This avoids shell injection vulnerabilities.
- `shell=True` (Use with Extreme Caution): Executes the command through the system shell. Allows shell features like pipes (`|`), redirection (`>`), wildcards (`*`), and variable expansion (`$HOME`). Highly insecure if any part of the command string comes from untrusted input, as it allows command injection. Only use if absolutely necessary and if you fully control or sanitize the command string.
- `capture_output=True`: Captures standard output (stdout) and standard error (stderr) from the command. If `False` (default), output goes to the parent's stdout/stderr (usually the terminal). Equivalent to setting `stdout=subprocess.PIPE, stderr=subprocess.PIPE`.
- `text=True`: Decodes captured stdout and stderr from bytes into strings using a default or specified encoding (`encoding=...`). If `False` (default), stdout/stderr are returned as `bytes` objects.
- `check=False` (Default): Does not raise an error if the command finishes with a non-zero exit code (which typically indicates failure).
- `check=True`: If the command finishes with a non-zero exit code, raises a `subprocess.CalledProcessError` exception. Useful for stopping the script immediately on command failure.
- `input=None`: Allows passing data (as bytes, or string if `text=True`) to the command's standard input.
- `timeout=None`: Sets a maximum time in seconds to wait for the command to complete. Raises `subprocess.TimeoutExpired` if exceeded.
- Return Value: A `CompletedProcess` object containing attributes like:
    - `args`: The arguments used.
    - `returncode`: The command's integer exit code (0 usually means success).
    - `stdout`: Captured stdout (bytes or string, or `None`).
    - `stderr`: Captured stderr (bytes or string, or `None`).
Examples using subprocess.run():
import subprocess
import shlex # Useful for safely splitting command strings if needed
import os # For os.path.expanduser
# --- Example 1: Run 'ls -l', output to terminal, check for errors ---
print("--- Example 1: ls -l / (check=True) ---")
try:
# Command as list, check=True ensures error if ls fails
subprocess.run(['ls', '-l', '/'], check=True)
except FileNotFoundError:
print("Error: 'ls' command not found in PATH.")
except subprocess.CalledProcessError as e:
# This catches non-zero return codes because check=True
print(f"Error: 'ls' command failed with return code {e.returncode}.")
# stderr might be captured automatically in the exception object if capture_output was True
# print(f"Stderr: {e.stderr}")
except Exception as e:
print(f"An unexpected error occurred: {e}")
print("-" * 20)
# --- Example 2: Capture output of 'uname -a' as text ---
print("\n--- Example 2: uname -a (capture text output) ---")
try:
result = subprocess.run(['uname', '-a'], capture_output=True, text=True, check=True)
print(f"Command: {' '.join(result.args)}") # Display the command run
print(f"Return Code: {result.returncode}")
print("Stdout:")
print(result.stdout.strip()) # Access captured stdout string
if result.stderr: # Check if anything was printed to stderr
print("Stderr:")
print(result.stderr.strip())
except (FileNotFoundError, subprocess.CalledProcessError) as e:
print(f"Error running 'uname': {e}")
# If CalledProcessError, stderr might be in e.stderr
if hasattr(e, 'stderr') and e.stderr:
print(f"Stderr output from failed command:\n{e.stderr.strip()}")
print("-" * 20)
# --- Example 3: Run a failing command without check=True ---
print("\n--- Example 3: ls /nonexistent (check=False) ---")
try:
# check=False means CalledProcessError won't be raised for non-zero exit
result = subprocess.run(['ls', '/non/existent/path'], capture_output=True, text=True, check=False)
print(f"Command ran, Return Code: {result.returncode}") # Should be non-zero
if result.stdout:
print("Stdout (unexpected for this error):")
print(result.stdout.strip())
if result.stderr:
print("Stderr (expected for this error):")
print(result.stderr.strip()) # Should contain 'ls: cannot access...'
except FileNotFoundError:
print("Error: 'ls' command not found.")
except Exception as e:
print(f"An unexpected error: {e}")
print("-" * 20)
# --- Example 4: Using shell=True (Cautiously!) for a pipeline ---
print("\n--- Example 4: Pipeline with shell=True ---")
# Find python processes using ps and grep
command_string = "ps aux | grep '[p]ython'" # '[p]ython' avoids grep matching itself
# Note: This is potentially fragile and better done with libraries like psutil
try:
# Pass command as a single string when shell=True
result = subprocess.run(command_string, shell=True, capture_output=True, text=True, check=False)
print(f"Shell Command: {result.args}")
print(f"Return Code: {result.returncode}") # grep is 0 if found, 1 if not
if result.returncode == 0:
print("Matching Python processes found:")
print(result.stdout.strip())
elif result.returncode == 1:
print("No running Python processes found by grep.")
else:
print("Error during pipeline execution:")
print(result.stderr.strip())
except Exception as e:
print(f"Unexpected error running shell command: {e}")
print("-" * 20)
# --- Example 5: Passing input to 'wc -l' ---
print("\n--- Example 5: Passing input to wc -l ---")
lines_to_count = "First line.\nSecond line.\nThird line."
try:
# Pass the string via the 'input' argument
result = subprocess.run(['wc', '-l'], input=lines_to_count, capture_output=True, text=True, check=True)
line_count = result.stdout.strip()
print(f"Input text:\n{lines_to_count}")
print(f"Line count from wc: {line_count}") # Should be 3
except (FileNotFoundError, subprocess.CalledProcessError) as e:
print(f"Error running 'wc -l': {e}")
print("-" * 20)
Security Note (`shell=True`): Never construct a command string for `shell=True` using unvalidated input from users, files, or the network. An attacker could provide input like `"some_argument; rm -rf /"`, leading to command injection. Always prefer `shell=False` and passing arguments as a list. If shell features are essential, use `shlex.quote()` on any variable parts to make them safe before embedding them in the command string.
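As a small illustration of that advice, a sketch of quoting an untrusted value before embedding it in a shell pipeline (the hostile-looking filename is invented for this example):
import shlex
import subprocess

untrusted_name = "notes; rm -rf ~"         # hostile-looking input, used only as data here
safe_name = shlex.quote(untrusted_name)    # -> 'notes; rm -rf ~' (single-quoted, so the shell treats it literally)

# The quoted value is passed to the shell as one literal argument
command = f"ls -l -- {safe_name} | wc -l"
result = subprocess.run(command, shell=True, capture_output=True, text=True)
print(result.stdout.strip())  # '0': ls finds no such file; crucially, nothing extra was executed or deleted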
Alternative: subprocess.Popen()
For more advanced scenarios requiring non-blocking execution or complex interaction with the process's stdin/stdout/stderr while it's running (e.g., interactive sessions), `subprocess.Popen()` provides more control. It starts the process and returns a `Popen` object immediately. You then need to manage communication (`.communicate()`, `.stdin.write()`, `.stdout.read()`) and waiting (`.wait()`, `.poll()`) manually. This is more complex than `run()`.
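A minimal sketch of that pattern (using a trivial shell command chosen just for demonstration): start the process with `Popen`, poll it while it runs, then collect its output with `communicate()`:
import subprocess
import time

# Start a short command without blocking the Python script
proc = subprocess.Popen(
    ['sh', '-c', 'sleep 1; echo done'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# poll() returns None while the child is still running
while proc.poll() is None:
    print("still running, doing other work...")
    time.sleep(0.3)

stdout, _ = proc.communicate()   # collect remaining output and reap the process
print(f"exit code: {proc.returncode}, output: {stdout.strip()}")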
The os Module Process and System Interaction
The `os` module provides functions that interact more directly with the operating system, including information about the current Python process and some basic process/system management tools.
- Process Information:
    - `os.getpid()`: Get current Process ID.
    - `os.getppid()`: Get Parent Process ID.
    - `os.getuid()`, `os.getgid()`: Get real User/Group ID.
    - `os.geteuid()`, `os.getegid()`: Get effective User/Group ID.
    - `os.getlogin()`: Get login name of the user on the controlling terminal (might fail if no terminal).
- System Information:
    - `os.uname()`: Get system info (kernel name, hostname, release, version, machine architecture) as an object.
    - `os.environ`: Dictionary-like object representing environment variables. Use `os.environ.get('VAR_NAME', 'default')` for safe access.
- Working Directory:
    - `os.getcwd()`: Get Current Working Directory.
    - `os.chdir(path)`: Change Current Working Directory.
- Process Management (Use with Caution):
    - `os.kill(pid, signal)`: Send a specified `signal` (e.g., `signal.SIGTERM`, `signal.SIGKILL` from the `signal` module) to the process with ID `pid`. Requires appropriate permissions. Misuse can kill critical processes.
    - `os.system(command)`: Executes a command in a subshell. Discouraged. Less secure and flexible than `subprocess`. Returns the shell's exit status; difficult to capture command output reliably.
import os
import signal # Needed for signal constants like SIGTERM
print("\n--- Using the 'os' Module ---")
print(f"Current Process ID (PID): {os.getpid()}")
print(f"Parent Process ID (PPID): {os.getppid()}")
print(f"Current User ID (UID): {os.getuid()}")
print(f"Current Group ID (GID): {os.getgid()}")
try:
print(f"Login Name: {os.getlogin()}")
except OSError as e:
print(f"Login Name: Not available ({e})")
print(f"Current Working Directory: {os.getcwd()}")
uname_res = os.uname()
print(f"System: {uname_res.sysname}, Release: {uname_res.release}, Machine: {uname_res.machine}")
# Access environment variable
path_var = os.environ.get('PATH')
print(f"PATH environment variable starts with: {path_var[:50]}...") # Print first 50 chars
# Example of using os.kill (Conceptual - DO NOT RUN lightly)
# target_pid = 12345 # Replace with a specific, non-critical PID you own for testing
# try:
# print(f"Attempting to send SIGTERM (15) to PID {target_pid}")
# # os.kill(target_pid, signal.SIGTERM)
# print("Signal sent (kernel attempted delivery).")
# except ProcessLookupError:
# print(f"Process {target_pid} not found.")
# except PermissionError:
# print(f"Permission denied to signal process {target_pid}.")
# except Exception as e:
# print(f"Error sending signal: {e}")
print("-" * 20)
System Calls in Python
Python itself doesn't provide a direct, simple way to invoke arbitrary Linux system calls by their number (like the `syscall()` function in C). Instead, the necessary system calls are wrapped and abstracted by functions within standard library modules like `os`, `subprocess`, `socket`, `io`, `select`, `threading`, etc.
For example:
- `os.fork()` uses the `fork()` system call.
- `os.execvp()` uses one of the `exec()` family system calls.
- `subprocess.run()` often uses `fork()`, `execve()`, `pipe()`, `waitpid()`, etc., under the hood.
- `open()` uses the `open()` system call.
- `file.read()` uses the `read()` system call.
- `os.kill()` uses the `kill()` system call.
For 99% of tasks, using these high-level Python functions is the correct, portable, and safe approach. Attempting direct system calls usually involves using the `ctypes` module to interact with the C library (`libc`), which is significantly more complex and platform-dependent.
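Purely to illustrate what that would involve (it is rarely worth doing), a minimal Linux-specific sketch that calls the C library's `getpid()` via `ctypes` and compares it with the normal `os.getpid()` wrapper:
import ctypes
import ctypes.util
import os

# Load the C library; the name is resolved dynamically (Linux-specific sketch)
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

pid_via_libc = libc.getpid()   # thin wrapper around the getpid() system call
pid_via_os = os.getpid()       # the normal, portable way

print(pid_via_libc == pid_via_os)  # True: both report the same process ID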
Understanding the relationship between Python modules and underlying Linux system calls provides valuable context, but direct interaction is typically unnecessary.
Workshop Linux Processes and System Calls
This workshop focuses on using the subprocess
module to execute common Linux commands for system monitoring (uptime
, vmstat
), capture their output, and display selected information parsed from the output.
Goal: Practice executing external Linux commands, capturing their output as text, handling potential command errors, and performing basic string parsing on the output.
Project: Simple System Monitor Output Display
Commands to Use:
uptime
: Shows system uptime, load average.vmstat 1 2
: Shows virtual memory statistics (report once after 1 second, then a second time - we'll capture the second report).
Steps:
-
Set Up:
- Navigate to your project directory.
- Ensure your virtual environment is active.
- Create a new Python file named
sys_monitor.py
(nano sys_monitor.py
).
-
Write the Code: Enter the following Python code into
sys_monitor.py
. This script defines helper functions to run commands and parse their specific outputs.# File: sys_monitor.py """ Runs system monitoring commands ('uptime', 'vmstat') using subprocess, parses, and displays selected information. """ import subprocess import shlex # Not strictly needed here as we use lists, but good practice import time import re # Regular expression module for parsing uptime import os # For expanding home directory path if needed (not in this version) def run_command_capture_text(command_list, check=True, timeout=5): """ Helper function to run a command, capture text output, and handle errors. Args: command_list (list): Command and arguments. check (bool): If True, raise CalledProcessError on failure. timeout (int): Timeout in seconds. Returns: tuple: (stdout_str, stderr_str) or (None, error_message_str) on error. Returns (None, None) if command runs successfully but produces no output. """ command_str_display = ' '.join(command_list) print(f"-- Running: `{command_str_display}`") try: result = subprocess.run( command_list, capture_output=True, text=True, # Decode output as text check=check, # Raise error on non-zero exit code if True timeout=timeout ) # Return stripped output, handle potential None if not captured (though capture_output=True) stdout = result.stdout.strip() if result.stdout else None stderr = result.stderr.strip() if result.stderr else None # Check for successful run with no output specifically if result.returncode == 0 and not stdout and not stderr: print(" -> Command ran successfully but produced no output.") return None, None # Indicate no output elif result.returncode != 0 and not check: # Report stderr if check=False and failed print(f" -> Command finished with rc={result.returncode}. Stderr: {stderr or 'None'}") # Even if check=False, we might want to signal 'no useful stdout' return None, stderr # Return None for stdout, stderr for context # Normal successful execution (or check=False with output despite error) return stdout, stderr except FileNotFoundError: err_msg = f"Error: Command '{command_list[0]}' not found." print(err_msg) return None, err_msg except subprocess.TimeoutExpired: err_msg = f"Error: Command '{command_str_display}' timed out after {timeout}s." print(err_msg) return None, err_msg except subprocess.CalledProcessError as e: # Error message includes command, return code, and potentially stderr # This block only runs if check=True and return code is non-zero err_msg = (f"Error: Command '{command_str_display}' failed " f"(rc={e.returncode}):\nStderr: {e.stderr.strip() if e.stderr else 'N/A'}") print(err_msg) # Return None for stdout, and the detailed error message for stderr context return None, err_msg # Return error message instead of raising here except Exception as e: err_msg = f"An unexpected error occurred running '{command_str_display}': {e}" print(err_msg) return None, err_msg def parse_uptime_output(uptime_str): """ Parses the output of the 'uptime' command (basic parsing). Args: uptime_str (str or None): The captured stdout from uptime. Returns: dict or None: Dictionary with 'uptime_duration' and 'load_avg' (list), or None if parsing fails. 
""" if not uptime_str: return None parsed_data = {'uptime_duration': 'N/A', 'load_avg': []} try: # Example uptime output variants: # 10:30:50 up 1 day, 1:55, 2 users, load average: 0.10, 0.15, 0.20 # 10:35:00 up 50 min, 1 user, load average: 0.05, 0.12, 0.18 # 10:40:00 up 10 days, 15:01, 5 users, load average: 1.10, 1.05, 1.00 # Extract uptime duration part - more robust regex # Looks for 'up' followed by anything until the first comma before 'user(s)' match_up = re.search(r'up\s+(.*?),\s+\d+\s+user', uptime_str) if match_up: parsed_data['uptime_duration'] = match_up.group(1).strip() # Extract load average part using regex match_load = re.search(r'load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)', uptime_str) if match_load: # Convert load averages to floats parsed_data['load_avg'] = [float(match_load.group(i)) for i in range(1, 4)] else: # Handle cases where 'load average:' might be missing (very old systems?) match_load_alt = re.search(r'averages?:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)', uptime_str) if match_load_alt: parsed_data['load_avg'] = [float(match_load_alt.group(i)) for i in range(1, 4)] if parsed_data['uptime_duration'] == 'N/A' and not parsed_data['load_avg']: print(f" Warning: Could not parse uptime duration or load average from output: '{uptime_str}'") return None # Return None if nothing useful was found return parsed_data except Exception as e: print(f" Error parsing uptime string: {e}") return None def parse_vmstat_output(vmstat_str): """ Parses the output of 'vmstat 1 2' to get memory and cpu stats from the second line of data. Args: vmstat_str (str or None): The captured stdout from vmstat. Returns: dict or None: Dictionary with 'memory_free_kb', 'cpu_user_percent', 'cpu_system_percent', 'cpu_idle_percent', or None if parsing fails. """ if not vmstat_str: return None parsed_data = { 'memory_free_kb': None, 'cpu_user_percent': None, 'cpu_system_percent': None, 'cpu_idle_percent': None } try: # Example vmstat 1 2 output (we care about the LAST data line): # procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- # r b swpd free buff cache si so bi bo in cs us sy id wa st # 0 0 0 4738388 850692 6904404 0 0 0 0 28 41 1 0 99 0 0 # 0 0 0 4738200 850692 6904460 0 0 0 40 141 211 0 0 100 0 0 lines = vmstat_str.strip().split('\n') if len(lines) < 3: # Need at least header line and two data lines print(f" Warning: vmstat output has fewer lines than expected ({len(lines)}).") return None # Find the header line - it's likely the one just before the last line # More robustly, find the line containing 'us sy id' header_line_index = -1 for i, line in enumerate(lines): if 'us' in line and 'sy' in line and 'id' in line and 'free' in line: header_line_index = i break if header_line_index == -1 or header_line_index >= len(lines) - 1: print(" Warning: Could not reliably find vmstat header line.") return None header_line = lines[header_line_index] headers = header_line.split() # Simple split often works here # Get the last data line and split it last_data_line = lines[-1] values = last_data_line.split() # Dynamic column finding based on header (more robust) try: # Find indices by searching the headers list idx_free = headers.index('free') idx_us = headers.index('us') idx_sy = headers.index('sy') idx_id = headers.index('id') except ValueError as ve: print(f" Warning: Missing expected columns in vmstat header: {ve}. 
Header: {headers}") return None # Check if data line has enough values if len(values) <= max(idx_free, idx_us, idx_sy, idx_id): print(f" Warning: Not enough columns in vmstat data line: {values}") return None # Extract values using found indices and convert to int parsed_data['memory_free_kb'] = int(values[idx_free]) parsed_data['cpu_user_percent'] = int(values[idx_us]) parsed_data['cpu_system_percent'] = int(values[idx_sy]) parsed_data['cpu_idle_percent'] = int(values[idx_id]) return parsed_data except (ValueError, IndexError) as e: # Catch conversion or index errors print(f" Error parsing vmstat data line: {e}") return None except Exception as e: print(f" An unexpected error occurred parsing vmstat: {e}") return None # --- Main Execution --- if __name__ == "__main__": print("===== Simple System Monitor =====") # --- Uptime Check --- print("\n[1] Checking Uptime and Load Average...") uptime_stdout, uptime_stderr = run_command_capture_text(['uptime'], check=True) # Expect success if uptime_stdout: uptime_data = parse_uptime_output(uptime_stdout) if uptime_data: print(f" System Uptime: {uptime_data.get('uptime_duration', 'N/A')}") print(f" Load Average (1m, 5m, 15m): {uptime_data.get('load_avg', 'N/A')}") else: print(" Failed to parse uptime output.") else: print(f" Failed to get uptime output. Error: {uptime_stderr or 'Unknown'}") # --- VMStat Check --- print("\n[2] Checking Memory and CPU Stats (vmstat)...") # vmstat 1 2: Report stats every 1 sec, 2 times. We parse the second report. vmstat_stdout, vmstat_stderr = run_command_capture_text(['vmstat', '1', '2'], check=True) if vmstat_stdout: vmstat_data = parse_vmstat_output(vmstat_stdout) if vmstat_data: free_kb = vmstat_data.get('memory_free_kb', 'N/A') cpu_us = vmstat_data.get('cpu_user_percent', 'N/A') cpu_sy = vmstat_data.get('cpu_system_percent', 'N/A') cpu_id = vmstat_data.get('cpu_idle_percent', 'N/A') print(f" Memory Free: {free_kb} KiB") print(f" CPU Usage (%): User={cpu_us}, System={cpu_sy}, Idle={cpu_id}") else: print(" Failed to parse vmstat output.") else: print(f" Failed to get vmstat output. Error: {vmstat_stderr or 'Unknown'}") # --- Example Failing Command (Handled by Helper) --- print("\n[3] Checking a non-existent file (expect failure)...") # check=False allows us to see the error message returned by the helper _, ls_stderr = run_command_capture_text(['ls', '/non/existent/file/path'], check=False) # Error message is already printed inside run_command_capture_text print("\n===== Monitoring Checks Complete =====")
-
Save and Exit: Save the
sys_monitor.py
file. -
Run the Script:
- Execute the script from your terminal (ensure virtual env is active):
- Observe:
- The script will print messages indicating which command (
uptime
,vmstat
,ls
) it is running. - It will then print the parsed information: system uptime duration, load average, free memory, and CPU percentages.
- For the failing
ls
command, the helper functionrun_command_capture_text
will print the error message (stderr fromls
) becausecheck=False
was used. - If commands like
uptime
orvmstat
are not installed (unlikely on most Linux systems), you will see the "Command not found" error from the helper function.
- The script will print messages indicating which command (
-
Examine the Output: Compare the parsed output printed by the script to the raw output you would get by running
uptime
andvmstat 1 2
directly in your terminal. Verify that the parsing logic correctly extracts the intended information.
Code Explanation Recap:
- `run_command_capture_text(command_list, ...)`: A reusable helper function to execute commands using `subprocess.run`. It standardizes capturing text output and handling common errors (`FileNotFoundError`, `TimeoutExpired`, `CalledProcessError`). It returns a tuple `(stdout, stderr)`, where either `stdout` or both can be `None` if the command failed or produced no output. The error message itself is returned as the second element of the tuple upon error.
- `parse_uptime_output(uptime_str)`: Takes the raw string output from `uptime`. Uses the `re` (regular expression) module for more flexible parsing to find the uptime duration and the load average values. Returns a dictionary containing the parsed data, or `None`.
- `parse_vmstat_output(vmstat_str)`: Takes the raw string output from `vmstat 1 2`. It splits the output into lines, attempts to locate the header row to dynamically find column indices (making it more robust against variations in `vmstat` output formatting), and extracts the numeric values for free memory and CPU percentages from the last data line. Returns a dictionary or `None`. Includes basic error handling for parsing issues.
- Main Block (`if __name__ == "__main__":`):
    - Calls `run_command_capture_text` for `uptime` and `vmstat`, expecting success (`check=True`).
    - Calls the respective parsing function if the command execution yielded standard output.
    - Prints the extracted data in a user-friendly format.
    - Calls `run_command_capture_text` for the failing `ls` command with `check=False` to demonstrate how the helper function reports the error without crashing the script.
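To make the regular-expression parsing described above concrete, here is a small, self-contained sketch (separate from the workshop script) that pulls the three load-average values out of a typical `uptime` line; the sample string is hypothetical.

import re

# Hypothetical uptime output used only for illustration
sample_uptime = " 14:32:01 up 3 days,  4:17,  2 users,  load average: 0.15, 0.22, 0.31"

# Look for "load average:" followed by three comma-separated numbers
match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", sample_uptime)
if match:
    load_1m, load_5m, load_15m = (float(value) for value in match.groups())
    print(f"Load averages -> 1m: {load_1m}, 5m: {load_5m}, 15m: {load_15m}")
else:
    print("Could not find load averages in the input string.")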
Conclusion: You have created a script that effectively utilizes the `subprocess` module to run external Linux commands (`uptime`, `vmstat`), capture their output, and perform basic parsing using string methods and regular expressions to extract meaningful system information. You also practiced creating a reusable helper function for running commands and handling common execution errors. This is a practical example of how Python can be used for system monitoring and automation on Linux.
11. Networking Basics with Python Sockets and Requests
Python provides excellent support for network programming, ranging from low-level socket programming for custom protocols or high-performance applications, to high-level libraries for common protocols like HTTP.
Basic Networking Concepts (Brief Review)
- IP Address: A unique numerical label assigned to each device connected to a computer network that uses the Internet Protocol for communication (e.g., `192.168.1.10`, `203.0.113.5`). IPv4 (32-bit) and IPv6 (128-bit) are the two main versions. `127.0.0.1` (IPv4) or `::1` (IPv6) is the loopback address, always referring to the local machine (localhost).
- Port: A number (0-65535) used to identify a specific process or service endpoint on a host. Allows multiple network applications to run on the same IP address. Well-known ports (0-1023) are reserved for standard services (e.g., 80 for HTTP, 443 for HTTPS, 22 for SSH, 25 for SMTP). Registered ports (1024-49151) and dynamic/private ports (49152-65535) are also used. Binding to ports below 1024 typically requires root privileges.
- TCP (Transmission Control Protocol): A connection-oriented protocol providing reliable, ordered, and error-checked delivery of a stream of bytes. Used for HTTP, FTP, SSH, SMTP, etc. Establishes a connection (three-way handshake) before data transfer and ensures data arrives correctly and in order.
- UDP (User Datagram Protocol): A connectionless protocol providing a simple datagram (packet) service. Faster but less reliable than TCP (no guaranteed delivery, order, or extensive error checking). Used for DNS, DHCP, VoIP, online gaming, where speed is prioritized over perfect reliability for every packet.
- Sockets: An endpoint for network communication, representing one end of a connection. A socket combines an IP address and a port number. The `socket` module in Python provides the interface to the Berkeley sockets API, the standard for network communication on Unix-like systems and Windows.
- Client-Server Model: A common architecture where a server process listens for incoming connections on a specific IP address and port. Client processes initiate connections to the server to request services or exchange data.
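The examples in this section exercise TCP; as a brief, hedged sketch of the connectionless UDP side of the same `socket` API, the snippet below sends and receives a single datagram on the loopback interface (the port number is an arbitrary unprivileged choice used only for illustration).

import socket

HOST, PORT = "127.0.0.1", 50007  # arbitrary unprivileged port, for illustration only

def udp_receive_one():
    """Bind a UDP socket and wait for a single datagram (run this side first)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((HOST, PORT))
        data, sender_address = sock.recvfrom(1024)  # blocks until a datagram arrives
        print(f"Received {data!r} from {sender_address}")

def udp_send(message):
    """Send one datagram; no connection is established beforehand."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (HOST, PORT))

# Run udp_receive_one() in one terminal/process and udp_send("hello") in another.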
Low-Level Networking: The socket Module
The `socket` module provides the fundamental building blocks for network communication. It allows you to create sockets, bind them to addresses, listen for connections, accept connections, connect to remote servers, and send/receive raw data over TCP or UDP.
Creating a Simple TCP Echo Server: An echo server simply sends back any data it receives from a client.
# File: tcp_echo_server.py
import socket
import threading # To handle multiple clients concurrently
HOST = '127.0.0.1' # Standard loopback interface address (localhost)
PORT = 65432 # Port to listen on (use ports > 1023)
BUFFER_SIZE = 1024 # Max amount of data to receive at once
def handle_client_connection(client_socket, client_address):
"""Thread function to handle communication with a single client."""
print(f"[SERVER THREAD {client_address}] Connection established.")
try:
while True:
# Receive data from the client
# recv() is a blocking call by default
data_bytes = client_socket.recv(BUFFER_SIZE)
if not data_bytes:
# If recv returns an empty bytes object, client closed the connection
print(f"[SERVER THREAD {client_address}] Client disconnected.")
break
# Decode received bytes into a string (assuming UTF-8)
message = data_bytes.decode('utf-8')
print(f"[SERVER THREAD {client_address}] Received: '{message}'")
# Prepare the response (echo)
response_message = f"Echo: {message}"
response_bytes = response_message.encode('utf-8')
# Send the response back to the client
client_socket.sendall(response_bytes) # sendall handles sending all data
print(f"[SERVER THREAD {client_address}] Sent: '{response_message}'")
except ConnectionResetError:
print(f"[SERVER THREAD {client_address}] Client connection reset forcibly.")
except socket.error as sock_err:
print(f"[SERVER THREAD {client_address}] Socket error: {sock_err}")
except Exception as e:
print(f"[SERVER THREAD {client_address}] Unexpected error: {e}")
finally:
# Clean up the connection for this client
print(f"[SERVER THREAD {client_address}] Closing connection.")
client_socket.close()
def start_echo_server(host, port):
"""Starts the TCP Echo server."""
# 1. Create the server socket
# socket.AF_INET: Use IPv4 addresses
# socket.SOCK_STREAM: Use TCP protocol
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("[SERVER] Socket created.")
try:
# 2. Allow reusing the address (optional, but helpful for development)
# Prevents "Address already in use" error on quick restarts
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# 3. Bind the socket to the host and port
server_socket.bind((host, port))
print(f"[SERVER] Socket bound to {host}:{port}")
# 4. Enable the server to accept connections
# The argument (e.g., 5) is the backlog - max number of queued connections
server_socket.listen(5)
print("[SERVER] Listening for incoming connections...")
# 5. Main loop to accept client connections
while True:
try:
# Accept waits for an incoming connection (blocking)
# Returns a *new* socket object for the connection and the client's address
client_conn_socket, client_addr = server_socket.accept()
# Create and start a new thread to handle this client's communication
# This allows the server to handle multiple clients without blocking
client_handler_thread = threading.Thread(
target=handle_client_connection,
args=(client_conn_socket, client_addr) # Pass socket and address to thread
)
client_handler_thread.daemon = True # Allow program to exit even if threads are running
client_handler_thread.start()
print(f"[SERVER MAIN] Accepted connection from {client_addr}. Started handler thread.")
print(f"[SERVER MAIN] Current active threads (approx): {threading.active_count() - 1}")
except Exception as accept_err:
print(f"[SERVER MAIN] Error accepting connection: {accept_err}")
# Decide if the server should continue or stop on accept errors
except KeyboardInterrupt: # Handle Ctrl+C gracefully
print("\n[SERVER] Shutdown signal received.")
except socket.error as bind_err:
print(f"[SERVER] Failed to bind or listen on {host}:{port}. Error: {bind_err}")
except Exception as e:
print(f"[SERVER] An unexpected server error occurred: {e}")
finally:
# Ensure the main server socket is closed on exit
print("[SERVER] Closing server socket.")
server_socket.close()
if __name__ == "__main__":
start_echo_server(HOST, PORT)
Creating a TCP Echo Client:
# File: tcp_echo_client.py
import socket
import sys
# Should match the server's details
SERVER_HOST = '127.0.0.1'
SERVER_PORT = 65432
BUFFER_SIZE = 1024
def run_echo_client():
"""Runs the TCP Echo client."""
# Create a TCP/IP socket (must match server type: AF_INET, SOCK_STREAM)
# Using 'with' ensures the socket is closed automatically
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
try:
# 1. Connect the socket to the server's address and port
print(f"[CLIENT] Attempting connection to {SERVER_HOST}:{SERVER_PORT}...")
client_socket.connect((SERVER_HOST, SERVER_PORT))
print("[CLIENT] Connection successful!")
# 2. Loop for sending messages
while True:
# Get message from user input
try:
message_to_send = input("Enter message ('quit' to exit): ")
except EOFError: # Handle Ctrl+D
print("\n[CLIENT] EOF received, quitting.")
break
if message_to_send.lower() == 'quit':
break
if not message_to_send:
print("[CLIENT] Sending empty message.") # Send empty to test server handling
# 3. Send data (encode string to bytes)
print(f"[CLIENT] Sending: '{message_to_send}'")
client_socket.sendall(message_to_send.encode('utf-8'))
# 4. Receive response data from the server (blocking)
response_bytes = client_socket.recv(BUFFER_SIZE)
if not response_bytes:
# Server closed connection if recv returns empty bytes
print("[CLIENT] Server closed the connection.")
break
response_message = response_bytes.decode('utf-8')
print(f"[CLIENT] Received: '{response_message}'")
except ConnectionRefusedError:
print(f"[CLIENT] Error: Connection refused. Is the server running at {SERVER_HOST}:{SERVER_PORT}?")
except socket.timeout:
print("[CLIENT] Error: Connection attempt timed out.")
except socket.error as sock_err:
print(f"[CLIENT] Socket error occurred: {sock_err}")
except KeyboardInterrupt:
print("\n[CLIENT] Interrupted by user.")
except Exception as e:
print(f"[CLIENT] An unexpected error occurred: {e}")
finally:
# The 'with' statement handles closing, but a message is nice
print("[CLIENT] Connection closed.")
if __name__ == "__main__":
run_echo_client()
Running the Echo Client/Server:
- Open one terminal, navigate to the project directory, activate venv, and run the server:
python tcp_echo_server.py
- Open a second terminal, navigate to the same directory, activate venv, and run the client:
python tcp_echo_client.py
- Type messages in the client terminal and press Enter. Observe the echo response.
- Observe the logs in both terminals showing connection, send/receive actions, and threading on the server.
- Type `quit` in the client to disconnect.
- Run multiple clients to see the server handle them concurrently via threads.
- Press `Ctrl+C` in the server terminal to shut it down gracefully.
This demonstrates the fundamentals of socket programming: creating, binding, listening, accepting, connecting, sending, and receiving data.
High-Level HTTP Client: The requests Library
While sockets handle raw network communication, interacting with web services usually involves the HTTP protocol. Python's standard library has `http.client` and `urllib.request`, but the third-party `requests` library is overwhelmingly preferred for its simplicity, elegance, and robustness. It makes sending HTTP requests incredibly easy.
Installation (within your virtual environment): `pip install requests`
Common `requests` Usage:
# File: requests_example.py
import requests
import json # To handle JSON responses easily
# --- Simple GET Request ---
print("--- Making GET request to JSONPlaceholder (fetch post #1) ---")
post_id = 1
url_get = f'https://jsonplaceholder.typicode.com/posts/{post_id}'
try:
# Make the GET request, include a timeout
response = requests.get(url_get, timeout=10) # Timeout in seconds
# Check if the request was successful (status code 2xx)
# raise_for_status() will raise an HTTPError for 4xx/5xx responses
response.raise_for_status()
print(f"GET Request to {url_get} OK (Status Code: {response.status_code})")
# print(f"Response Headers:\n{response.headers}") # See all headers
# Process the response body
# .text gives the body as a decoded string (requests guesses encoding)
# print(f"\nResponse Body (text):\n{response.text[:200]}...") # Print first 200 chars
# If expecting JSON, use .json() to parse directly into Python dict/list
# This raises requests.exceptions.JSONDecodeError if parsing fails
data = response.json()
print("\nParsed JSON Response:")
print(json.dumps(data, indent=2)) # Pretty-print the dictionary
print(f"\nPost Title: {data.get('title', 'N/A')}") # Safely access title
except requests.exceptions.Timeout:
print(f"Error: Request to {url_get} timed out.")
except requests.exceptions.ConnectionError:
print(f"Error: Could not connect to {url_get}. Check network or hostname.")
except requests.exceptions.HTTPError as http_err:
# Handles 4xx (client errors) and 5xx (server errors)
print(f"HTTP Error occurred: {http_err}")
print(f"Status Code: {http_err.response.status_code}")
# print(f"Response Body: {http_err.response.text}") # Show error response body if any
except requests.exceptions.JSONDecodeError:
print(f"Error: Failed to decode JSON response from {url_get}")
print(f"Raw response text: {response.text}") # Show the raw text
except requests.exceptions.RequestException as req_err:
# A base class for other requests errors (URL issues, etc.)
print(f"A requests error occurred: {req_err}")
except Exception as e:
print(f"An unexpected error occurred: {e}")
# --- POST Request with JSON Payload ---
print("\n\n--- Making POST request (create a new post) ---")
url_post = 'https://jsonplaceholder.typicode.com/posts'
# Data to send - requests can automatically encode this dict to JSON
new_post_data = {
'title': 'Linux Automation with Python',
'body': 'Using requests library is efficient!',
'userId': 5 # Example user ID
}
# Optional: Custom headers
custom_headers = {
'User-Agent': 'MyPythonScript/2.0',
'Accept': 'application/json' # Indicate we prefer JSON response
}
try:
# Use requests.post()
# Use 'json=' parameter to send Python dict as JSON body
# Requests automatically sets 'Content-Type: application/json'
response_post = requests.post(url_post, headers=custom_headers, json=new_post_data, timeout=10)
response_post.raise_for_status() # Check for HTTP errors (e.g., 400 Bad Request)
print(f"POST Request to {url_post} OK (Status Code: {response_post.status_code})") # Expect 201 Created
# API often returns the created resource representation
created_post = response_post.json()
print("\nResponse Body (Created Post):")
print(json.dumps(created_post, indent=2))
print(f"\nAssigned ID for new post: {created_post.get('id')}")
except requests.exceptions.RequestException as req_err:
print(f"An error occurred during the POST request: {req_err}")
except Exception as e:
print(f"An unexpected error occurred during POST: {e}")
Key `requests` Advantages:
- Simple API for all common HTTP methods (GET, POST, PUT, DELETE, etc.).
- Automatic content decoding (usually correct).
- Easy handling of JSON request/response bodies (`json=` parameter, `.json()` method).
- Support for URL parameters (`params=...`), form data (`data=...`), file uploads (`files=...`).
- Automatic handling of cookies and sessions (`requests.Session`).
- Built-in SSL verification (secure by default).
- Clear exception hierarchy for network/HTTP errors.

`requests` significantly simplifies web interactions compared to manual socket programming for HTTP. Use `socket` for non-HTTP protocols, low-level control, or specific performance needs. Use `requests` for almost all interactions with web APIs and websites.
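As a short illustration of two of the conveniences listed above (query parameters via `params=` and connection/cookie reuse via `requests.Session`), here is a hedged sketch against the same public JSONPlaceholder service used earlier.

import requests

# A Session reuses the underlying connection and remembers headers/cookies across calls
with requests.Session() as session:
    session.headers.update({"Accept": "application/json"})
    # params= builds the query string for us: .../comments?postId=1
    response = session.get(
        "https://jsonplaceholder.typicode.com/comments",
        params={"postId": 1},
        timeout=10,
    )
    response.raise_for_status()
    comments = response.json()
    print(f"Fetched {len(comments)} comments for post 1")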
Workshop Networking Basics
This workshop combines the two approaches: First, run the simple TCP echo client/server created earlier to experience socket-level communication. Second, write a script using the `requests` library to fetch public IP address information from a web API.
Goal: Understand the difference between low-level socket communication and high-level HTTP requests using a library. Practice running a simple socket-based client/server and using `requests` to interact with a real-world web API.
Part 1: Running the TCP Echo Client/Server
Steps:
- Locate Files: Ensure you have the `tcp_echo_server.py` and `tcp_echo_client.py` files created in the theoretical section above.
- Run Server: Open a terminal, navigate to the directory containing the files, activate your virtual environment (if used), and run the server with `python tcp_echo_server.py`. Observe the output indicating it's bound and listening.
- Run Client: Open a second terminal, navigate to the same directory, activate the virtual environment, and run the client with `python tcp_echo_client.py`. Observe the "Connection successful!" message.
- Interact:
    - In the client terminal, type messages like "Hello Server", "Test 123", an empty line, etc., and press Enter after each.
    - Observe the messages being sent and the echoed responses received in the client terminal.
    - Observe the connection logs, received messages, sent responses, and thread activity logged in the server terminal.
- Disconnect: Type `quit` in the client terminal (or press `Ctrl+D`). Observe the disconnection messages in both terminals.
- Shutdown Server: Press `Ctrl+C` in the server terminal. Observe the shutdown messages.
Conclusion (Part 1): You have successfully run a basic client-server application using Python's `socket` module, demonstrating direct, low-level TCP stream communication. You saw how data is sent as bytes and requires manual encoding/decoding, and how threading can be used for concurrency on the server.
Part 2: Fetching Public IP Info with requests
Goal: Write a script that uses the `requests` library to query a public API (like `ipinfo.io` or `ip-api.com`) to find your public IP address and associated geolocation information.
Steps:
- Set Up:
    - Navigate to your project directory.
    - Ensure your virtual environment is active.
    - Install `requests` if you haven't already: `pip install requests`
    - Create a new Python file named `ip_info_fetcher.py` (`nano ip_info_fetcher.py`).
- Write the Code: Enter the following Python code. We'll use `ip-api.com` as it offers a simple JSON endpoint without requiring an API key for basic use.

# File: ip_info_fetcher.py
"""
Fetches public IP address and geolocation information
using requests and the ip-api.com service.
"""
import requests
import json  # For pretty printing the output

# API endpoint URL - queries info for the IP address making the request
IP_API_URL = "http://ip-api.com/json/"  # Note: HTTP, not HTTPS for this free endpoint

def get_public_ip_info():
    """
    Fetches public IP information from ip-api.com.

    Returns:
        dict or None: A dictionary containing IP info if successful, None otherwise.
    """
    print(f"Querying API endpoint: {IP_API_URL}")
    try:
        # Set a reasonable timeout
        response = requests.get(IP_API_URL, timeout=10)
        # Check for HTTP errors (4xx, 5xx)
        response.raise_for_status()
        # Parse the JSON response
        ip_data = response.json()
        # Check the status field within the JSON response from ip-api
        if ip_data.get("status") == "success":
            return ip_data
        else:
            error_message = ip_data.get("message", "Unknown error from API")
            print(f"Error: API reported failure - {error_message}")
            return None
    except requests.exceptions.Timeout:
        print("Error: Request to IP API timed out.")
        return None
    except requests.exceptions.ConnectionError:
        print("Error: Could not connect to IP API. Check network.")
        return None
    except requests.exceptions.HTTPError as http_err:
        print(f"HTTP Error occurred: {http_err}")
        return None
    except requests.exceptions.JSONDecodeError:
        print("Error: Failed to decode JSON response from IP API.")
        return None
    except requests.exceptions.RequestException as req_err:
        print(f"An error occurred during the request: {req_err}")
        return None
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        return None

def display_ip_info(info_dict):
    """Prints selected IP information nicely."""
    if not info_dict:
        print("No IP information to display.")
        return
    print("\n--- Public IP Information ---")
    # Use .get() for safe access, providing 'N/A' as default
    print(f"  IP Address: {info_dict.get('query', 'N/A')}")
    print(f"  Country:    {info_dict.get('country', 'N/A')} ({info_dict.get('countryCode', 'N/A')})")
    print(f"  Region:     {info_dict.get('regionName', 'N/A')} ({info_dict.get('region', 'N/A')})")
    print(f"  City:       {info_dict.get('city', 'N/A')}")
    print(f"  ZIP Code:   {info_dict.get('zip', 'N/A')}")
    print(f"  Latitude:   {info_dict.get('lat', 'N/A')}")
    print(f"  Longitude:  {info_dict.get('lon', 'N/A')}")
    print(f"  ISP:        {info_dict.get('isp', 'N/A')}")
    print(f"  Org:        {info_dict.get('org', 'N/A')}")
    print(f"  AS Info:    {info_dict.get('as', 'N/A')}")
    print("-----------------------------")
    # Uncomment below to see all fields returned by the API
    # print("\nRaw API Response:")
    # print(json.dumps(info_dict, indent=2))

# --- Main Execution ---
if __name__ == "__main__":
    print("Fetching your public IP information...")
    ip_information = get_public_ip_info()
    if ip_information:
        display_ip_info(ip_information)
    else:
        print("Failed to retrieve IP information.")
- Save and Exit: Save the `ip_info_fetcher.py` file.
- Run the Script:
    - Execute the script from your terminal (ensure the venv is active and `requests` is installed): `python ip_info_fetcher.py`
    - Observe:
        - The script will print the API URL it's contacting.
        - If successful, it will display your public IP address and associated geolocation data (country, region, city, ISP, etc.) obtained from the API response. The accuracy of geolocation can vary.
        - If there's a network issue, timeout, or an error from the API, the script should print an informative error message thanks to the `try...except` blocks.
Code Explanation Recap:
- `requests.get(URL, timeout=...)`: Made a simple HTTP GET request to the API endpoint.
- `response.raise_for_status()`: Checked for HTTP-level errors (like 404 Not Found, 500 Server Error).
- `response.json()`: Parsed the JSON response from the API directly into a Python dictionary (`ip_data`).
- API Specific Check: Checked the `"status"` field within the returned JSON, as required by the `ip-api.com` service, to ensure the API call itself was successful.
- Error Handling: Used specific `requests.exceptions` (Timeout, ConnectionError, HTTPError, JSONDecodeError, RequestException) to catch and report different kinds of network or API problems gracefully.
- `.get(key, default)`: Used the dictionary's `.get()` method to safely access fields in the response dictionary, preventing `KeyError` if a field is unexpectedly missing.
Conclusion (Part 2): You have successfully used the high-level `requests` library to interact with a public web API. You fetched data, handled the JSON response, and implemented robust error checking. Compare the simplicity of this code for making an HTTP request to the complexity of the raw socket code in Part 1: `requests` makes working with web services much easier.
12. Introduction to Python Web Frameworks Flask or Django
While Python can handle low-level networking with `socket` and make HTTP requests with `requests`, building complete web applications (websites, APIs) typically involves using a web framework. Frameworks provide structure, tools, and abstractions to handle common web development tasks like routing requests, handling user sessions, interacting with databases, and rendering HTML templates, allowing developers to focus on application logic.
Two of the most popular Python web frameworks are Flask and Django.
Core Concepts of Web Frameworks
- Request-Response Cycle: The fundamental model of web interaction. A client (web browser) sends an HTTP Request to a server. The server (running your web application via the framework) processes the request and sends back an HTTP Response.
- Routing: Mapping incoming URL paths (e.g., `/home`, `/users/profile`) to specific Python functions (often called view functions or handlers) that will process the request for that URL.
- Response Object: Your view function creates and returns a response, often by rendering a template or returning data (like JSON). The framework converts this into a proper HTTP Response object with headers, status code, and body.
- Templates: Files (usually HTML, but can be others) containing placeholders for dynamic data. A template engine fills these placeholders with actual values from your application before sending the HTML to the client. This separates presentation logic (HTML structure) from application logic (Python code). Common engines include Jinja2 (used by Flask) and Django's own template engine.
- ORM (Object-Relational Mapper): A library that abstracts database interactions. It allows you to define database table structures as Python classes (models) and interact with the database using Python objects and methods, rather than writing raw SQL queries. Django has a powerful built-in ORM; Flask typically integrates with external ORMs like SQLAlchemy.
- MVC (Model-View-Controller) / MVT (Model-View-Template): Architectural patterns describing how to organize code:
- Model: Represents the application's data structure (often database tables via an ORM). Handles data logic and persistence.
- View: (In Django, the 'View' is more like the Controller) The logic that handles incoming requests, interacts with the Model to get data, decides what response to send, and often selects a Template. (In Flask, the view function directly handles this).
- Controller: (In classic MVC, this is the View's role in Django/Flask) Receives input, interacts with Model/View.
- Template: (Django's 'T' in MVT) Handles the presentation logic (HTML rendering). Flask uses templates but isn't strictly MVT.
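Before looking at the frameworks themselves, the template idea can be illustrated in isolation. The following is a minimal, hedged sketch using the standalone Jinja2 engine (installable with `pip install Jinja2`); the template text and data are made up for illustration.

from jinja2 import Template

# A template mixes static markup with placeholders and simple control flow
page_template = Template("""
<h1>{{ heading }}</h1>
<ul>
{% for user in users %}  <li>{{ user }}</li>
{% endfor %}</ul>
""")

# Rendering substitutes context data into the placeholders and returns a string
html = page_template.render(heading="Active Users", users=["alice", "bob"])
print(html)

Flask uses this same engine under the hood, which is why the placeholder syntax in the Flask examples below looks identical.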
Flask: The Microframework
- Philosophy: Flask is a "microframework". It provides the core essentials for web development (routing, request/response handling, template rendering via Jinja2) but stays out of your way for other decisions. It doesn't mandate a specific project structure, database layer, or authentication system.
- Simplicity: Known for its simple API and minimal boilerplate code, making it easy to get started with small to medium-sized applications or APIs.
- Flexibility: You choose and integrate the extensions or libraries you need for databases (SQLAlchemy, Peewee), forms (WTForms), authentication, etc. This gives you more control but requires more setup for larger projects.
- Explicit: Flask tends to be more explicit in its design.
- Use Cases: Excellent for REST APIs, small web applications, prototypes, or projects where you want fine-grained control over the components used.
Simple Flask Example:
# File: simple_flask_app.py
from flask import Flask, request, render_template_string, redirect, url_for, abort
# Create a Flask application instance
# __name__ helps Flask find templates/static files relative to this script
app = Flask(__name__)
# --- In-memory "database" (for simplicity) ---
posts = {
1: {'title': 'First Post', 'content': 'This is the content of the first post.'},
2: {'title': 'Second Post', 'content': 'Another interesting post here.'}
}
next_post_id = 3
# --- Routes and View Functions ---
# Route for the homepage ('/')
@app.route('/')
def index():
"""View function for the homepage."""
# Simple HTML string response using Jinja2 templating within the string
# In real apps, use render_template('index.html') and put HTML in a 'templates' folder
page_title = "Welcome to SimpleBlog"
# url_for() generates URLs based on the function name, safer than hardcoding
# Pass data (title, posts dictionary) to the template context
return render_template_string("""
<!DOCTYPE html>
<html><head><title>{{ title }}</title></head>
<body>
<h1>{{ title }}</h1>
<p>Welcome to our simple blog powered by Flask!</p>
<h2>Posts</h2>
<ul>
{% for post_id, post_data in posts.items() %}
<li><a href="{{ url_for('show_post', post_id=post_id) }}">{{ post_data.title }}</a></li>
{% else %}
<li>No posts yet!</li>
{% endfor %}
</ul>
<hr>
<p><a href="{{ url_for('new_post') }}">Create New Post</a></p>
</body></html>
""", title=page_title, posts=posts) # Pass data to the template
# Route for displaying a single post (dynamic route)
# <int:post_id> captures the integer value from the URL path and passes it to the function
@app.route('/post/<int:post_id>')
def show_post(post_id):
"""View function to display a specific post."""
post = posts.get(post_id) # Safely get post from dict
if not post:
# Use Flask's abort() helper to return a standard 404 Not Found response
abort(404, description="Post Not Found")
# Render the post details using another template string
return render_template_string("""
<!DOCTYPE html>
<html><head><title>{{ post.title }}</title></head>
<body>
<h1>{{ post.title }}</h1>
<p>{{ post.content }}</p>
<hr>
<a href="{{ url_for('index') }}">Back to Home</a>
</body></html>
""", post=post)
# Route for creating a new post (handles GET and POST requests)
@app.route('/post/new', methods=['GET', 'POST'])
def new_post():
"""View function to display form (GET) or handle form submission (POST)."""
global next_post_id # Allow modifying the global variable (not ideal in real apps!)
if request.method == 'POST':
# Handle the submitted form data
# request.form is a dictionary-like object holding POST form data
title = request.form.get('title')
content = request.form.get('content')
# Basic server-side validation
if not title or not content:
# Basic error feedback (in real app, re-render form with errors)
return "Error: Title and Content are required!", 400
# Add the new post to our in-memory "database"
new_id = next_post_id
posts[new_id] = {'title': title, 'content': content}
next_post_id += 1
print(f"New post created: ID={new_id}, Title='{title}'") # Server log
# Redirect the user to the newly created post's page using url_for
# redirect() creates a 302 Found response
return redirect(url_for('show_post', post_id=new_id))
else: # request.method == 'GET'
# Display the HTML form to create a new post
return render_template_string("""
<!DOCTYPE html>
<html><head><title>Create New Post</title></head>
<body>
<h1>Create New Post</h1>
<form method="POST">
<label for="title">Title:</label><br>
<input type="text" id="title" name="title" size="50" required><br><br>
<label for="content">Content:</label><br>
<textarea id="content" name="content" rows="5" cols="50" required></textarea><br><br>
<input type="submit" value="Create Post">
</form>
<hr>
<a href="{{ url_for('index') }}">Cancel</a>
</body></html>
""")
# --- Run the Application ---
if __name__ == '__main__':
# Runs the built-in Flask development server
# debug=True enables auto-reloading on code changes and provides detailed error pages
# DO NOT use debug=True in production environments! Use a production WSGI server like Gunicorn or uWSGI.
print("Starting Flask development server...")
print("Access at: http://127.0.0.1:5000/")
# host='0.0.0.0' makes it accessible from other devices on your network
app.run(host='0.0.0.0', port=5000, debug=True)
Django: The "Batteries-Included" Framework
- Philosophy: Django takes an "everything you need" or "batteries-included" approach. It bundles a vast amount of functionality needed for complex web applications directly into the core framework: a powerful ORM, an automatic admin interface, an authentication system, its own template engine, form handling libraries, security middleware (CSRF protection, XSS filtering, etc.), caching frameworks, and more.
- Convention over Configuration: Django emphasizes following established conventions for project structure, naming, and workflow. This can speed up development significantly once learned, as many decisions are pre-made, and it promotes consistency across projects.
- Scalability: Built with large-scale applications in mind, providing tools and patterns that support high traffic and complex data models. It powers major websites like Instagram, Pinterest, and Disqus.
- ORM (Object-Relational Mapper): Django's built-in ORM is a cornerstone feature. It allows developers to define database models as Python classes and interact with the database using Pythonic queries, largely abstracting away raw SQL.
- Admin Interface: Django can automatically generate a fully functional, production-ready web interface for administrators to manage the application's data models (Create, Read, Update, Delete operations). This is a massive time-saver for many projects.
- Use Cases: Ideal for content management systems (CMS), social networks, e-commerce platforms, scientific platforms, and generally larger, database-centric web applications that require user authentication, an admin backend, and robust features out-of-the-box.
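As a hedged illustration of how little code the automatic admin interface requires, an app's `admin.py` for a blog-style `Post` model (similar to the one sketched in the walk-through below) could look roughly like this; the field names are illustrative.

# File: blog/admin.py  (illustrative sketch only)
from django.contrib import admin
from .models import Post  # assumes a Post model like the one shown below

# Registering the model is enough to get list/add/edit/delete pages under /admin/
@admin.register(Post)
class PostAdmin(admin.ModelAdmin):
    list_display = ("title", "author", "published_date")  # columns shown in the change list
    search_fields = ("title", "content")                  # enables the admin search box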
Conceptual Django Example Structure (Not runnable code, illustrates concepts):
- Start Project: `django-admin startproject mydjangosite` creates the main project directory and configuration files.

    mydjangosite/
        manage.py            # Command-line utility (run server, migrations, etc.)
        mydjangosite/        # Project Python package
            __init__.py
            settings.py      # Database, installed apps, middleware, templates config
            urls.py          # Main URL routing configuration (maps paths to apps/views)
            asgi.py          # ASGI configuration for async servers
            wsgi.py          # WSGI configuration for traditional servers

- Start App: `cd mydjangosite` then `python manage.py startapp blog` creates a reusable application directory for blog-related features.

    mydjangosite/
        ... (project files) ...
        blog/                    # The 'blog' application directory
            __init__.py
            admin.py             # Register models to appear in the admin site
            apps.py              # Application configuration
            migrations/          # Directory to store database schema changes
                __init__.py
            models.py            # Define database models using Django ORM classes
            tests.py             # Application-specific tests
            urls.py              # (Optional) Application-specific URL routing
            views.py             # Define view functions or classes (request handlers)
            templates/           # (Convention) Directory for HTML templates
                blog/            # Namespace templates by app name
                    post_list.html
                    post_detail.html

- Define Model (`blog/models.py`): Map Python classes to database tables.

    from django.db import models
    from django.utils import timezone
    from django.contrib.auth.models import User  # Example: Link to User model

    class Post(models.Model):
        title = models.CharField(max_length=250)
        content = models.TextField()
        published_date = models.DateTimeField(default=timezone.now)
        author = models.ForeignKey(User, on_delete=models.CASCADE)  # Link to built-in User

        def __str__(self):  # String representation in admin/shell
            return self.title

- Register App: Add `'blog.apps.BlogConfig'` to `INSTALLED_APPS` in `mydjangosite/settings.py`.
- Run Migrations: `python manage.py makemigrations blog` (creates migration files based on model changes) and `python manage.py migrate` (applies migrations to the database, creating/altering tables).
- Define Views (`blog/views.py`): Handle requests and interact with models.

    from django.shortcuts import render, get_object_or_404
    from .models import Post

    def post_list(request):  # Request object contains request details
        posts = Post.objects.filter(published_date__isnull=False).order_by('-published_date')
        # Renders template with context data
        return render(request, 'blog/post_list.html', {'posts': posts})

    def post_detail(request, pk):  # Primary key (pk) captured from URL
        post = get_object_or_404(Post, pk=pk)  # Get Post or raise 404
        return render(request, 'blog/post_detail.html', {'post': post})

- Define URLs (`blog/urls.py` - app level): Map URL patterns within the app to views.
- Include App URLs in Project (`mydjangosite/urls.py`): Delegate URL patterns starting with `blog/` to the `blog` app's `urls.py`.
- Create Templates (`blog/templates/blog/post_list.html`, etc.): Use Django's template language (variables `{{ post.title }}`, tags `{% for post in posts %}`) to display data from the context passed by the views.
- Run Development Server: `python manage.py runserver`.
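To get a feel for the ORM defined in the model step above, you can experiment interactively. The following is a hedged sketch of a session inside `python manage.py shell`; it assumes the migrations have been applied, and the username/password values are placeholders.

# Run inside the project's interactive shell:  python manage.py shell
from django.contrib.auth.models import User
from blog.models import Post

# Create an author and a post without writing any SQL
author = User.objects.create_user(username="demo", password="example-password")
post = Post.objects.create(title="Hello ORM", content="Created via the Django ORM.", author=author)

# Query back: filter, order, and access fields as plain Python attributes
latest = Post.objects.filter(author=author).order_by("-published_date").first()
print(latest.title, latest.published_date)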
This structure highlights Django's more comprehensive, "batteries-included", and convention-driven nature compared to Flask.
Flask vs. Django: Which to Choose?
- Choose Flask if:
- You are building smaller applications, microservices, or APIs.
- You value simplicity and want maximum flexibility in choosing components (database layer, forms library, etc.).
- You prefer explicit configuration over convention.
- The learning curve for the core framework seems less steep.
- Choose Django if:
- You are building larger, complex, database-driven applications (CMS, social networks, e-commerce).
- You want a feature-rich framework with ORM, admin, auth, security, etc., built-in and well-integrated.
- You value rapid development through convention and built-in tools.
- A clear, enforceable project structure is beneficial for team collaboration.
- Scalability and security features are primary concerns.
Both are mature, powerful frameworks with excellent documentation and strong community support. The "better" choice depends entirely on the project's specific needs and the developer's preferences. Often, starting with Flask for smaller projects and moving to Django for larger ones is a common path, but both can handle a wide range of application sizes.
Workshop Introduction to Web Frameworks (Flask)
This workshop provides a hands-on introduction to Flask by building a very simple web application that displays basic Linux system information (CPU Model and Memory) fetched using functions similar to those in the File I/O workshop.
Goal: Learn the basics of Flask: creating an app, defining routes using decorators, writing view functions that return HTML, passing dynamic data to templates (using simple string formatting here), and running the Flask development server.
Project: Basic System Info Web Viewer
Steps:
- Set Up Project Directory and Environment:
    - Open your terminal.
    - Create a new project directory: `mkdir flask_sysinfo`
    - Navigate into it: `cd flask_sysinfo`
    - Create and activate a virtual environment (for example, `python3 -m venv venv` followed by `source venv/bin/activate`).
    - Install Flask: `pip install Flask`
    - Freeze dependencies: `pip freeze > requirements.txt`
-
Create Utility Functions (Adapting from previous workshop):
- Create a file named
sysinfo_utils.py
:nano sysinfo_utils.py
- Add the following parsing functions (simplified/adapted from the File I/O workshop example):
# File: sysinfo_utils.py
"""Utility functions to get basic system info from /proc."""

PROC_CPUINFO_PATH = '/proc/cpuinfo'
PROC_MEMINFO_PATH = '/proc/meminfo'

def get_cpu_model_name():
    """Parses /proc/cpuinfo to extract CPU model name."""
    model_name = "N/A"  # Default value
    try:
        with open(PROC_CPUINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                if line.strip().startswith('model name'):
                    # Take the first model name found
                    model_name = line.split(':', 1)[1].strip()
                    break  # Stop after finding the first one
    except (FileNotFoundError, PermissionError, IndexError) as e:
        print(f"Warning: Could not read/parse CPU info from {PROC_CPUINFO_PATH}: {e}")
    except Exception as e:
        print(f"Warning: Unexpected error reading CPU info: {e}")
    return model_name

def get_memory_info_kib():
    """
    Parses /proc/meminfo for Total and Available memory (returns KiB).

    Returns:
        tuple: (total_kib: int or 'N/A', available_kib: int or 'N/A')
    """
    mem_total_kib = 'N/A'
    mem_avail_kib = 'N/A'
    keys_map = {'MemTotal': 'total', 'MemAvailable': 'available'}
    found = {'total': False, 'available': False}
    try:
        with open(PROC_MEMINFO_PATH, 'r', encoding='utf-8') as f:
            for line in f:
                parts = line.strip().split(':', 1)
                if len(parts) == 2:
                    key = parts[0]
                    if key == 'MemTotal' and not found['total']:
                        try:
                            mem_total_kib = int(parts[1].strip().split()[0])
                            found['total'] = True
                        except (ValueError, IndexError):
                            pass
                    elif key == 'MemAvailable' and not found['available']:
                        try:
                            mem_avail_kib = int(parts[1].strip().split()[0])
                            found['available'] = True
                        except (ValueError, IndexError):
                            pass
                # Optimization: Stop if we found both needed keys
                if found['total'] and found['available']:
                    break
        # Basic fallback if MemAvailable wasn't found (simplified)
        if not found['available']:
            print("Warning: 'MemAvailable' not found. Check /proc/meminfo format.")
            # Could add MemFree fallback here if desired
    except (FileNotFoundError, PermissionError) as e:
        print(f"Error: Could not read {PROC_MEMINFO_PATH}: {e}")
    except Exception as e:
        print(f"Warning: Unexpected error reading memory info: {e}")
    return mem_total_kib, mem_avail_kib
- Save and exit.
- Create a file named
-
Create the Flask Application (
app.py
):- Create the main application file:
nano app.py
- Add the following Flask code:
# File: app.py
"""
A simple Flask web application to display basic system information.
"""
# Import Flask class and render_template_string function
from flask import Flask, render_template_string
# Import our utility functions from the sibling file
from sysinfo_utils import get_cpu_model_name, get_memory_info_kib

# Create the Flask application instance
# __name__ tells Flask where to look for resources like templates/static files
app = Flask(__name__)

# --- Route Definition ---
# Use the @app.route() decorator to bind a URL path ('/') to a function
@app.route('/')
def system_info_page():
    """View function executed when a request comes to the root URL ('/')."""
    print("Request received for system info page...")  # Log to console

    # --- Gather Data ---
    # Call our utility functions to get system info
    cpu_model = get_cpu_model_name()
    mem_total_kib, mem_avail_kib = get_memory_info_kib()

    # Optional: Convert KiB to MiB/GiB for display (handle potential 'N/A')
    def format_mem(kib_val):
        if isinstance(kib_val, int):
            if kib_val >= 1024 * 1024:  # If >= 1 GiB
                return f"{kib_val / (1024**2):.2f} GiB"
            elif kib_val >= 1024:  # If >= 1 MiB
                return f"{kib_val / 1024:.1f} MiB"
            else:
                return f"{kib_val} KiB"
        return "N/A"  # Return N/A if input wasn't an int

    mem_total_str = format_mem(mem_total_kib)
    mem_avail_str = format_mem(mem_avail_kib)

    # --- Prepare HTML Template (using render_template_string) ---
    # This approach embeds HTML directly in the Python file.
    # For real apps, create a 'templates' folder and use render_template('my_template.html')
    html_template = """
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Linux System Info</title>
        <style>
            body { font-family: sans-serif; margin: 2em; background-color: #f8f8f8; color: #333; }
            h1 { color: #0056b3; border-bottom: 2px solid #ccc; padding-bottom: 10px; }
            table { border-collapse: collapse; width: 500px; margin-top: 20px; background-color: #fff; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }
            th, td { border: 1px solid #e1e1e1; padding: 10px; text-align: left; }
            th { background-color: #e9ecef; font-weight: bold; }
            tr:nth-child(even) { background-color: #f9f9f9; }
        </style>
    </head>
    <body>
        <h1>Basic Linux System Information</h1>
        <table>
            <tr>
                <th>Metric</th>
                <th>Value</th>
            </tr>
            <tr>
                <td>CPU Model</td>
                <td>{{ cpu_model_name }}</td>
            </tr>
            <tr>
                <td>Total Memory</td>
                <!-- Display original KiB and formatted string -->
                <td>{{ total_kib }} KiB ({{ total_mem_str }})</td>
            </tr>
            <tr>
                <td>Available Memory</td>
                <td>{{ avail_kib }} KiB ({{ avail_mem_str }})</td>
            </tr>
        </table>
        <footer>
            <p><small>Data fetched dynamically via Python/Flask.</small></p>
        </footer>
    </body>
    </html>
    """

    # --- Render and Return Response ---
    # render_template_string processes the template, substituting placeholders.
    # Pass the collected data variables to the template context using keyword arguments.
    return render_template_string(
        html_template,
        cpu_model_name=cpu_model,
        total_kib=mem_total_kib,
        avail_kib=mem_avail_kib,
        total_mem_str=mem_total_str,
        avail_mem_str=mem_avail_str
    )

# --- Run the Application ---
if __name__ == '__main__':
    # Start the built-in Flask development server
    # host='0.0.0.0' makes the server accessible from other devices on the network
    # debug=True enables auto-reloading and interactive debugger (DO NOT USE IN PRODUCTION)
    print("Starting Flask development server...")
    print("Access the app in your browser:")
    print("  - http://127.0.0.1:5000/")
    print("  - http://<your-linux-ip>:5000/ (if accessing from another device)")
    # The run command blocks until you stop it (Ctrl+C)
    app.run(host='0.0.0.0', port=5000, debug=True)
- Save and exit.
- Create the main application file:
-
Run the Flask Application:
- In your terminal (in the
flask_sysinfo
directory with the virtual environment active), run: - Observe: The terminal will show output indicating the Flask server is running, similar to:
* Serving Flask app 'app' * Debug mode: on WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Running on all addresses (0.0.0.0) * Running on http://127.0.0.1:5000 * Running on http://<your_actual_ip>:5000 Press CTRL+C to quit * Restarting with stat * Debugger is active! * Debugger PIN: ... Starting Flask development server... Access the app in your browser: - http://127.0.0.1:5000/ - http://<your-linux-ip>:5000/ (if accessing from another device)
- In your terminal (in the
-
Access the Web Page:
- Open a web browser (like Firefox or Chrome) on your Linux machine.
- Navigate to the address
http://127.0.0.1:5000/
(orhttp://localhost:5000/
). - Observe: You should see a web page displaying a table with the CPU model name and memory information fetched from your
/proc
filesystem. The data is dynamically inserted into the HTML template by Flask. - Check the terminal where
python app.py
is running. You should see log messages indicating requests being received (e.g.,127.0.0.1 - - [Date Time] "GET / HTTP/1.1" 200 -
).
-
Stop the Server: Go back to the terminal where the Flask app is running and press
Ctrl+C
.
Code Explanation Recap:
from flask import Flask, render_template_string
: Imported the necessary Flask components.from sysinfo_utils import ...
: Imported the functions we created to get system data.app = Flask(__name__)
: Created an instance of the Flask application.@app.route('/')
: This is a decorator. It registers the function immediately following it (system_info_page
) as the handler for requests to the root URL (/
).system_info_page()
: This is the view function. Flask executes this function when a request matches the associated route.- Inside the view function:
- It called the utility functions (
get_cpu_model_name
,get_memory_info_kib
) to gather the dynamic data. - It defined an HTML structure as a multi-line string (
html_template
). This string contained Jinja2 template placeholders like{{ cpu_model_name }}
. - It called
render_template_string(html_template, **context)
. This function processes the template string, replacing the placeholders with the values of the variables passed as keyword arguments (the context). - The rendered HTML string is what Flask sends back to the browser as the HTTP response body.
- It called the utility functions (
if __name__ == '__main__':
: Standard Python entry point guard.app.run(host='0.0.0.0', port=5000, debug=True)
: Started Flask's built-in development web server.host='0.0.0.0'
made it listen on all available network interfaces.port=5000
specified the port number.debug=True
enabled useful development features (auto-reload on code changes, interactive debugger in the browser on errors) – never usedebug=True
in production.
Conclusion: You have built and run your first simple web application using the Flask microframework. You learned how to create a Flask app instance, define a route, write a view function to handle requests for that route, gather dynamic data, render an HTML response (using `render_template_string`), and run the development server. This provides a basic foundation for building more complex web applications and APIs with Python on Linux.
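The inline `render_template_string` approach keeps the example in one file, but as noted above, real applications usually keep HTML in a `templates/` directory and call `render_template` instead. The following is a hedged sketch of that variant; the file names `app_with_templates.py` and `templates/sysinfo.html` are illustrative, and it reuses the same `sysinfo_utils` helpers.

# File: app_with_templates.py  (hypothetical variant of app.py)
# Expects templates/sysinfo.html next to this file, containing the same Jinja2
# placeholders as the inline template above ({{ cpu_model_name }}, {{ total_kib }}, ...).
from flask import Flask, render_template

from sysinfo_utils import get_cpu_model_name, get_memory_info_kib

app = Flask(__name__)  # Flask looks for a 'templates' directory next to this file

@app.route('/')
def system_info_page():
    total_kib, avail_kib = get_memory_info_kib()
    # render_template loads templates/sysinfo.html and fills in the context
    return render_template(
        'sysinfo.html',
        cpu_model_name=get_cpu_model_name(),
        total_kib=total_kib,
        avail_kib=avail_kib,
    )

if __name__ == '__main__':
    app.run(debug=True)  # development server only, as discussed above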