Module Management and Exception Handling

Block 2: Programming and Database Skills

Topic 2.2 · 2 Objectives

2.2.1 Import Modules and Manage Packages with PIP

Python's power is amplified by its extensive ecosystem of modules and packages. A module is a single .py file containing definitions and statements. A package is a directory of modules organized with an __init__.py file. Understanding how to import, manage, and create modules is fundamental to effective Python development.

Standard, Selective, and Aliased Imports

Python provides several import styles, each suited for different scenarios:

1. Standard Import: import module

Imports the entire module. You access its contents with dot notation.

# Standard import — import the entire module
import math

# Access functions via dot notation
result = math.sqrt(144)      # 12.0
pi_val = math.pi             # 3.141592653589793
log_val = math.log(100, 10)  # 2.0

import os
cwd = os.getcwd()  # Current working directory
Exam Tip:

With a standard import, you must always prefix function calls with the module name (e.g., math.sqrt()). Calling sqrt() alone will raise a NameError.
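A quick sketch of the difference:

```python
import math

print(math.sqrt(144))  # 12.0 — the module prefix is required

try:
    sqrt(144)  # The bare name was never imported into this namespace
except NameError as e:
    print(f"NameError: {e}")  # name 'sqrt' is not defined
```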

2. Selective Import: from module import name

Imports specific functions, classes, or variables directly into the current namespace.

# Selective import — import specific names
from math import sqrt, pi, ceil

result = sqrt(144)   # 12.0 (no prefix needed)
rounded = ceil(4.2)  # 5

# Import multiple items from collections
from collections import Counter, defaultdict

word_counts = Counter(["apple", "banana", "apple", "cherry"])
# Counter({'apple': 2, 'banana': 1, 'cherry': 1})

# Wildcard import (generally discouraged)
from math import *  # Brings everything into namespace — can cause name collisions
Warning:

Avoid from module import * in production code. It pollutes the namespace and makes it unclear where names originate, which can lead to subtle bugs when two modules export identically named functions.
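A minimal sketch of such a collision, using two standard-library modules that both happen to export a name `join`:

```python
# Both modules export a name `join` — the second wildcard import
# silently shadows the first
from os.path import *  # join(path, *paths) — builds file paths
from shlex import *    # join(split_command) — quotes shell arguments

# Which join is this? Only the import order decides.
print(join(["ls", "-l"]))  # 'ls -l' — shlex.join won; os.path.join is gone
```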

3. Aliased Import: import module as alias

Creates a shorter alias for frequently used modules. This is standard practice in the data science ecosystem.

# Aliased imports — data science conventions
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Use the alias for all calls
arr = np.array([1, 2, 3, 4, 5])
df = pd.DataFrame({"col": [10, 20, 30]})

# You can also alias selective imports
from datetime import datetime as dt

now = dt.now()

Summary of Import Styles

| Style | Syntax | Usage | Access |
| --- | --- | --- | --- |
| Standard | import math | Full module needed | math.sqrt() |
| Selective | from math import sqrt | Few specific names | sqrt() |
| Aliased | import numpy as np | Shorten long names | np.array() |
| Selective + alias | from datetime import datetime as dt | Avoid name conflicts | dt.now() |
| Wildcard | from math import * | Quick scripts only | sqrt() |

Python Standard Library Modules

Python ships with a rich standard library. For the exam, you should know these key modules and their purposes:

| Module | Purpose | Key Functions / Classes |
| --- | --- | --- |
| csv | Read/write CSV files | reader(), writer(), DictReader(), DictWriter() |
| os | OS interaction, file paths | getcwd(), listdir(), path.join(), path.exists() |
| math | Mathematical functions | sqrt(), ceil(), floor(), log(), pi |
| statistics | Statistical calculations | mean(), median(), stdev(), mode() |
| datetime | Date/time manipulation | datetime.now(), timedelta, strftime(), strptime() |
| collections | Specialized data structures | Counter, defaultdict, OrderedDict, namedtuple |
| json | JSON encoding/decoding | load(), dump(), loads(), dumps() |
# csv module — reading CSV files
import csv

with open("sales.csv", "r") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["product"], row["revenue"])

# statistics module — quick stats
import statistics

data = [85, 90, 78, 92, 88, 76, 95]
print(statistics.mean(data))    # 86.28571428571429
print(statistics.median(data))  # 88
print(statistics.stdev(data))   # 7.0879...

# datetime module — working with dates
from datetime import datetime, timedelta

now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M"))  # e.g. 2026-04-25 14:30
one_week_ago = now - timedelta(days=7)
print(one_week_ago.strftime("%B %d, %Y"))  # e.g. April 18, 2026

# collections module — Counter for frequency analysis
from collections import Counter, defaultdict

colors = ["red", "blue", "red", "green", "blue", "red"]
freq = Counter(colors)
print(freq.most_common(2))  # [('red', 3), ('blue', 2)]

# defaultdict provides default values for missing keys
scores = defaultdict(list)
scores["Alice"].append(95)
scores["Bob"].append(87)
scores["Alice"].append(91)
print(scores)  # defaultdict(<class 'list'>, {'Alice': [95, 91], 'Bob': [87]})

Managing Packages with PIP

PIP (Pip Installs Packages) is Python's standard package manager. It downloads and installs packages from the Python Package Index (PyPI).

Essential PIP Commands

| Command | Description | Example |
| --- | --- | --- |
| pip install | Install a package | pip install pandas |
| pip install pkg==version | Install a specific version | pip install pandas==2.1.0 |
| pip uninstall | Remove a package | pip uninstall pandas |
| pip list | Show installed packages | pip list |
| pip freeze | Output installed packages in requirements format | pip freeze > requirements.txt |
| pip install -r | Install from a requirements file | pip install -r requirements.txt |
| pip show | Display package info | pip show numpy |
| pip install --upgrade | Upgrade a package | pip install --upgrade pandas |
# Terminal / command line — pip commands

# Install a package from PyPI
$ pip install pandas

# Install a specific version
$ pip install pandas==2.1.0

# Upgrade an existing package
$ pip install --upgrade pandas

# Uninstall a package
$ pip uninstall pandas

# List all installed packages
$ pip list
# Package    Version
# ---------- -------
# numpy      1.26.4
# pandas     2.2.1
# ...

# Export installed packages to a requirements file
$ pip freeze > requirements.txt

# Install all packages from a requirements file
$ pip install -r requirements.txt

# Show details about a specific package
$ pip show numpy
# Name: numpy
# Version: 1.26.4
# Summary: Fundamental package for array computing...
# Location: /usr/lib/python3/dist-packages
Key Distinction — pip list vs. pip freeze:

pip list displays packages in a human-readable table. pip freeze outputs in package==version format, ideal for generating requirements.txt files that can recreate an environment.

Creating and Importing Custom Modules

Any Python file can serve as a module. You can organize reusable code into your own modules and packages.

# File: data_utils.py (your custom module)
import statistics

def clean_column_name(name):
    """Standardize a column name to lowercase with underscores."""
    return name.strip().lower().replace(" ", "_")

def remove_outliers(data, threshold=3):
    """Remove values more than `threshold` std devs from the mean."""
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)
    return [x for x in data if abs(x - mean) <= threshold * stdev]

VALID_FORMATS = [".csv", ".json", ".xlsx", ".parquet"]
# File: main.py (importing your custom module)

# Standard import — the module file must be in the same directory
# or on Python's module search path (sys.path)
import data_utils

clean = data_utils.clean_column_name("  Sales Revenue ")
print(clean)  # "sales_revenue"

# Selective import
from data_utils import remove_outliers, VALID_FORMATS

# With only six values, the sample stdev is large (~77), so a tighter
# threshold of 2 is needed for 200 to count as an outlier
cleaned = remove_outliers([10, 12, 11, 200, 13, 9], threshold=2)
print(cleaned)  # [10, 12, 11, 13, 9]

Creating a Package

# Package directory structure:
# my_analytics/
#     __init__.py       <-- makes it a package
#     cleaning.py
#     analysis.py
#     visualization.py

# __init__.py can expose key names for convenience
from .cleaning import clean_column_name
from .analysis import run_summary

# In another script you can then do:
from my_analytics import clean_column_name
# or
from my_analytics.analysis import run_summary
Module Search Order:

When you import data_utils, Python searches in this order: (1) the current directory, (2) directories listed in the PYTHONPATH environment variable, (3) the standard library, (4) site-packages (where pip installs packages). You can inspect the search path with import sys; print(sys.path).
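You can verify the search path yourself; a minimal sketch (the inserted directory is a hypothetical example):

```python
import sys

# sys.path is a plain list of directory strings, searched in order
for entry in sys.path:
    print(entry)

# Prepending a directory makes Python look there first
# ("/opt/my_modules" is a hypothetical example path)
sys.path.insert(0, "/opt/my_modules")
```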

2.2.2 Exception Handling and Script Robustness

Robust data scripts anticipate and gracefully handle errors. Python's exception handling mechanism lets you catch errors at runtime and respond appropriately, rather than letting the entire program crash.

try / except / else / finally Blocks

The full exception handling structure has four clauses:

| Clause | When It Runs | Required? |
| --- | --- | --- |
| try | Code that might raise an exception | Yes |
| except | Runs if the specified exception occurs | Yes (at least one) |
| else | Runs only if no exception was raised | No |
| finally | Always runs, exception or not | No |
# Basic try/except
try:
    number = int(input("Enter a number: "))
    result = 100 / number
except ValueError:
    print("Invalid input. Please enter a valid integer.")
except ZeroDivisionError:
    print("Cannot divide by zero.")
# Full try/except/else/finally
def read_config(filepath):
    try:
        f = open(filepath, "r")
        content = f.read()
    except FileNotFoundError:
        print(f"Config file not found: {filepath}")
        content = None
    except PermissionError:
        print(f"No permission to read: {filepath}")
        content = None
    else:
        # Only runs if no exception occurred
        print(f"Successfully loaded {len(content)} characters.")
    finally:
        # Always runs — ideal for cleanup
        try:
            f.close()
        except NameError:
            pass  # f was never assigned if open() failed
    return content
Execution Flow:

try → if exception, jump to matching except; if no exception, run else. In both cases, finally always executes last. This makes finally perfect for releasing resources like file handles and database connections.

# Catching multiple exceptions in one handler
try:
    value = data[key]
    result = int(value)
except (KeyError, ValueError, TypeError) as e:
    print(f"Data error: {e}")
    result = None
# Using the exception object for details
try:
    scores = [85, 92, 78]
    print(scores[10])
except IndexError as e:
    print(f"Error type: {type(e).__name__}")  # IndexError
    print(f"Message: {e}")                    # list index out of range

Common Built-in Exceptions

Knowing which exception corresponds to which error is essential for the exam and for writing robust data code.

| Exception | Raised When | Typical Data Scenario |
| --- | --- | --- |
| ValueError | Right type, wrong value | int("abc"), float("N/A") |
| TypeError | Wrong type for operation | "5" + 3, passing wrong arg type |
| KeyError | Missing dictionary key | row["nonexistent_col"] |
| IndexError | Index out of range | data[100] on a 50-item list |
| FileNotFoundError | File does not exist | open("missing.csv") |
| ZeroDivisionError | Division by zero | Computing a ratio with a zero denominator |
| ImportError | Module cannot be imported | import nonexistent_lib |
| AttributeError | Object has no such attribute | Calling .append() on a tuple |
| NameError | Variable not defined | Using a variable before assignment |
# Demonstrating common exceptions

# ValueError
try:
    age = int("twenty")
except ValueError as e:
    print(f"ValueError: {e}")
# ValueError: invalid literal for int() with base 10: 'twenty'

# TypeError
try:
    result = "price: " + 49.99
except TypeError as e:
    print(f"TypeError: {e}")
# TypeError: can only concatenate str (not "float") to str

# KeyError
try:
    record = {"name": "Alice", "age": 30}
    email = record["email"]
except KeyError as e:
    print(f"KeyError: missing key {e}")
# KeyError: missing key 'email'

# FileNotFoundError
try:
    with open("data_2026.csv") as f:
        data = f.read()
except FileNotFoundError:
    print("File not found. Check the file path.")

# ZeroDivisionError
try:
    conversion_rate = 0 / 0
except ZeroDivisionError:
    print("Cannot compute rate: division by zero.")
    conversion_rate = 0.0

Raising Exceptions and Custom Messages

Use the raise statement to intentionally trigger exceptions when your code detects invalid conditions. This is critical for input validation in data pipelines.

# Raising exceptions for validation
def validate_age(age):
    if not isinstance(age, (int, float)):
        raise TypeError(f"Age must be numeric, got {type(age).__name__}")
    if age < 0 or age > 150:
        raise ValueError(f"Age must be between 0 and 150, got {age}")
    return True

# Usage
try:
    validate_age(-5)
except ValueError as e:
    print(e)  # Age must be between 0 and 150, got -5

try:
    validate_age("thirty")
except TypeError as e:
    print(e)  # Age must be numeric, got str
# Validating a dataset before processing
def validate_dataset(df):
    """Validate that a DataFrame meets minimum requirements."""
    required_cols = ["id", "date", "value"]
    if df.empty:
        raise ValueError("Dataset is empty.")
    missing = [col for col in required_cols if col not in df.columns]
    if missing:
        raise KeyError(f"Missing required columns: {missing}")
    if df["id"].duplicated().any():
        raise ValueError("Duplicate IDs found in dataset.")
    return True
# Re-raising an exception after logging
import logging

def process_file(path):
    try:
        with open(path) as f:
            data = f.read()
    except FileNotFoundError:
        logging.error(f"File not found: {path}")
        raise  # Re-raise the same exception
    return data

Interpreting Error Messages and Tracebacks

Python tracebacks are read bottom to top. The last line shows the exception type and message; lines above show the call stack with the most recent call at the bottom.

# Example traceback:
# Traceback (most recent call last):
#   File "pipeline.py", line 45, in <module>
#     result = process_records(data)
#   File "pipeline.py", line 32, in process_records
#     cleaned = clean_value(record["price"])
#   File "pipeline.py", line 18, in clean_value
#     return float(value)
# ValueError: could not convert string to float: 'N/A'
Reading Tracebacks:
  1. Bottom line: The exception type (ValueError) and the message (could not convert string to float: 'N/A').
  2. Second from bottom: The exact line of code that caused the error — return float(value).
  3. Working upward: The chain of function calls that led to the error, with file names and line numbers.
# Common traceback patterns in data code:

# 1. KeyError in pandas — misspelled column name
#    KeyError: 'reveneu'
#    Fix: Check df.columns and correct the spelling

# 2. FileNotFoundError — wrong path or filename
#    FileNotFoundError: [Errno 2] No such file or directory: 'dta/sales.csv'
#    Fix: Check os.path.exists() and correct the path

# 3. TypeError — operations on incompatible types
#    TypeError: unsupported operand type(s) for +: 'int' and 'str'
#    Fix: Convert types before operations

# 4. ImportError — module not installed
#    ModuleNotFoundError: No module named 'sklearn'
#    Fix: pip install scikit-learn

Real-World Error Handling in Data Code

Data processing scripts frequently encounter malformed data, missing files, and network issues. Here are practical patterns for robust data code.

Reading Files Safely

import csv
import os

def load_csv_safely(filepath):
    """Load a CSV file with comprehensive error handling."""
    if not os.path.exists(filepath):
        print(f"Error: File '{filepath}' does not exist.")
        return []
    if not filepath.endswith(".csv"):
        print(f"Warning: '{filepath}' may not be a CSV file.")
    rows = []
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            for row in reader:
                rows.append(row)
    except UnicodeDecodeError:
        print("Encoding error. Trying latin-1...")
        with open(filepath, "r", encoding="latin-1") as f:
            reader = csv.DictReader(f)
            rows = list(reader)
    except csv.Error as e:
        print(f"CSV parsing error: {e}")
    else:
        print(f"Successfully loaded {len(rows)} rows.")
    return rows

Parsing Data with Type Conversion

def parse_numeric_column(records, column):
    """Safely convert a column to float, handling errors."""
    parsed = []
    errors = []
    for i, record in enumerate(records):
        try:
            value = record[column]
            parsed.append(float(value))
        except KeyError:
            errors.append(f"Row {i}: column '{column}' not found")
        except (ValueError, TypeError):
            errors.append(f"Row {i}: cannot convert '{record.get(column)}' to float")
    if errors:
        print(f"Encountered {len(errors)} errors in column '{column}':")
        for err in errors[:5]:  # Show first 5 errors
            print(f"  - {err}")
    return parsed

Handling Common Pandas Errors

import pandas as pd

# Safe file loading with pandas
def load_dataframe(filepath):
    try:
        df = pd.read_csv(filepath)
    except FileNotFoundError:
        print(f"File not found: {filepath}")
        return pd.DataFrame()
    except pd.errors.EmptyDataError:
        print(f"File is empty: {filepath}")
        return pd.DataFrame()
    except pd.errors.ParserError as e:
        print(f"Parse error: {e}")
        return pd.DataFrame()
    else:
        print(f"Loaded DataFrame: {df.shape[0]} rows, {df.shape[1]} columns")
        return df

# Safe column access
def safe_column_mean(df, column):
    try:
        return df[column].mean()
    except KeyError:
        print(f"Column '{column}' not found. Available: {list(df.columns)}")
        return None
    except TypeError:
        print(f"Column '{column}' is not numeric.")
        return None

Complete Data Pipeline with Error Handling

import csv
import statistics

def analyze_sales(filepath):
    """End-to-end pipeline with robust error handling."""
    # Step 1: Load data
    try:
        with open(filepath, "r") as f:
            reader = csv.DictReader(f)
            records = list(reader)
    except FileNotFoundError:
        raise FileNotFoundError(f"Sales file not found: {filepath}")

    if not records:
        raise ValueError("No records found in file.")

    # Step 2: Parse and validate revenue values
    revenues = []
    skipped = 0
    for row in records:
        try:
            revenue = float(row["revenue"])
            if revenue < 0:
                raise ValueError("Negative revenue")
            revenues.append(revenue)
        except (ValueError, KeyError):
            skipped += 1
            continue

    # Step 3: Compute statistics
    try:
        result = {
            "total": sum(revenues),
            "mean": statistics.mean(revenues),
            "median": statistics.median(revenues),
            "stdev": statistics.stdev(revenues),
            "count": len(revenues),
            "skipped": skipped,
        }
    except statistics.StatisticsError:
        print("Not enough data to compute statistics.")
        return None
    return result

Best Practices for Robust Scripts

Catch Specific Exceptions, Not Bare except

A bare except: catches everything, including KeyboardInterrupt and SystemExit, which can make your program impossible to interrupt with Ctrl+C. Always specify exception types.

# BAD — bare except catches everything, hides bugs
try:
    result = process(data)
except:
    pass

# GOOD — catch specific exceptions
try:
    result = process(data)
except (ValueError, TypeError) as e:
    print(f"Processing error: {e}")
    result = None

# ACCEPTABLE — catch Exception (still skips SystemExit, KeyboardInterrupt)
try:
    result = process(data)
except Exception as e:
    logging.error(f"Unexpected error: {e}")
    result = None
Use finally for Resource Cleanup

Always close files, database connections, and network sockets in a finally block (or use with statements which handle this automatically).

# Best practice: use context managers (with statement)
with open("data.csv") as f:
    data = f.read()
# File is automatically closed, even if an exception occurs

# Equivalent manual approach with finally
f = None
try:
    f = open("data.csv")
    data = f.read()
finally:
    if f:
        f.close()
Validate Inputs Early

Check data types, ranges, and required fields at the start of functions. Raise descriptive exceptions for invalid inputs rather than letting cryptic errors surface later in the pipeline.

def calculate_growth_rate(current, previous):
    """Calculate percentage growth between two values."""
    if not isinstance(current, (int, float)):
        raise TypeError(f"current must be numeric, got {type(current).__name__}")
    if not isinstance(previous, (int, float)):
        raise TypeError(f"previous must be numeric, got {type(previous).__name__}")
    if previous == 0:
        raise ValueError("previous cannot be zero (division by zero).")
    return ((current - previous) / previous) * 100
Log Errors, Do Not Silence Them

Swallowing exceptions with pass masks bugs. At minimum, log the error so it can be diagnosed later.
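A minimal sketch of the difference (the `parse_price` helper is a hypothetical example):

```python
import logging

logging.basicConfig(level=logging.WARNING)

def parse_price(raw):
    """Convert raw input to float; log failures instead of hiding them."""
    try:
        return float(raw)
    except (ValueError, TypeError) as e:
        # BAD alternative: `except Exception: pass` — the failure vanishes
        logging.warning("Could not parse price %r: %s", raw, e)
        return None

print(parse_price("19.99"))  # 19.99
print(parse_price("N/A"))    # None, with a warning left in the log
```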

Use else to Separate Success Logic

Code in the else block only runs when try succeeds. This keeps error-handling code separate from normal flow, improving readability.
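A minimal sketch (the `load_lines` helper is a hypothetical example):

```python
def load_lines(path):
    """Return the file's lines, or [] if the file is missing."""
    try:
        f = open(path)
    except FileNotFoundError:
        print(f"Missing file: {path}")
        return []
    else:
        # Runs only when open() succeeded. An error raised while reading
        # here is NOT mistaken for a missing file, because this code sits
        # outside the try block.
        with f:
            return f.read().splitlines()

print(load_lines("definitely_missing.txt"))  # prints the warning, then []
```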

Practice Quiz: Module Management and Exception Handling

Test your knowledge with 10 multiple-choice questions. Answers and explanations follow each question.

Q1. What is the correct way to import only the sqrt function from the math module?
A) import sqrt from math
B) from math import sqrt
C) import math.sqrt
D) math import sqrt
Correct: B. The syntax from module import name imports a specific name from a module. Option A reverses the order. Option C would import the math module (not sqrt directly). Option D is invalid syntax.
Q2. What is the difference between pip list and pip freeze?
A) pip list shows only standard library modules; pip freeze shows all
B) They are identical commands with different names
C) pip list shows a human-readable table; pip freeze outputs in package==version format for requirements files
D) pip freeze locks packages so they cannot be upgraded
Correct: C. pip list displays installed packages in a formatted table. pip freeze outputs them in package==version format, which can be redirected to a requirements.txt file for environment reproducibility.
Q3. What exception is raised by int("hello")?
A) TypeError
B) ValueError
C) NameError
D) SyntaxError
Correct: B. int() receives a string (correct type), but "hello" is not a valid integer representation (wrong value). This triggers a ValueError. A TypeError would occur if you passed a type that int() cannot convert at all (e.g., a list).
Q4. When does the else block execute in a try/except/else/finally structure?
A) When an exception is caught by except
B) Always, regardless of exceptions
C) Only when no exception was raised in the try block
D) Only when a finally block is also present
Correct: C. The else clause runs only if the try block completes without raising any exception. It is useful for code that should run on success but should not be inside the try block (to avoid accidentally catching its exceptions).
Q5. What does the following code print?
import numpy as np
print(type(np))
A) <class 'numpy'>
B) <class 'module'>
C) <class 'package'>
D) <class 'alias'>
Correct: B. When you import a module (with or without an alias), the variable refers to a module object. The alias np is simply an alternate name for the same module object. type(np) returns <class 'module'>.
Q6. Which pip command generates a file that can recreate the current environment?
A) pip list > requirements.txt
B) pip freeze > requirements.txt
C) pip export > requirements.txt
D) pip save > requirements.txt
Correct: B. pip freeze outputs packages in the package==version format that pip install -r requirements.txt expects. pip list produces a formatted table that is not directly usable by pip install -r. Options C and D are not valid pip commands.
Q7. What happens when you run the following code?
try:
    x = 10 / 0
except ZeroDivisionError:
    print("A")
else:
    print("B")
finally:
    print("C")
A) Prints: A B C
B) Prints: A C
C) Prints: A
D) Prints: B C
Correct: B. The try block raises ZeroDivisionError, so the except block runs and prints "A". Since an exception occurred, the else block is skipped. The finally block always runs, printing "C". Result: A C.
Q8. What is the primary risk of using from module import *?
A) It is slower than a standard import
B) It only works with standard library modules
C) It can cause namespace collisions and makes it unclear where names originate
D) It installs the module from PyPI automatically
Correct: C. Wildcard imports dump all public names from a module into the current namespace. If two modules export a function with the same name, the second import silently overwrites the first. It also makes code harder to read because the origin of each name is unclear.
Q9. What exception type would you use to validate that a function argument is a positive number?
def set_quantity(qty):
    if qty <= 0:
        raise ???("Quantity must be positive.")
    return qty
A) TypeError
B) ValueError
C) KeyError
D) IndexError
Correct: B. The argument is the right type (a number), but the value is invalid (non-positive). ValueError is appropriate when the type is correct but the value does not meet requirements. TypeError would be used if the argument was the wrong type entirely (e.g., a string instead of a number).
Q10. Which standard library module provides Counter and defaultdict?
A) statistics
B) itertools
C) collections
D) functools
Correct: C. The collections module provides specialized container types including Counter (for counting hashable objects), defaultdict (dict with default factory), OrderedDict, namedtuple, and deque.
