Day 9: Standard Library Utilities
What You'll Learn Today
- collections module
- itertools module
- functools module
- re (regular expressions) module
- logging module
- argparse module
collections Module
Provides specialized container types that extend the built-in dict, list, and tuple.
Counter
Count element occurrences:
from collections import Counter
# Count characters
text = "hello world"
counter = Counter(text)
print(counter) # Counter({'l': 3, 'o': 2, 'h': 1, ...})
# Count list items
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
word_count = Counter(words)
print(word_count) # Counter({'apple': 3, 'banana': 2, 'cherry': 1})
# Most common elements
print(word_count.most_common(2)) # [('apple', 3), ('banana', 2)]
# Addition
counter1 = Counter(['a', 'b', 'a'])
counter2 = Counter(['a', 'c'])
print(counter1 + counter2) # Counter({'a': 3, 'b': 1, 'c': 1})
defaultdict
Provide default values for missing keys:
from collections import defaultdict
# List as default value
groups = defaultdict(list)
students = [
    ('A', 'Taro'),
    ('B', 'Hanako'),
    ('A', 'Jiro'),
    ('B', 'Yuki')
]
for group, name in students:
    groups[group].append(name)
print(dict(groups)) # {'A': ['Taro', 'Jiro'], 'B': ['Hanako', 'Yuki']}
# int as default value (counter)
counter = defaultdict(int)
for word in ['apple', 'banana', 'apple']:
    counter[word] += 1
print(dict(counter)) # {'apple': 2, 'banana': 1}
namedtuple
Tuples whose fields can be accessed by name as well as by index:
from collections import namedtuple
# Definition
Point = namedtuple('Point', ['x', 'y'])
Person = namedtuple('Person', 'name age city')
# Creation
p = Point(3, 4)
print(p.x, p.y) # 3 4
person = Person('Taro', 25, 'Tokyo')
print(person.name) # Taro
# Works as tuple
print(p[0], p[1]) # 3 4
# Unpacking
x, y = p
print(x, y) # 3 4
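Because a namedtuple is still a tuple, its fields cannot be reassigned. The sketch below shows what that looks like in practice, using the `_replace` and `_asdict` helper methods (not covered above) to derive a changed copy and to convert to a dictionary.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)

# Fields are read-only: assigning to one raises AttributeError
try:
    p.x = 10
except AttributeError:
    print("cannot assign to a namedtuple field")

# _replace returns a new tuple with the given fields changed
moved = p._replace(x=10)
print(moved)  # Point(x=10, y=4)
print(p)      # Point(x=3, y=4)

# _asdict maps field names to values; dict() normalizes the return type
as_dict = dict(p._asdict())
print(as_dict)  # {'x': 3, 'y': 4}
```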
deque
Double-ended queue (efficient add/remove):
from collections import deque
# Create
d = deque([1, 2, 3])
# Add/remove from both ends
d.append(4) # Add right: [1, 2, 3, 4]
d.appendleft(0) # Add left: [0, 1, 2, 3, 4]
d.pop() # Remove right: 4
d.popleft() # Remove left: 0
# Max size (old elements auto-removed)
d = deque(maxlen=3)
d.extend([1, 2, 3, 4, 5])
print(d) # deque([3, 4, 5], maxlen=3)
# Rotation
d = deque([1, 2, 3, 4, 5])
d.rotate(2) # Rotate right: [4, 5, 1, 2, 3]
d.rotate(-2) # Rotate left: [1, 2, 3, 4, 5]
itertools Module
Tools for efficient iterator operations.
Main categories:
- Infinite iterators: count, cycle, repeat
- Combinatorics: permutations, combinations
- Filtering: filterfalse, takewhile
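A minimal sketch of the infinite iterators and filtering helpers named above. `islice` is an addition used here only to cut the infinite iterators short, and the sample values are made up for illustration.

```python
from itertools import count, cycle, repeat, islice, filterfalse, takewhile

# Infinite iterators never stop on their own, so take a slice of them
evens = list(islice(count(0, 2), 5))                # start at 0, step by 2
colors = list(islice(cycle(['red', 'green']), 5))   # loop over the list forever
zeros = list(repeat(0, 3))                          # repeat accepts an optional count

# Filtering helpers
numbers = [1, 4, 6, 3, 8, 2]
odds = list(filterfalse(lambda n: n % 2 == 0, numbers))    # keep items where the test is False
leading_small = list(takewhile(lambda n: n < 5, numbers))  # stop at the first failing item

print(evens)          # [0, 2, 4, 6, 8]
print(colors)         # ['red', 'green', 'red', 'green', 'red']
print(zeros)          # [0, 0, 0]
print(odds)           # [1, 3]
print(leading_small)  # [1, 4]
```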
Combinations
from itertools import permutations, combinations, product
# Permutations (arrangements)
items = ['A', 'B', 'C']
perms = list(permutations(items, 2))
print(perms) # [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ...]
# Combinations (order doesn't matter)
combs = list(combinations(items, 2))
print(combs) # [('A', 'B'), ('A', 'C'), ('B', 'C')]
# Cartesian product
colors = ['red', 'blue']
sizes = ['S', 'M', 'L']
prod = list(product(colors, sizes))
print(prod) # [('red', 'S'), ('red', 'M'), ..., ('blue', 'L')]
Chain and Group
from itertools import chain, groupby
# Chain multiple iterables
list1 = [1, 2, 3]
list2 = [4, 5, 6]
chained = list(chain(list1, list2))
print(chained) # [1, 2, 3, 4, 5, 6]
# Groupby (requires pre-sorting)
data = [
('A', 1), ('A', 2), ('B', 3), ('B', 4), ('A', 5)
]
data.sort(key=lambda x: x[0]) # Sort first
for key, group in groupby(data, key=lambda x: x[0]):
    print(key, list(group))
# A [('A', 1), ('A', 2), ('A', 5)]
# B [('B', 3), ('B', 4)]
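The sorting step matters because groupby only merges consecutive items with the same key. The sketch below runs the same data with and without sorting; the helper name `first` is chosen here for illustration.

```python
from itertools import groupby

data = [('A', 1), ('A', 2), ('B', 3), ('B', 4), ('A', 5)]

def first(item):
    """Return the grouping key (the first element of the pair)."""
    return item[0]

# Without sorting, the trailing ('A', 5) starts a second 'A' group
unsorted_keys = [key for key, _ in groupby(data, key=first)]
print(unsorted_keys)  # ['A', 'B', 'A']

# After sorting by the same key, each key appears exactly once
sorted_keys = [key for key, _ in groupby(sorted(data, key=first), key=first)]
print(sorted_keys)    # ['A', 'B']
```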
Accumulate and Compress
from itertools import accumulate, compress
# Cumulative sum
numbers = [1, 2, 3, 4, 5]
acc = list(accumulate(numbers))
print(acc) # [1, 3, 6, 10, 15]
# Cumulative product
import operator
acc_prod = list(accumulate(numbers, operator.mul))
print(acc_prod) # [1, 2, 6, 24, 120]
# Filtering with selectors
data = ['a', 'b', 'c', 'd']
selectors = [True, False, True, False]
result = list(compress(data, selectors))
print(result) # ['a', 'c']
functools Module
Tools for working with functions.
reduce
Fold a list into a single value:
from functools import reduce
# Sum
numbers = [1, 2, 3, 4, 5]
total = reduce(lambda x, y: x + y, numbers)
print(total) # 15
# Find maximum
max_val = reduce(lambda x, y: x if x > y else y, numbers)
print(max_val) # 5
# Factorial
factorial = reduce(lambda x, y: x * y, range(1, 6))
print(factorial) # 120
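One detail the examples above do not show: reduce accepts an optional third argument, the initial value. Without it, reducing an empty sequence raises TypeError. A short sketch, using the `operator` functions in place of lambdas:

```python
from functools import reduce
import operator

# With an initial value, an empty sequence simply returns that value
empty_sum = reduce(operator.add, [], 0)
empty_product = reduce(operator.mul, [], 1)
print(empty_sum, empty_product)  # 0 1

# Without an initial value, an empty sequence raises TypeError
try:
    reduce(operator.add, [])
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# The initial value also takes part in the fold: 100 + 1 + 2 + 3 + 4 + 5
total = reduce(operator.add, [1, 2, 3, 4, 5], 100)
print(total)  # 115
```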
lru_cache
Memoization (result caching):
from functools import lru_cache
@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
print(fibonacci(100)) # 354224848179261915075
# Cache info
print(fibonacci.cache_info())
# CacheInfo(hits=98, misses=101, maxsize=128, currsize=101)
# Clear cache
fibonacci.cache_clear()
partial
Fix some function arguments:
from functools import partial
def power(base, exponent):
    return base ** exponent
# Square function
square = partial(power, exponent=2)
print(square(5)) # 25
# Cube function
cube = partial(power, exponent=3)
print(cube(5)) # 125
re (Regular Expressions) Module
Powerful pattern matching tools.
Basic Usage
import re
text = "My phone number is 555-123-4567."
# Pattern search
pattern = r'\d{3}-\d{3}-\d{4}'
match = re.search(pattern, text)
if match:
    print(match.group()) # 555-123-4567
# Find all matches
text = "Call 555-123-4567 or 555-987-6543"
matches = re.findall(pattern, text)
print(matches) # ['555-123-4567', '555-987-6543']
Common Patterns
| Pattern | Meaning |
|---|---|
| \d | Digit |
| \w | Word character (alphanumeric + _) |
| \s | Whitespace |
| . | Any single character except a newline |
| * | 0 or more repetitions |
| + | 1 or more repetitions |
| ? | 0 or 1 occurrence |
| {n} | Exactly n repetitions |
| ^ | Start of string (or of each line with re.MULTILINE) |
| $ | End of string (or of each line with re.MULTILINE) |
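To see several of these patterns in action, here is a small sketch run against a made-up log line (the `log_line` string is an assumption for illustration only):

```python
import re

log_line = "user_42 logged in at 09:15"

digit_runs = re.findall(r'\d+', log_line)             # one or more digits
words = re.findall(r'\w+', log_line)                  # runs of word characters
clock = re.search(r'\d{2}:\d{2}', log_line).group()   # exactly two digits, colon, two digits
starts_with_user = bool(re.search(r'^user', log_line))
ends_with_digit = bool(re.search(r'\d$', log_line))
spellings = re.findall(r'colou?r', "color colour")    # the "u" is optional

print(digit_runs)        # ['42', '09', '15']
print(words)             # ['user_42', 'logged', 'in', 'at', '09', '15']
print(clock)             # 09:15
print(starts_with_user)  # True
print(ends_with_digit)   # True
print(spellings)         # ['color', 'colour']
```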
Replace and Split
import re
# Replace
text = "Hello World"
result = re.sub(r'World', 'Python', text)
print(result) # Hello Python
# Replace every occurrence (re.sub replaces all matches by default)
text = "2024-01-15"
result = re.sub(r'-', '/', text)
print(result) # 2024/01/15
# Split
text = "apple,banana;cherry:date"
parts = re.split(r'[,;:]', text)
print(parts) # ['apple', 'banana', 'cherry', 'date']
Grouping
import re
text = "Name: Taro, Age: 25"
pattern = r'Name: (\w+), Age: (\d+)'
match = re.search(pattern, text)
if match:
    print(match.group(0)) # Name: Taro, Age: 25
    print(match.group(1)) # Taro
    print(match.group(2)) # 25
    print(match.groups()) # ('Taro', '25')
# Named groups
pattern = r'Name: (?P<name>\w+), Age: (?P<age>\d+)'
match = re.search(pattern, text)
if match:
    print(match.group('name')) # Taro
    print(match.group('age')) # 25
logging Module
Manage application logs.
import logging
# Basic configuration
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
# Log messages
logging.debug('Debug information')
logging.info('Info message')
logging.warning('Warning message')
logging.error('Error message')
logging.critical('Critical error')
Log Levels
| Level | Value | Purpose |
|---|---|---|
| DEBUG | 10 | Detailed diagnostic info |
| INFO | 20 | General information |
| WARNING | 30 | Potential problem (default level) |
| ERROR | 40 | Error |
| CRITICAL | 50 | Fatal error |
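The numeric values decide which messages get through: a logger drops any message below its configured level. The sketch below sets a logger to WARNING and captures its output in memory; the logger name `levels_demo` and the messages are assumptions for illustration.

```python
import io
import logging

# Capture output in memory so the result can be inspected
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

logger = logging.getLogger('levels_demo')  # hypothetical logger name
logger.setLevel(logging.WARNING)           # WARNING is 30
logger.addHandler(handler)
logger.propagate = False                   # keep demo messages away from the root logger

logger.info('not recorded')            # 20 < 30, dropped
logger.warning('disk almost full')     # 30 >= 30, kept
logger.error('write failed')           # 40 >= 30, kept

output = stream.getvalue()
print(output)
# WARNING:disk almost full
# ERROR:write failed
```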
Logging to File
import logging
# Output to both file and console
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('app.log'),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)
logger.info('Application started')
argparse Module
Parse command-line arguments.
import argparse
# Create parser
parser = argparse.ArgumentParser(description='Sample program')
# Add arguments
parser.add_argument('filename', help='File to process')
parser.add_argument('-o', '--output', help='Output file', default='output.txt')
parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')
parser.add_argument('-n', '--number', type=int, default=10, help='Number to process')
# Parse arguments
args = parser.parse_args()
print(f"Input file: {args.filename}")
print(f"Output file: {args.output}")
print(f"Verbose mode: {args.verbose}")
print(f"Number: {args.number}")
Usage:
python script.py input.txt -o result.txt -v -n 20
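To try the parser without a shell, parse_args also accepts an explicit list of strings. The sketch below rebuilds the same parser and feeds it the arguments from the usage line, then a second call with only the filename to show the defaults.

```python
import argparse

parser = argparse.ArgumentParser(description='Sample program')
parser.add_argument('filename', help='File to process')
parser.add_argument('-o', '--output', default='output.txt', help='Output file')
parser.add_argument('-v', '--verbose', action='store_true', help='Verbose output')
parser.add_argument('-n', '--number', type=int, default=10, help='Number to process')

# Same arguments as the command line above, passed as a list
args = parser.parse_args(['input.txt', '-o', 'result.txt', '-v', '-n', '20'])
print(args.filename, args.output, args.verbose, args.number)
# input.txt result.txt True 20

# Only the positional argument: optional arguments fall back to their defaults
defaults = parser.parse_args(['input.txt'])
print(defaults.output, defaults.verbose, defaults.number)
# output.txt False 10
```

Note that type=int converts the value, so args.number is the integer 20 rather than the string '20'.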
Summary
| Module | Purpose | Key Features |
|---|---|---|
| collections | Data structures | Counter, defaultdict, deque |
| itertools | Iterator operations | permutations, combinations |
| functools | Function tools | reduce, lru_cache, partial |
| re | Regular expressions | search, findall, sub |
| logging | Log management | debug, info, error |
| argparse | CLI argument parsing | add_argument, parse_args |
Key Takeaways
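- Counter makes counting elements easy
- defaultdict eliminates key existence checks
- lru_cache speeds up recursive functions that recompute the same arguments
- Regular expressions handle complex pattern matching
- logging is better than print for application diagnostics: it adds levels, timestamps, and configurable outputs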
Practice Exercises
Exercise 1: Counter
Write a program that reads a text file and displays the top 10 most frequent words.
Exercise 2: Regular Expressions
Write a program that extracts all email addresses from a text containing multiple emails.
Challenge
Create a CLI tool using argparse with the following features:
- Accept a filename as an argument
- --count option to count words
- --lines option to count lines
- --search option to search for a word
Next Up: In Day 10, you'll work on "Practical Projects." Apply everything you've learned to build real programs!