This tutorial explains Python’s ‘collections’ module in detail. The module provides specialized container data types beyond the built-in data structures (lists, dictionaries, sets, and tuples).
It offers additional data structures such as:
- Counter
- defaultdict
- deque
- namedtuple
- OrderedDict
- ChainMap
These are some of the classes that the collections module provides.
Counter
The Counter class in Python’s collections module counts the occurrences of elements in a collection. It’s essentially a specialized dictionary designed for counting hashable objects. Here’s a breakdown of its key features:
- Counting Elements: The primary purpose of Counter is to count the occurrences of elements in an iterable, such as a list or a string. For instance,
    from collections import Counter

    mylist = [1, 1, 2, 3, 4, 4, 5, 5, 5]
    # Count every element of the list in one pass
    counts = Counter(mylist)
    print(counts[5])  # Output: 3
- Dictionary-Like Interface: Counter behaves like a dictionary, where the elements are stored as keys and their counts as values. For example,
    from collections import Counter

    # Create a Counter object by passing a list of elements
    counts = Counter(['dog', 'cat', 'dog', 'fox', 'cat', 'dog'])

    # Access the counts of specific elements using square brackets
    print(counts['dog'])  # Output: 3
    print(counts['cat'])  # Output: 2
- Arithmetic Operations: Counter supports arithmetic operations like addition, subtraction, intersection, and union.
    from collections import Counter

    count1 = Counter({'a': 3, 'b': 1})
    count2 = Counter({'a': 1, 'b': 2})

    # Addition
    print(count1 + count2)  # Output: Counter({'a': 4, 'b': 3})

    # Subtraction (results that are zero or negative are dropped)
    print(count1 - count2)  # Output: Counter({'a': 2})

    # Intersection (minimum of corresponding counts)
    print(count1 & count2)  # Output: Counter({'a': 1, 'b': 1})

    # Union (maximum of corresponding counts)
    print(count1 | count2)  # Output: Counter({'a': 3, 'b': 2})
- Useful Methods: Counter provides additional methods such as most_common() to retrieve the most common elements and elements() to return an iterator over the elements repeated according to their counts.
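The two methods named above can be tried out with a short sketch like the following. The word list is hypothetical sample data, chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample data, chosen only for illustration
words = ['dog', 'cat', 'dog', 'fox', 'cat', 'dog']
counts = Counter(words)

# most_common(n) returns the n highest counts as (element, count) pairs
top_two = counts.most_common(2)
print(top_two)  # [('dog', 3), ('cat', 2)]

# elements() yields each element as many times as its count
expanded = sorted(counts.elements())
print(expanded)  # ['cat', 'cat', 'dog', 'dog', 'dog', 'fox']

# A missing element reports a count of 0 instead of raising KeyError
missing = counts['cow']
print(missing)  # 0
```

Calling most_common() without an argument returns every element, ordered from the most to the least frequent.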
defaultdict
A defaultdict is a specialized dictionary-like container provided by Python’s collections module. It’s similar to the built-in dict type, but with one key difference: it automatically creates missing keys and initializes their values based on a default factory function provided by the user.
- Initialization: When you create a defaultdict, you provide it with a default factory that produces the initial value for any missing key. The factory can be any callable that takes no arguments, such as int, list, a function, or a lambda expression. If no factory is provided, default_factory is None and a missing key raises a KeyError, just as in a regular dictionary.
- Automatic Key Creation: When you look up a key that doesn’t exist using square brackets, instead of raising a KeyError as a regular dictionary would, the defaultdict calls the default factory, stores the result under that key, and returns it. Membership tests with in and lookups with get() do not create keys.
- Use Cases: defaultdict is particularly useful in scenarios where you need to handle missing keys gracefully, without having to explicitly check for their existence before accessing or modifying them. It simplifies code and makes it more concise.
    from collections import defaultdict

    # Sample string
    text = "hello codespeedy"

    # Create a defaultdict with default factory as int (defaults to 0)
    char_count = defaultdict(int)

    # Count the occurrences of each character in the string
    for char in text:
        char_count[char] += 1

    # Print the character count
    for char, count in sorted(char_count.items()):
        print(f"Character '{char}' occurs {count} times.")
The output:
    Character ' ' occurs 1 times.
    Character 'c' occurs 1 times.
    Character 'd' occurs 2 times.
    Character 'e' occurs 4 times.
    Character 'h' occurs 1 times.
    Character 'l' occurs 2 times.
    Character 'o' occurs 2 times.
    Character 'p' occurs 1 times.
    Character 's' occurs 1 times.
    Character 'y' occurs 1 times.
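Counting is only one use of a default factory. As a minimal sketch, the following groups items with list as the factory and then shows what happens when no factory is given. The animal names are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical sample data, chosen only for illustration
animals = ['dog', 'cat', 'deer', 'cow', 'duck']

# With list as the factory, each missing key starts as an empty list
by_letter = defaultdict(list)
for name in animals:
    by_letter[name[0]].append(name)

print(dict(by_letter))  # {'d': ['dog', 'deer', 'duck'], 'c': ['cat', 'cow']}

# Membership tests and get() do not create keys
print('z' in by_letter)    # False
print(by_letter.get('z'))  # None
print(len(by_letter))      # 2

# Without a factory, a missing key raises KeyError like a regular dict
no_factory = defaultdict()
try:
    no_factory['missing']
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```

Grouping this way avoids the usual "check whether the key exists, then create the list" step that a plain dictionary needs.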
deque
The deque class (short for “double-ended queue”) is part of Python’s collections module. It supports fast appends and pops at both ends, which makes it well suited to implementing queues and stacks.
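Some key features of deque:
- Fast Operations: appending and popping at either end takes approximately constant time, whereas inserting or removing at the front of a list requires shifting every other element.
- Memory Efficiency: appends and pops at either end are memory efficient, and an optional maxlen argument keeps the deque at a bounded size.
- Thread Safety: appends and pops at either end are thread-safe.
- Versatility: the same object can serve as a queue, a stack, or a fixed-size buffer.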
    from collections import deque

    # Initialize a deque
    queue = deque()

    # Enqueue elements
    queue.append(1)
    queue.append(2)
    queue.append(3)

    # Dequeue elements from the front (first in, first out)
    print(queue.popleft())  # Output: 1
    print(queue.popleft())  # Output: 2

    # Current queue
    print(queue)  # Output: deque([3])
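The queue example uses only one side of the deque. The sketch below, with made-up values, shows it used as a stack, extended on the left, rotated, and bounded with maxlen.

```python
from collections import deque

# Use as a stack: append and pop both work on the right end
stack = deque()
stack.append('a')
stack.append('b')
top = stack.pop()
print(top)  # b

# appendleft adds to the left end
items = deque([2, 3])
items.appendleft(1)
print(items)  # deque([1, 2, 3])

# rotate(1) moves the rightmost element to the left end
items.rotate(1)
print(items)  # deque([3, 1, 2])

# maxlen bounds the deque: once full, old items fall off the opposite end
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)  # deque([2, 3, 4], maxlen=3)
```

A bounded deque like recent is a convenient way to keep only the last few items of a stream.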
namedtuple
A namedtuple, provided by the collections module, lets you create tuple subclasses with named fields. It behaves like a regular tuple, but you can access the fields using dot notation as well as by index. It is great for creating lightweight, immutable data structures.
    from collections import namedtuple

    # Define a tuple subclass with three named fields
    Person = namedtuple('Person', ['name', 'age', 'city'])

    person1 = Person('Alice', 25, 'New York')
    person2 = Person('Bob', 30, 'San Francisco')

    print(person1.name)
    print(person2.age)
    print(person1.city)
The output:
    Alice
    30
    New York
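A short sketch of what "like a regular tuple" and "immutable" mean in practice, reusing the Person type from the example above:

```python
from collections import namedtuple

Person = namedtuple('Person', ['name', 'age', 'city'])
alice = Person('Alice', 25, 'New York')

# Still a tuple: indexing and unpacking work as usual
name, age, city = alice
print(alice[0], age)  # Alice 25

# Fields cannot be reassigned
try:
    alice.age = 26
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False

# _replace returns a new instance with the given fields changed
older = alice._replace(age=26)
print(older)  # Person(name='Alice', age=26, city='New York')

# _asdict converts the instance to a dictionary
as_dict = alice._asdict()
print(as_dict['city'])  # New York
```

Because _replace builds a new object, the original alice still has age 25 afterwards.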
OrderedDict
OrderedDict is a dictionary subclass in Python’s collections module. It’s similar to a regular dictionary, but it is built around the order in which the keys were inserted. This can be helpful when your code depends on the order of elements in a dictionary.
    from collections import OrderedDict

    # Create an empty OrderedDict
    my_dict = OrderedDict()

    # Add key-value pairs to the OrderedDict
    my_dict['dog'] = 3
    my_dict['cat'] = 2
    my_dict['cow'] = 5

    # Print the OrderedDict
    print(my_dict)
The output:
    OrderedDict([('dog', 3), ('cat', 2), ('cow', 5)])
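The order of the keys is preserved in the OrderedDict. Since Python 3.7, regular dictionaries also preserve insertion order, so OrderedDict is now mainly useful for its extra features: the move_to_end() method for reordering, popitem(last=False) for removing the oldest entry, and equality comparisons that take order into account. On Python 3.12 and later, the same print call shows OrderedDict({'dog': 3, 'cat': 2, 'cow': 5}) instead.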
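As a minimal sketch of the order-aware operations that OrderedDict offers beyond a regular dictionary, reusing the keys from the example above:

```python
from collections import OrderedDict

my_dict = OrderedDict([('dog', 3), ('cat', 2), ('cow', 5)])

# move_to_end moves an existing key to the end (or to the front with last=False)
my_dict.move_to_end('dog')
order = list(my_dict)
print(order)  # ['cat', 'cow', 'dog']

# popitem(last=False) removes and returns the first (oldest) entry
first = my_dict.popitem(last=False)
print(first)  # ('cat', 2)

# Equality between two OrderedDicts is order-sensitive
a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])
print(a == b)              # False
print(dict(a) == dict(b))  # True
```

The last two lines show the difference from plain dictionaries, which compare equal whenever they hold the same key-value pairs.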
ChainMap
A ChainMap is a class in Python’s collections module that groups multiple dictionaries into a single dictionary-like view. Lookups search the underlying dictionaries in order and return the first match, while writes, updates, and deletions affect only the first dictionary.
    from collections import ChainMap

    # Create two dictionaries
    dictionary1 = {'dog': 3, 'cat': 2}
    dictionary2 = {'cow': 5, 'fox': 4}

    # Create a ChainMap with the dictionaries
    combined_dict = ChainMap(dictionary1, dictionary2)

    # Access and modify the combined dictionary
    print(combined_dict['dog'])
    print(combined_dict['cow'])
    combined_dict['cat'] = 1  # The write goes to dictionary1
    print(combined_dict['cat'])

    # Accessing a key not present in the first dictionary falls back to the second dictionary
    print(combined_dict['fox'])
The output:
    3
    5
    1
    4
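A common use of this lookup order is layering settings over defaults. The sketch below assumes two hypothetical dictionaries, defaults and overrides, and shows where a write ends up.

```python
from collections import ChainMap

# Hypothetical settings, chosen only for illustration
defaults = {'color': 'blue', 'size': 'medium'}
overrides = {'size': 'large'}

# The first mapping wins when a key appears in more than one
settings = ChainMap(overrides, defaults)
print(settings['size'])   # large
print(settings['color'])  # blue

# Writes land in the first mapping only; defaults is left untouched
settings['color'] = 'red'
print(overrides)  # {'size': 'large', 'color': 'red'}
print(defaults)   # {'color': 'blue', 'size': 'medium'}

# new_child puts another mapping in front without changing the original
scoped = settings.new_child({'size': 'small'})
print(scoped['size'])    # small
print(settings['size'])  # large
print(len(scoped.maps))  # 3
```

The ChainMap holds references to the underlying dictionaries rather than copies, so later changes to defaults or overrides are visible through settings as well.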
To summarize, the Python ‘collections’ module provides specialized data structures beyond the standard containers. It adds types such as Counter, defaultdict, deque, namedtuple, OrderedDict, and ChainMap. These structures offer solutions for common programming tasks, such as counting elements, handling missing keys, and managing multiple dictionaries. The module’s versatility and ease of use make it an essential tool for developers across various domains.