Collections module in Python

In this tutorial, I am going to discuss the collections module in Python. We will look at the different containers provided by the collections module: Counter, OrderedDict, defaultdict, deque, ChainMap, namedtuple, UserDict, UserList and UserString.

Collections module in Python

The collections module in Python provides container datatypes for storing and processing data, as alternatives to the built-in list, tuple and dict. Each container adds capabilities that the built-in types do not have.

We will study the following containers from the collections module : Counter, OrderedDict, defaultdict, deque, ChainMap, namedtuple, UserDict, UserList and UserString.

Let's discuss these collection datatypes :

Counter

Counter is a subclass of the dictionary class which counts hashable items. It creates a counter object which stores each item as a key and its number of occurrences as the value. The following sample code shows how it works :

import collections

# A list with repeated elements
sample_list = [5, 1, 2, 10, 4, 10, 5, 5, 3, 10, 1]
print("Sample list of elements : ", sample_list)

# Count how many times each element occurs
counter = collections.Counter(sample_list)
print("\nCounter object for sample list : ", counter)

Output :

Sample list of elements :  [5, 1, 2, 10, 4, 10, 5, 5, 3, 10, 1]

Counter object for sample list :  Counter({5: 3, 10: 3, 1: 2, 2: 1, 4: 1, 3: 1})
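Beyond plain counting, a Counter offers a few convenient operations. The sketch below reuses the same sample list and shows most_common( ), the count of a missing item, and update( ). The variable name top_two is only chosen for illustration.

```python
from collections import Counter

sample_list = [5, 1, 2, 10, 4, 10, 5, 5, 3, 10, 1]
counter = Counter(sample_list)

# The two most frequent elements with their counts
top_two = counter.most_common(2)
print(top_two)                  # [(5, 3), (10, 3)]

# A missing item has a count of 0 instead of raising KeyError
print(counter[99])              # 0

# update() adds to the existing counts
counter.update([1, 1, 2])
print(counter[1], counter[2])   # 4 2
```

Elements with equal counts are returned by most_common( ) in the order in which they were first seen.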

OrderedDict

OrderedDict is also a subclass of dictionary which remembers the order in which keys were inserted. If we assign a new value to an existing key, the old value is overwritten but the key keeps its original position. Since Python 3.7 the built-in dictionary also preserves insertion order, but OrderedDict still offers extras such as move_to_end( ) and order-sensitive equality.

The following code shows how it works :

import collections

# Keys keep the order in which they are inserted
dict1 = collections.OrderedDict({1:'a', 6:'f', 3:'c', 5:'e', 4:'d', 2:'b'})
print("Ordered Dictionary : ", dict1)

Output :

Ordered Dictionary :  OrderedDict([(1, 'a'), (6, 'f'), (3, 'c'), (5, 'e'), (4, 'd'), (2, 'b')])
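To see the ordering behaviour in action, here is a small sketch with a shorter dictionary. It shows that overwriting a value keeps the key in place, and it uses move_to_end( ) and popitem( ) to reorder and remove items. The names od and first are only for illustration.

```python
from collections import OrderedDict

od = OrderedDict([(1, 'a'), (6, 'f'), (3, 'c')])

# Overwriting an existing key keeps its original position
od[1] = 'z'
print(list(od.items()))      # [(1, 'z'), (6, 'f'), (3, 'c')]

# move_to_end() moves a key to the last position
od.move_to_end(1)
print(list(od.keys()))       # [6, 3, 1]

# popitem(last=False) removes and returns the first item
first = od.popitem(last=False)
print(first)                 # (6, 'f')
```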

DefaultDict

In the collections module, defaultdict is another subclass of dictionary which behaves almost exactly like the original dictionary. Its first argument is a factory function (such as int or list) which is called with no arguments to produce the default value for a missing key. The difference between the original dictionary and defaultdict is that defaultdict does not raise a KeyError when we access a non-existent key with square brackets. Instead, it inserts the key with the default value and returns that value.

The following code shows how it works :

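import collections

# int is the default factory, so a missing key would get the value int() == 0
defdict = collections.defaultdict(int)
defdict["one"] = 1
defdict["two"] = 2
defdict["three"] = 3
print(defdict)

Output :

defaultdict(<class 'int'>, {'one': 1, 'two': 2, 'three': 3})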
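The example above never actually touches a missing key, so here is a short sketch of where the default value matters. It counts characters with int as the factory and groups words with list as the factory. The sample strings and variable names are made up for illustration.

```python
from collections import defaultdict

# int() returns 0, so every missing key starts at 0
counts = defaultdict(int)
for ch in "banana":
    counts[ch] += 1
print(dict(counts))          # {'b': 1, 'a': 3, 'n': 2}

# list() returns [], which is handy for grouping values
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
print(dict(groups))          # {'a': ['apple', 'avocado'], 'b': ['banana']}

# Merely reading a missing key inserts it with the default value
print(counts["z"])           # 0
print("z" in counts)         # True
```

Note the last two lines: a plain lookup is enough to add the key, so use the in operator or get( ) when you only want to check for a key.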

Deque

The deque class in the collections module creates a double-ended queue. A double-ended queue allows the user to add or remove elements at either end of the queue, and both operations are fast. A list, by comparison, is slow when we insert or remove elements at its front.

The following code shows how it works :

import collections

simple = [1, 'a', 2, 'b', 3, 'c', 4, 'd', 5, 'e']

# Build a deque from an existing list
dequ = collections.deque(simple)
print("Deque : ", dequ)

# Add one element at the right end and one at the left end
dequ.append(6)
dequ.appendleft('one')
print("After inserting elements :", dequ)

# Remove one element from the right end and one from the left end
dequ.pop()
dequ.popleft()
print("After removing elements :", dequ)

Output :

Deque :  deque([1, 'a', 2, 'b', 3, 'c', 4, 'd', 5, 'e'])
After inserting elements : deque(['one', 1, 'a', 2, 'b', 3, 'c', 4, 'd', 5, 'e', 6])
After removing elements : deque([1, 'a', 2, 'b', 3, 'c', 4, 'd', 5, 'e'])
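Two more deque features are worth a quick look: rotate( ) and the maxlen argument. The following is a minimal sketch with made-up data, where recent is just an illustrative name for a deque that keeps only the last three values.

```python
from collections import deque

# rotate() shifts elements; a positive step moves them to the right
d = deque([1, 2, 3, 4, 5])
d.rotate(2)
print(d)                     # deque([4, 5, 1, 2, 3])

# With maxlen set, appending to a full deque discards from the other end
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
print(recent)                # deque([2, 3, 4], maxlen=3)
```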

ChainMap

The collections module also has a ChainMap class which groups multiple dictionaries together and returns a single mapping object (not a list). There is no restriction on the number of dictionaries. When we look up a key, ChainMap searches the dictionaries in the order they were passed and returns the first match.

The following code shows how it works :

import collections

dict1 = {1:'a', 2:'b', 3:'c'}
dict2 = {4:'d', 5:'e', 6:'f'}

# Group both dictionaries into one mapping; no data is copied
chain_map = collections.ChainMap(dict1, dict2)

print(chain_map)

Output :

ChainMap({1: 'a', 2: 'b', 3: 'c'}, {4: 'd', 5: 'e', 6: 'f'})
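The lookup order becomes visible only when the dictionaries share a key. Here is a small sketch with two hypothetical dictionaries, defaults and overrides, which also shows that writes go to the first dictionary only.

```python
from collections import ChainMap

defaults = {'color': 'red', 'size': 'M'}
overrides = {'color': 'blue'}

# Lookups search the mappings from left to right
settings = ChainMap(overrides, defaults)
print(settings['color'])     # blue
print(settings['size'])      # M

# Writes only affect the first mapping
settings['size'] = 'L'
print(overrides)             # {'color': 'blue', 'size': 'L'}
print(defaults)              # {'color': 'red', 'size': 'M'}

# A ChainMap can be flattened into an ordinary dictionary
flat = dict(settings)
print(flat == {'color': 'blue', 'size': 'L'})   # True
```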

NamedTuple

The collections module has a function namedtuple which returns a subclass of tuple with named fields. Usually, elements of a tuple are accessed by index, which forces us to remember the position of each element. With namedtuple we can assign a name to each element of the tuple and access every element using that name.

The following code shows how it works :

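import collections

# Create a tuple subclass named Tupl with the fields a, b and c
Tupl = collections.namedtuple('Tupl', ['a','b','c'])
T1 = Tupl(1, 2, 3)
print("Named Tuple :", T1)

# Access a field by name instead of by index
print("T1.b = ", T1.b)

Output :

Named Tuple : Tupl(a=1, b=2, c=3)
T1.b =  2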
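A named tuple is still a tuple, so indexing and unpacking keep working, and it stays immutable. The sketch below uses a hypothetical Point type to show this, together with the helper methods _replace( ) and _asdict( ).

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)

# Fields are still reachable by index and can be unpacked like a tuple
print(p[0], p.y)             # 3 4
x, y = p

# Named tuples are immutable; _replace() returns a modified copy
p2 = p._replace(x=10)
print(p2)                    # Point(x=10, y=4)

# _asdict() converts the fields to a dictionary
as_dict = dict(p._asdict())
print(as_dict)               # {'x': 3, 'y': 4}
```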

UserDict

UserDict is a dictionary-like container which acts as a wrapper around a dictionary. The wrapped dictionary is available through the data attribute. We can use this container when we want to create our own dictionary with new functionality.
The following code shows how we can override the __setitem__ method to raise an error when we try to update the value of an existing key.

from collections import UserDict

class my_dict(UserDict):
    # Called for every assignment d[key] = item, including the initial values
    def __setitem__(self, key, item):
        if key in self.data:
            raise RuntimeError("Key already exists... ")
        else:
            self.data[key] = item

d1 = my_dict({'a':1, 'b':2, 'c':3, 'd':4})
d1['e'] = 5      # new key, allowed
print("Dictionary : ", d1)
d1['a'] = 75     # existing key, raises RuntimeError

Output :

Dictionary :  {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

Traceback (most recent call last):
  ...
RuntimeError: Key already exists... 

As you can see in the above example, we have overridden __setitem__( ), which is called whenever we set the value for a key in a dictionary. Unlike a normal dictionary, our object raises a RuntimeError when we try to update the value of an existing key.
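One reason to wrap with UserDict rather than subclass dict directly is that, in CPython, the built-in dict.update( ) does not call an overridden __setitem__( ), while UserDict.update( ) does. The following sketch compares the two approaches. Both class names are hypothetical and exist only for this comparison.

```python
from collections import UserDict

class WriteOnceUserDict(UserDict):
    def __setitem__(self, key, item):
        if key in self.data:
            raise RuntimeError("Key already exists... ")
        self.data[key] = item

class WriteOnceDict(dict):
    def __setitem__(self, key, item):
        if key in self:
            raise RuntimeError("Key already exists... ")
        super().__setitem__(key, item)

# UserDict.update() goes through __setitem__, so the check runs
ud = WriteOnceUserDict({'a': 1})
try:
    ud.update({'a': 75})
    user_dict_blocked = False
except RuntimeError:
    user_dict_blocked = True
print(user_dict_blocked)     # True

# dict.update() bypasses the override, so the value is silently replaced
pd = WriteOnceDict({'a': 1})
pd.update({'a': 75})
print(pd['a'])               # 75
```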

UserList

UserList is a container which is similar to lists and acts as a wrapper for list objects. This is used when we want to change some behavior of the usual list objects according to our requirement.

The following code shows how it works.

from collections import UserList

class my_list(UserList):
    # Block remove( ) so that elements cannot be deleted by value
    def remove(self, s=None):
        raise RuntimeError("Deletion not supported !!!")

    # Block pop( ) so that elements cannot be deleted by position
    def pop(self, s=None):
        raise RuntimeError("Deletion not supported !!!")

lst1 = my_list([1, 2, 3, 4, 5])
print("Original list : ", lst1)

# removing an element
lst1.remove(3)

Output :

Original list :  [1, 2, 3, 4, 5]

Traceback (most recent call last):
  ...
RuntimeError: Deletion not supported !!!

As you can see in the above example, we have overridden the remove( ) and pop( ) methods, which are called when we want to remove or pop an element from a simple list. Unlike a normal list, our object raises a RuntimeError when we try to remove or pop an element.
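In a real program we would normally catch this error instead of letting it stop the script. The sketch below does that with a hypothetical class NoDeleteList, and also points out a gap: the del statement still works, because it goes through __delitem__( ), which is not overridden here.

```python
from collections import UserList

class NoDeleteList(UserList):
    def remove(self, item=None):
        raise RuntimeError("Deletion not supported !!!")

    def pop(self, index=None):
        raise RuntimeError("Deletion not supported !!!")

lst = NoDeleteList([1, 2, 3, 4, 5])

# Catch the error so that the program can continue
caught = None
try:
    lst.pop()
except RuntimeError as err:
    caught = str(err)
print("Caught :", caught)

# Other list operations still work as usual
lst.append(6)
print(lst)                   # [1, 2, 3, 4, 5, 6]

# del uses __delitem__, which is not blocked in this sketch
del lst[0]
print(lst)                   # [2, 3, 4, 5, 6]
```

To block every kind of deletion we would have to override __delitem__( ) and clear( ) as well.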

UserString

Similar to UserDict and UserList, UserString is also a container which is similar to string and acts as a wrapper for string objects. We can use this when we want to modify or add any functionalities to the usual string object as per our requirement.

The following code shows how it works.

from collections import UserString

class my_string(UserString):

    # Add the given substring at the end of the string
    def append(self, st):
        self.data = self.data + st

    # Remove all occurrences of the given substring
    def remove(self, st):
        self.data = self.data.replace(st, '')

str1 = my_string("Hello, and welcome to learn python.")
print("Original string : ", str1)

str1.append(" Great.")
print("String after appending : ", str1)

str1.remove('e')
print("String after removing 'e' character : ", str1)

Output :

Original string :  Hello, and welcome to learn python.
String after appending :  Hello, and welcome to learn python. Great.
String after removing 'e' character :  Hllo, and wlcom to larn python. Grat.

As you can see in the above example, we have added append( ) and remove( ) methods which perform additional functions for a string such as appending a substring at the end of the string and removing all occurrences of a given substring.
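The usual string methods remain available on such a class. Here is a short sketch, using a trimmed-down version of my_string, which shows that methods like upper( ) return an object of our own class and that the object is a wrapper around a string, not a str itself.

```python
from collections import UserString

class my_string(UserString):
    def append(self, st):
        self.data = self.data + st

s = my_string("learn")
s.append(" python")

# Standard string methods still work and return our own class
upper = s.upper()
print(upper)                       # LEARN PYTHON
print(type(upper).__name__)        # my_string

# Comparison with a plain str works
print(s == "learn python")         # True

# The object wraps a str in .data; it is not a str subclass
print(isinstance(s, str))          # False
print(isinstance(s.data, str))     # True
```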

So, in this tutorial we have discussed the different containers provided by the collections module: Counter, OrderedDict, defaultdict, deque, ChainMap, namedtuple, UserDict, UserList and UserString. Let's conclude this tutorial here.
If you have any questions, please comment below and share your views and suggestions in the comment box.

Pooja is a programmer by profession and loves to write articles on Python. She is highly interested in Python, Data Science and Machine Learning. Pooja holds a B.E. degree in Computer Engineering. She loves teaching and learning new things.
