Python Data Structures


Data structures are ways of organizing and storing data so that it can be accessed and modified efficiently. Python provides several built-in data structures, its standard library adds more, and others can be built with classes. Each is designed for a different purpose. Here is an overview of the most common ones:

1. Lists

  • Definition: A list is an ordered, mutable (modifiable) collection of items. Lists can contain elements of different data types.
  • Syntax: my_list = [1, 2, 3, "hello", 4.5]
  • Key Operations:
    • Access elements: my_list[0] (first element)
    • Modify elements: my_list[1] = 10
    • Add elements: my_list.append(100) or my_list.insert(2, 'new')
    • Remove elements: my_list.remove(100) (removes the first matching value) or my_list.pop() (removes and returns the last element)
  • Use Case: Useful for storing ordered collections that might change over time.
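
The list operations above can be combined into one short sketch. The name my_list comes from the syntax example; the names first and last are only illustrative.

```python
# Start from the list used in the syntax example.
my_list = [1, 2, 3, "hello", 4.5]

first = my_list[0]          # access by index (0 is the first element)
my_list[1] = 10             # replace the element at index 1
my_list.append(100)         # add to the end
my_list.insert(2, "new")    # insert before index 2
my_list.remove(100)         # remove the first occurrence of the value 100
last = my_list.pop()        # remove and return the last element

print(first, last, my_list)
```

Note that remove() looks for a value while pop() works by position, so the two are not interchangeable.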

2. Tuples

  • Definition: A tuple is an ordered, immutable collection of items. Once a tuple is created, its items cannot be reassigned, added, or removed.
  • Syntax: my_tuple = (1, 2, 3, "hello")
  • Key Operations:
    • Access elements: my_tuple[1]
    • Concatenate: my_tuple + (4, 5)
    • Find length: len(my_tuple)
  • Use Case: Best for representing fixed collections of related data that shouldn’t change.
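
A minimal sketch of the tuple operations above, including what happens on an attempted modification. The coordinate pair at the end is a hypothetical example, not part of the original text.

```python
my_tuple = (1, 2, 3, "hello")

second = my_tuple[1]          # indexing works as it does for lists
longer = my_tuple + (4, 5)    # concatenation builds a new tuple
length = len(my_tuple)

# Assigning to an index raises TypeError because tuples are immutable.
try:
    my_tuple[0] = 99
    mutated = True
except TypeError:
    mutated = False

# Tuples unpack directly into separate names.
x, y = (3, 4)   # hypothetical coordinate pair
```

Concatenation leaves my_tuple itself untouched; only the new name longer refers to the extended tuple.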

3. Dictionaries

  • Definition: A dictionary is a mutable collection of key-value pairs. Since Python 3.7, dictionaries preserve insertion order. Keys must be unique and hashable (e.g., strings, numbers, or tuples of immutable values), while values can be of any type.
  • Syntax: my_dict = {"name": "Alice", "age": 25, "city": "New York"}
  • Key Operations:
    • Access value by key: my_dict["name"]
    • Add/Modify a key-value pair: my_dict["email"] = "alice@example.com"
    • Remove a key-value pair: my_dict.pop("age")
    • Iterate through keys/values: for key, value in my_dict.items()
  • Use Case: Ideal for looking up values based on a unique key, such as a user ID or a product code.
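
The dictionary operations above might be exercised as follows. The get() call with a default is an addition to the operations listed, shown because it avoids a KeyError for missing keys; the "phone" key is hypothetical.

```python
my_dict = {"name": "Alice", "age": 25, "city": "New York"}

name = my_dict["name"]                    # look up a value by key
my_dict["email"] = "alice@example.com"    # add a new key-value pair
age = my_dict.pop("age")                  # remove a key and return its value

# get() returns a default instead of raising KeyError for a missing key.
phone = my_dict.get("phone", "unknown")

# items() yields (key, value) pairs.
for key, value in my_dict.items():
    print(f"{key}: {value}")
```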

4. Sets

  • Definition: A set is an unordered, mutable collection of unique items. Duplicate values are automatically removed.
  • Syntax: my_set = {1, 2, 3, 4, 5}
  • Key Operations:
    • Add elements: my_set.add(6)
    • Remove elements: my_set.remove(3) (raises KeyError if 3 is not present) or my_set.discard(3) (does nothing if 3 is not present)
    • Set operations: Union (set1 | set2), Intersection (set1 & set2), Difference (set1 - set2)
  • Use Case: Useful for storing unique items and performing mathematical set operations.
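
A short sketch of the set behavior described above. The duplicate values in the literal and the contents of set1 and set2 are chosen here purely for illustration.

```python
# Duplicates in the literal collapse to a single item each.
my_set = {1, 2, 3, 4, 5, 5, 3}

my_set.add(6)         # add an element
my_set.remove(3)      # remove an element (KeyError if it is missing)
my_set.discard(42)    # discard never raises, even if the item is missing

set1 = {1, 2, 3}
set2 = {3, 4, 5}
union = set1 | set2           # items in either set
intersection = set1 & set2    # items in both sets
difference = set1 - set2      # items in set1 but not in set2
```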

5. Strings

  • Definition: Strings are sequences of characters. They are immutable, meaning they cannot be changed once created.
  • Syntax: my_string = "Hello, World!"
  • Key Operations:
    • Access elements: my_string[0] (first character)
    • Concatenate: my_string + " How are you?"
    • Slice: my_string[0:5] (returns 'Hello')
    • Find length: len(my_string)
  • Use Case: Represent and manipulate textual data.
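
The string operations above can be sketched as follows. The upper() call is an extra illustration that string methods return new strings rather than changing the original.

```python
my_string = "Hello, World!"

first_char = my_string[0]                 # first character
greeting = my_string + " How are you?"    # concatenation builds a new string
hello = my_string[0:5]                    # slice from index 0 up to, not including, 5
length = len(my_string)

# Methods such as upper() return a new string; my_string is unchanged.
shouted = my_string.upper()
```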

6. Arrays

  • Definition: The array module provides arrays that store elements of a single type, selected by a type code such as 'i' (signed integer). Lists are more commonly used, but arrays use less memory when holding large amounts of numeric data.
  • Syntax:
    from array import array
    my_array = array('i', [1, 2, 3, 4, 5])
  • Use Case: Efficient handling of large sequences of numeric data of the same type.
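
As a rough illustration of the same-type constraint, the sketch below appends a valid integer and then tries to append a string, which the array rejects. The names total and accepted are illustrative only.

```python
from array import array

# 'i' is the type code for signed integers.
my_array = array('i', [1, 2, 3, 4, 5])

my_array.append(6)        # integers are accepted
total = sum(my_array)     # arrays support iteration like lists

# A value of the wrong type is rejected with TypeError.
try:
    my_array.append("seven")
    accepted = True
except TypeError:
    accepted = False
```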

7. Queues

  • Definition: A queue is a collection in which elements are added at one end (the rear) and removed from the other (the front). Python's queue module provides thread-safe queue classes; for single-threaded code, collections.deque is a common lighter-weight choice.
  • Types:
    • FIFO Queue (First In First Out): Elements are removed in the order they were added (queue.Queue).
    • LIFO Queue (Last In First Out): The most recently added element is removed first (queue.LifoQueue), which is stack behavior.
  • Syntax:
    from queue import Queue
    my_queue = Queue()
    my_queue.put(10)   # Add at the rear
    my_queue.get()     # Remove from the front
  • Use Case: Useful for tasks that need processing in the order they arrive, like task scheduling.
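
To make the difference between the two queue types concrete, the sketch below puts the same three items into a Queue and a LifoQueue and records the order in which they come back out. The task names are placeholders.

```python
from queue import Queue, LifoQueue

tasks = ("a", "b", "c")   # placeholder task names

fifo = Queue()
for task in tasks:
    fifo.put(task)
# get() returns items in the order they were added.
fifo_order = [fifo.get() for _ in range(len(tasks))]

lifo = LifoQueue()
for task in tasks:
    lifo.put(task)
# get() returns the most recently added item first.
lifo_order = [lifo.get() for _ in range(len(tasks))]
```

By default get() blocks when the queue is empty, so the sketch calls it exactly as many times as there are items.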

8. Stacks

  • Definition: A stack is a collection in which elements are added to and removed from the same end, so the last element added is the first one removed (Last In First Out, or LIFO).
  • Implementation: Can be implemented using a list.
    my_stack = []
    my_stack.append(10)  # Push
    my_stack.pop()       # Pop
  • Use Case: Used in algorithms like depth-first search or for undo operations.
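
The undo use case can be sketched with a list used as a stack. This is a simplified illustration, assuming each edit appends text and that the full previous state is saved before every change; the names history and document are hypothetical.

```python
history = []     # stack of earlier document states
document = ""

for word in ("Hello", ", ", "World"):
    history.append(document)   # push the state before the change
    document += word

# Each undo pops the most recently saved state.
document = history.pop()
after_one_undo = document

document = history.pop()
after_two_undos = document
```

Because the stack is LIFO, edits are undone in the reverse of the order in which they were made.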

9. Linked Lists

  • Definition: A linked list consists of nodes, where each node contains data and a reference to the next node. Python does not have a built-in linked list, but it can be implemented using classes.
  • Syntax:
    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None

    class LinkedList:
        def __init__(self):
            self.head = None
  • Use Case: Useful when insertions and deletions are frequent, because adding or removing a node at a known position only requires updating references rather than shifting other elements.
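
One way the Node and LinkedList classes above might be extended with basic operations is sketched below. The methods prepend, delete, and __iter__ are assumptions added for illustration and are not part of the original skeleton.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None

    def prepend(self, data):
        """Insert a new node at the front of the list."""
        node = Node(data)
        node.next = self.head
        self.head = node

    def delete(self, data):
        """Remove the first node holding data; return True if one was found."""
        previous, current = None, self.head
        while current is not None:
            if current.data == data:
                if previous is None:
                    self.head = current.next      # removing the head node
                else:
                    previous.next = current.next  # bypass the removed node
                return True
            previous, current = current, current.next
        return False

    def __iter__(self):
        """Yield each node's data from head to tail."""
        current = self.head
        while current is not None:
            yield current.data
            current = current.next


numbers = LinkedList()
for value in (3, 2, 1):
    numbers.prepend(value)
removed = numbers.delete(2)
print(list(numbers))
```

Prepending only touches the head reference, while delete() must first walk the list to find the node, so its cost grows with the list's length.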

10. Heaps (Priority Queues)

  • Definition: A heap is a tree-based data structure that satisfies the heap property: in a min-heap every parent is less than or equal to its children, and in a max-heap every parent is greater than or equal to them. Python's heapq module implements a min-heap on top of a regular list.
  • Syntax:
    import heapq
    my_heap = [1, 3, 5, 7]
    heapq.heapify(my_heap)
    heapq.heappush(my_heap, 0)  # Push
    heapq.heappop(my_heap)      # Pop the smallest item
  • Use Case: Useful for priority-based task scheduling or implementing efficient sorting algorithms.
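
Priority-based scheduling could be sketched with heapq as follows, assuming each task is stored as a (priority, name) tuple in which a lower number means higher priority. The task names are made up for the example.

```python
import heapq

tasks = []   # heapq operates on a plain list

# Tuples compare element by element, so the priority number decides the order.
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "clean inbox"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)   # always the smallest priority left
    order.append(name)
```

Since heapq always pops the smallest item, one common way to get max-heap behavior is to push negated priorities.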

Summary

  • Lists: Ordered, mutable collections.
  • Tuples: Ordered, immutable collections.
  • Dictionaries: Mutable key-value pairs with unique keys (insertion-ordered since Python 3.7).
  • Sets: Unordered, unique collections.
  • Strings: Immutable sequences of characters.
  • Arrays: Memory-efficient sequences of elements of the same type.
  • Queues: FIFO or LIFO data structures for sequential task handling.
  • Stacks: LIFO data structure used for tasks like undo operations.
  • Linked Lists: Dynamically allocated node-based structure.
  • Heaps: Tree-based priority queues.

These data structures form the foundation of how you manage and process data efficiently in Python.