19. Binary Data Handling

The struct module in Python provides tools to work with binary data, especially when you need to pack and unpack data into binary formats for file storage or communication. It is commonly used for reading and writing binary files or for working with binary data in various formats (like network protocols, image formats, or audio files).

Key Functions of the struct Module:

struct.pack(): Converts Python values into a bytes object according to a format string.
struct.unpack(): Converts packed bytes back into a tuple of Python values.
struct.calcsize(): Returns the size, in bytes, of the data described by a format string.

Here are some Python code examples demonstrating binary data handling using the struct module:

1. Packing and Unpacking Data

This example demonstrates how to pack data into a binary format and then unpack it back into Python data types.

import struct

# Packing data into binary
data = (1, 2.5, b'abc')
packed_data = struct.pack('I f 3s', *data)  # 'I' unsigned int, 'f' 4-byte float, '3s' 3-byte bytes field
print(f"Packed Data: {packed_data}")

# Unpacking binary data back to Python data types
unpacked_data = struct.unpack('I f 3s', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

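The format string 'I f 3s' specifies the data types:
- I: unsigned integer (4 bytes)
- f: single-precision float (4 bytes)
- 3s: bytes field of exactly 3 bytes (a bytes object, not a str)
pack() returns a bytes object and unpack() returns a tuple, so the two functions convert between Python values and their binary representation.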
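
One detail the format string hides is what the s format does when a value does not match the declared length. The short sketch below uses only the standard struct API; the variable names are illustrative.

```python
import struct

# '5s' always produces exactly 5 bytes.
padded = struct.pack('5s', b'abc')       # shorter input is padded with null bytes
truncated = struct.pack('2s', b'abc')    # longer input is silently truncated

print(padded)      # b'abc\x00\x00'
print(truncated)   # b'ab'

# unpack() always returns a tuple, even for a single field,
# and it does not strip the null padding for you.
(value,) = struct.unpack('5s', padded)
print(value.rstrip(b'\x00'))   # b'abc'
```

Because truncation raises no error, it is worth checking the length of a value before packing it into a fixed-size field.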

2. Reading and Writing Binary Files

This example shows how to write binary data to a file and read it back.

Writing binary data to a file:

import struct

# Data to write
data = (1, 2.5, b'abc')

# Open a file in write-binary mode
with open('binary_data.dat', 'wb') as f:
    packed_data = struct.pack('I f 3s', *data)
    f.write(packed_data)
    print("Data written to binary file")

Reading binary data from a file:

import struct

# Open the binary file in read-binary mode
with open('binary_data.dat', 'rb') as f:
    packed_data = f.read()
    unpacked_data = struct.unpack('I f 3s', packed_data)
    print(f"Unpacked Data from file: {unpacked_data}")

Explanation:

Writing: struct.pack() converts the data into a binary format, which is then written to the file using the write() method.
Reading: The read() method reads the binary data from the file, which is then unpacked back into its original form using struct.unpack().
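
struct.unpack() requires a buffer of exactly the size the format describes, so reading the whole file at once works only when the file holds a single record. A minimal sketch of reading several fixed-size records one at a time follows. It uses an in-memory io.BytesIO object in place of a real file, and the record layout and names (RECORD_FORMAT, RECORD_SIZE) are assumptions made for illustration.

```python
import io
import struct

RECORD_FORMAT = '<I f 3s'                      # assumed record layout for this sketch
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # 11 bytes in standard-size mode

# Build an in-memory "file" holding three records.
stream = io.BytesIO()
for record in [(1, 2.5, b'abc'), (2, 0.5, b'def'), (3, 1.25, b'ghi')]:
    stream.write(struct.pack(RECORD_FORMAT, *record))
stream.seek(0)

# Read one record at a time instead of loading the whole file.
records = []
while True:
    chunk = stream.read(RECORD_SIZE)
    if len(chunk) < RECORD_SIZE:   # end of data (or a truncated trailing record)
        break
    records.append(struct.unpack(RECORD_FORMAT, chunk))

print(records)
```

The same loop works with a file opened in 'rb' mode, since it only relies on read().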

3. Handling Binary Data with Variable Length Strings

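Sometimes you need to handle binary data that contains strings of varying length. A struct format is always fixed-size, so the length of each string must be written into the format string. The example below packs a 13-byte string by stating its length explicitly.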

import struct

# Data with a variable length string
data = (123, b"Hello, World!")

# Pack the data; the format string fixes the string length at 13 bytes
packed_data = struct.pack('I 13s', data[0], data[1])
print(f"Packed Data: {packed_data}")

# Unpack the binary data
unpacked_data = struct.unpack('I 13s', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

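The format string 'I 13s' specifies an integer (I) and a fixed-length field of 13 bytes (13s).
This example works only because the string is exactly 13 bytes long: a shorter value would be padded with null bytes and a longer one truncated. If the length varies, build the format string at run time from the len() of the value, and store the length alongside the data so that the reader knows how many bytes to unpack.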
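
One common way to handle a truly variable length is a length prefix: write the byte count first, then the bytes themselves. The sketch below is one possible approach, not the only one; the helper names pack_string and unpack_string and the choice of a 4-byte big-endian prefix are assumptions made for this example.

```python
import struct

def pack_string(text: bytes) -> bytes:
    """Prefix the bytes with their length as a 4-byte big-endian unsigned int."""
    return struct.pack(f'!I{len(text)}s', len(text), text)

def unpack_string(buffer: bytes) -> bytes:
    """Read the length prefix, then unpack that many bytes after it."""
    (length,) = struct.unpack_from('!I', buffer, 0)
    (text,) = struct.unpack_from(f'!{length}s', buffer, 4)
    return text

packed = pack_string(b"Hello, World!")
print(packed)
print(unpack_string(packed))   # b'Hello, World!'
```

struct.unpack_from() reads from a given offset and ignores any trailing bytes, which makes it convenient for parsing a buffer in stages.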

4. Using calcsize to Determine Structure Size

The struct.calcsize() function can be used to determine the size of the struct format.

import struct

# Define a format string for a struct
format_string = 'I f 3s'

# Get the size of the struct
size = struct.calcsize(format_string)
print(f"Size of the struct: {size} bytes")

Explanation:

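struct.calcsize() returns the number of bytes that struct.pack() produces, and that struct.unpack() expects, for the given format string. This is helpful for reading exactly one record from a file or socket, or for checking that a format matches a documented layout. For 'I f 3s' the result is 11 bytes on typical platforms.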
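
The size can depend on whether the format uses native mode (the default) or a standard mode selected by a prefix character. The following sketch compares the two for a format in which alignment matters; the native result is platform-dependent, so the value in the comment is what is typically seen rather than a guarantee.

```python
import struct

# Native mode (the default): field sizes and alignment follow the platform's C compiler.
native_size = struct.calcsize('3s I')

# Standard mode ('<', '>', '=' or '!'): fixed sizes, no alignment padding.
standard_size = struct.calcsize('<3s I')

print(f"Native:   {native_size} bytes")    # typically 8 (one pad byte before the int)
print(f"Standard: {standard_size} bytes")  # always 7
```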

5. Working with Packed Binary Data for Networking

In networking applications, you may need to send and receive binary data. Here’s how to handle such scenarios:

Packing data for network transmission:

import struct

# Prepare data to send (integer, float, string)
data = (101, 3.14, b'hello')

# Pack the data into a binary string
packed_data = struct.pack('I f 5s', *data)

# Send this packed data over the network (simulated)
print(f"Packed Data: {packed_data}")

Unpacking received binary data:

import struct

# Simulate receiving packed binary data
received_data = packed_data  # In real use, this would come from a socket

# Unpack the data
unpacked_data = struct.unpack('I f 5s', received_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

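This is a simple simulation of how binary data might be packed for network transmission using the struct module and later unpacked when received.
The 5s format specifies a fixed-length field of 5 bytes, which suits protocols whose fields have fixed sizes. Note that 'I f 5s' uses the machine's native byte order and alignment; for data exchanged between different machines, prefix the format with ! so that both sides agree on the layout.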
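
A minimal sketch of the same message in network byte order is shown here. The message layout and the names MESSAGE, encode_message, and decode_message are hypothetical; only struct.Struct and its pack() and unpack() methods come from the standard library.

```python
import struct

# Hypothetical message layout: id (uint32), value (float32), 5-byte tag.
MESSAGE = struct.Struct('!I f 5s')   # '!' = network byte order, no padding

def encode_message(msg_id: int, value: float, tag: bytes) -> bytes:
    """Pack one message into its 13-byte wire representation."""
    return MESSAGE.pack(msg_id, value, tag)

def decode_message(payload: bytes) -> tuple:
    """Unpack a 13-byte payload back into (id, value, tag)."""
    return MESSAGE.unpack(payload)

wire_bytes = encode_message(101, 3.14, b'hello')
print(len(wire_bytes))              # 13 on every platform
print(decode_message(wire_bytes))
```

Compiling the format once with struct.Struct avoids repeating the format string on both the sending and the receiving side.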

6. Packing and Unpacking Multiple Entries

You can pack and unpack multiple entries at once. This example shows how to handle multiple entries in a binary format.

import struct

# Data: list of integers and floats
data = [(1, 2.5), (2, 3.6), (3, 4.7)]

# Packing multiple entries
packed_data = struct.pack('I f' * len(data), *[item for sublist in data for item in sublist])
print(f"Packed Data: {packed_data}")

# Unpacking multiple entries
unpacked_data = struct.unpack('I f' * len(data), packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

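By repeating the format I f for each entry in the list, we can pack and unpack multiple records. Each record consists of an integer and a float. Note that unpack() returns one flat tuple (1, 2.5, 2, ...) rather than a list of pairs, and that values such as 3.6 come back slightly changed (approximately 3.5999999) because f stores only single precision.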
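
An alternative that keeps the records separate is struct.Struct.iter_unpack(), which yields one tuple per record. The sketch below assumes a little-endian layout and uses values that single precision represents exactly, so the round trip can be compared directly; the name RECORD is illustrative.

```python
import struct

RECORD = struct.Struct('<I f')   # one record: unsigned int + 32-bit float
data = [(1, 2.5), (2, 3.5), (3, 4.75)]

# Pack each record separately and join the results.
packed_data = b''.join(RECORD.pack(*entry) for entry in data)

# iter_unpack() walks the buffer one record at a time and yields tuples.
unpacked = list(RECORD.iter_unpack(packed_data))
print(unpacked)   # [(1, 2.5), (2, 3.5), (3, 4.75)]
```

iter_unpack() requires the buffer length to be a multiple of the record size and raises struct.error otherwise.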

7. Handling Big Endian and Little Endian Data

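Binary file formats and network protocols fix the byte order of multi-byte values as either big-endian or little-endian. A prefix character at the start of the format string selects the order: < for little-endian, > for big-endian, and ! for network order (which is big-endian).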

import struct

# Big-endian (network byte order) packing
data = (1, 2.5)
packed_data = struct.pack('!I f', *data)  # '!' specifies network (big-endian) order
print(f"Packed Big-endian Data: {packed_data}")

# Unpacking big-endian data
unpacked_data = struct.unpack('!I f', packed_data)
print(f"Unpacked Big-endian Data: {unpacked_data}")

Explanation:

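The ! prefix at the start of the format string selects network (big-endian) byte order, with standard sizes and no alignment padding. Use it when dealing with network protocols, where byte order is standardized. The > prefix is equivalent, and < selects little-endian.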
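
To make the difference concrete, this short sketch packs the same integer in both byte orders and then shows what happens when bytes are read with the wrong one. The variable names are illustrative.

```python
import struct

value = 1

little = struct.pack('<I', value)   # least significant byte first
big = struct.pack('>I', value)      # most significant byte first

print(little)   # b'\x01\x00\x00\x00'
print(big)      # b'\x00\x00\x00\x01'

# Reading bytes with the wrong byte order yields a different number.
(misread,) = struct.unpack('<I', big)
print(misread)  # 16777216
```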

8. Working with Signed and Unsigned Integers

You can specify signed or unsigned integers while packing and unpacking data.

import struct

# Packing signed and unsigned integers
data = (123, -456)
packed_data = struct.pack('I i', data[0], data[1])
print(f"Packed Data: {packed_data}")

# Unpacking data
unpacked_data = struct.unpack('I i', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

I represents an unsigned integer (4 bytes), and i represents a signed integer (also 4 bytes).
Use i when values can be negative; packing a negative number with I raises struct.error.
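
The sketch below illustrates both points: a value outside the range of its format is rejected, and the same four bytes decode to different numbers depending on whether they are read as signed or unsigned. The variable names are illustrative.

```python
import struct

# A negative value does not fit the unsigned 'I' format.
try:
    struct.pack('I', -456)
    out_of_range = False
except struct.error as exc:
    out_of_range = True
    print(f"Packing failed: {exc}")

# The same 4 bytes mean different numbers depending on the format used to read them.
raw = struct.pack('<i', -1)
(as_signed,) = struct.unpack('<i', raw)
(as_unsigned,) = struct.unpack('<I', raw)
print(as_signed, as_unsigned)   # -1 4294967295
```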

9. Handling Floats with Precision

You choose the precision of a packed floating-point number through its format character: f stores single precision (4 bytes) and d stores double precision (8 bytes).

import struct

# Packing a float with specified precision
data = (3.1415926535,)
packed_data = struct.pack('d', data[0])  # 'd' specifies double precision float (8 bytes)
print(f"Packed Data: {packed_data}")

# Unpacking data
unpacked_data = struct.unpack('d', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

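The d format specifies a double-precision floating-point number (8 bytes). This is the same representation Python uses for its own float type, so values round-trip without loss. It is useful for applications requiring high precision, such as scientific computing.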
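
For comparison, this sketch round-trips the same value through both formats. It prints the results rather than asserting specific digits, since the point is only that f loses precision and d does not.

```python
import struct

value = 3.1415926535

(single,) = struct.unpack('f', struct.pack('f', value))   # 4 bytes, about 7 significant digits
(double,) = struct.unpack('d', struct.pack('d', value))   # 8 bytes, about 15-16 significant digits

print(f"float  ('f'): {single!r}")
print(f"double ('d'): {double!r}")
print(double == value)   # True: a Python float is already a double
print(single == value)   # False: precision was lost in the 4-byte encoding
```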

10. Padding with struct

Structures often need padding between elements for alignment. In native mode the struct module inserts it automatically, following the platform's C alignment rules, and the x format character adds a pad byte explicitly.

import struct

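# Packing data in native mode: the 4-byte int must start at a multiple of 4,
# so 3 pad bytes are inserted after the 5-byte string (on typical platforms)
data = (b'hello', 1)
packed_data = struct.pack('5s I', *data)
print(f"Packed Data with Padding: {packed_data}")

# Unpacking data with padding; the pad bytes are skipped
unpacked_data = struct.unpack('5s I', packed_data)
print(f"Unpacked Data with Padding: {unpacked_data}")

Explanation:

In this example, struct inserts 3 pad bytes between the 5-byte string and the integer, so the packed data is 12 bytes long rather than 9 (on typical platforms, where I is 4 bytes with 4-byte alignment). In native mode, a field is preceded by padding whenever the data before it ends at an offset that is not a multiple of the field's alignment. The s format has an alignment of 1, so a string field never triggers padding itself, and the standard modes (<, >, =, !) add no padding at all.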
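
When a layout must be identical on every platform, padding can instead be written explicitly with the x format character. The following is a small sketch with an arbitrary layout chosen for illustration.

```python
import struct

# Standard mode ('<') never pads automatically, so the pad byte is explicit.
packed = struct.pack('<B x H', 1, 2)   # 1-byte int, 1 pad byte, 2-byte int
print(packed)                           # b'\x01\x00\x02\x00'
print(struct.calcsize('<B x H'))        # 4

# Pad bytes are skipped on unpacking; only the real fields come back.
print(struct.unpack('<B x H', packed))  # (1, 2)
```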

Last updated 9 months ago