How to compress and decompress data using gzip in Python


Python’s standard library includes the gzip module, which makes it straightforward to compress and decompress text data. In this post, we’ll walk through both directions.

Compressing Text with gzip

Compression is most useful when you store or transmit large amounts of text. With Python’s gzip module, it takes only three steps. Here’s how you can compress a string of text:

Step 1: Import gzip

First, you need to import the gzip module:

import gzip

Step 2: Convert Text to Bytes

gzip.compress() operates on bytes rather than str, so you’ll need to encode your text first:

text = "This is the text that you want to compress."
text_bytes = text.encode('utf-8')
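
To see why the encoding step matters, here is a minimal sketch showing that gzip.compress() rejects a str but accepts the encoded bytes. The variable names (rejected_str, compressed) are illustrative only and not part of the original walkthrough.

```python
import gzip

text = "This is the text that you want to compress."

# Passing a str directly fails, because gzip.compress() expects bytes-like data.
try:
    gzip.compress(text)
    rejected_str = False
except TypeError:
    rejected_str = True

# Encoding first produces a bytes object that gzip can handle.
text_bytes = text.encode("utf-8")
compressed = gzip.compress(text_bytes)

print(rejected_str)
print(type(text_bytes).__name__)
```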

Step 3: Compress the Byte Data

Now, use gzip.compress() to compress your byte data:

compressed_data = gzip.compress(text_bytes)
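
gzip.compress() also accepts an optional compresslevel argument from 0 (no compression) to 9 (smallest output, the default). The sketch below compares output sizes. The repeated input is an assumption made here to get highly compressible data. Note that gzip adds a header and trailer, so a very short input can come out larger than it went in.

```python
import gzip

short_bytes = "This is the text that you want to compress.".encode("utf-8")
# Repeating the text is an assumption, used only to build a compressible input.
long_bytes = short_bytes * 100

fast = gzip.compress(long_bytes, compresslevel=1)   # favors speed
small = gzip.compress(long_bytes, compresslevel=9)  # favors size (the default)

# Original size, then the size at each compression level.
print(len(long_bytes), len(fast), len(small))

# Compare the short input with its compressed form.
print(len(short_bytes), len(gzip.compress(short_bytes)))
```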

You can also save this compressed data to a file if needed:

with open('compressed_file.gz', 'wb') as file:
    file.write(compressed_data)
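
As an alternative sketch, gzip.open() can write text straight to a compressed file, handling both the encoding and the compression. The temporary directory is an assumption added here so the example cleans up after itself. In real code you would use your own path.

```python
import gzip
import os
import tempfile

text = "This is the text that you want to compress."

# A temporary directory keeps this sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "compressed_file.gz")

    # "wt" opens the file for writing text; gzip encodes and compresses it.
    with gzip.open(path, "wt", encoding="utf-8") as file:
        file.write(text)

    # "rt" reads the file back as decoded text.
    with gzip.open(path, "rt", encoding="utf-8") as file:
        restored = file.read()

print(restored == text)
```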

Decompressing Data with gzip

Compressed data is only useful once you can read it again. To get the original text back, reverse the steps above:

Step 1: Read or Receive Compressed Data

Assume compressed_data is the gzip-compressed data you’ve received or read from a file.
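
If the data comes from a file, a sketch like the following reads it in binary mode. The looks_like_gzip helper is hypothetical (not part of the gzip module). It only checks for the two-byte gzip magic number, which is a cheap sanity check rather than full validation.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def looks_like_gzip(data: bytes) -> bool:
    """Hypothetical helper: check that data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "compressed_file.gz")

    # Create a file to read from, so the sketch is self-contained.
    with open(path, "wb") as file:
        file.write(gzip.compress("example text".encode("utf-8")))

    # Read the raw compressed bytes back in binary mode.
    with open(path, "rb") as file:
        compressed_data = file.read()

print(looks_like_gzip(compressed_data))
print(looks_like_gzip(b"plain text"))
```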

Step 2: Decompress the Data

Use gzip.decompress() for decompression:

decompressed_data = gzip.decompress(compressed_data)
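
Data received from elsewhere may not be valid gzip, so it can be worth guarding the call. The sketch below uses a hypothetical safe_decompress wrapper. It relies on gzip.BadGzipFile (Python 3.8+) being a subclass of OSError.

```python
import gzip
from typing import Optional


def safe_decompress(data: bytes) -> Optional[bytes]:
    """Hypothetical helper: return decompressed bytes, or None on invalid input."""
    try:
        return gzip.decompress(data)
    except (OSError, EOFError):
        # OSError covers gzip.BadGzipFile (the data is not a gzip stream);
        # EOFError covers a stream that ends before its end-of-stream marker.
        return None


good = gzip.compress(b"hello")
print(safe_decompress(good))
print(safe_decompress(b"this is not gzip data"))
```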

Step 3: Convert Back to a String

If the original data was a string, decode the decompressed bytes back to a string, using the same encoding you used when compressing:

original_text = decompressed_data.decode('utf-8')
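
Putting both directions together, a round trip might be sketched as follows. The compress_text and decompress_text functions are hypothetical helpers written for this example. They simply wrap the encode, compress, decompress, and decode steps.

```python
import gzip


def compress_text(text: str, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: encode a string and gzip-compress it."""
    return gzip.compress(text.encode(encoding))


def decompress_text(data: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: gzip-decompress bytes and decode them to a string."""
    return gzip.decompress(data).decode(encoding)


text = "This is the text that you want to compress."
round_tripped = decompress_text(compress_text(text))

# The text should survive the round trip unchanged.
print(round_tripped == text)
```

Keeping the encoding as a parameter on both helpers makes it harder to compress with one encoding and decode with another.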

Author: robot learner
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: robot learner.