Compressing and Decompressing a Text File
When files become large, it’s a good idea to compress them. Compression is the process of reducing the overall size of a file. Broadly, it works by finding redundant or repeating entries in a file and making them refer to a single entry. File compression is useful for web pages: reducing the size of a page makes your site load faster. You can also consider file compression when you are archiving or backing up your files, or when uploading large files to the internet for others to download. You can use different compression utilities such as WinRAR or WinZip. If you have seen files with .zip or .rar extensions, those files are compressed and are smaller than their uncompressed versions.
Compressing Files with GZipStream and DeflateStream Classes
The .NET Framework Class Library offers the System.IO.Compression namespace, which contains classes and methods for compressing and decompressing files. In this lesson, we will simply create a compressed text file. You can use either the GZipStream class, which uses the GZIP algorithm, or the DeflateStream class, which uses the Deflate algorithm. They work pretty much the same. Both classes wrap an existing stream, which in our case is a FileStream object that points to a file. Let’s take a look at how to properly use these classes. There will be two sets of code, one for each class. Create a new Console Application project and name it TextFileCompression.
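Import the required namespaces for our program. The last two are the ones specific to this lesson; System and System.Text are needed for the Console and StringBuilder classes and may already be present in your project.
using System;
using System.Text;
using System.IO;
using System.IO.Compression;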
We will first compare writing a file without compression and with compression. Write the following code, which simply creates a text file without using any compression mechanism.
static void Main(string[] args)
{
    // Build 1000 lines of dummy text.
    StringBuilder data = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        data.AppendLine("The quick brown fox jumps over the lazy dog.");
    }

    try
    {
        // Write the text to the file without any compression.
        FileStream stream = new FileStream(@"C:\compressedFile.txt",
            FileMode.Create, FileAccess.Write);
        // Disposing the writer also closes the underlying FileStream.
        using (StreamWriter writer = new StreamWriter(stream))
        {
            writer.Write(data.ToString());
        }
        Console.WriteLine("An uncompressed file was created!");
    }
    catch (IOException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
Example 1
The for loop creates 1000 lines of dummy text. The try block then creates a text file using the techniques we learned in the previous lessons. Note that we are not using any compression at this time. The created file, as specified in the FileStream constructor, is located at C:\compressedFile.txt. If you look at this file and check its size, you will see that it is about 45KB: each of the 1000 lines has 44 characters plus a two-character line break, which makes 46,000 bytes. Now let’s use the compression classes from the System.IO.Compression namespace. Example 2 uses the GZipStream class for GZIP compression and Example 3 uses the DeflateStream class for Deflate compression.
static void Main(string[] args)
{
    // Build 1000 lines of dummy text.
    StringBuilder data = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        data.AppendLine("The quick brown fox jumps over the lazy dog.");
    }

    try
    {
        FileStream stream = new FileStream(@"C:\compressedFile.txt",
            FileMode.Create, FileAccess.Write);
        // Wrap the file stream so that everything written to it is GZIP-compressed.
        GZipStream gzip = new GZipStream(stream, CompressionMode.Compress);
        // Disposing the writer also closes the GZipStream and the FileStream.
        using (StreamWriter writer = new StreamWriter(gzip))
        {
            writer.Write(data.ToString());
        }
        Console.WriteLine("A compressed file was created!");
    }
    catch (IOException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
Example 2
The code is almost the same as Example 1, except for two statements: the one that creates the GZipStream and the one that creates the StreamWriter. The GZipStream object is used for GZIP compression. In its constructor, we provide the FileStream object created just before it. The second parameter of the constructor is a value from the CompressionMode enumeration, which contains two values, Compress and Decompress. Since we are writing a compressed file, we use the CompressionMode.Compress value. Then, instead of passing the FileStream to the StreamWriter, we pass the created GZipStream object. When we call the Write() method, the data being written is automatically compressed using the GZIP compression algorithm. Remember that the uncompressed file is about 45KB. Let’s look at the new size of the compressed file. You will see that the file is now 660 bytes, or about 1KB. That’s 44KB smaller than the uncompressed text file we created earlier. The amount of reduction depends on the amount of redundancy in the file and its overall size. Since the file contains a single sentence repeated 1000 times, the algorithm can store the sentence once and encode the repetitions as short references back to it. If you open the compressed text file in a text editor, you won’t be able to understand it, because it now contains binary data.
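To see how much the result depends on redundancy, the following is a minimal sketch that compresses two inputs of the same length in memory: the repeated sentence from the examples and a buffer of pseudo-random bytes. The class name RedundancyDemo, the helper GZipBytes, and the random seed are made up for this illustration; only GZipStream, MemoryStream, and the other framework classes come from .NET. Exact sizes vary between .NET versions, so none are quoted here.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class RedundancyDemo
{
    // Hypothetical helper: compresses a byte array with GZIP entirely in memory.
    public static byte[] GZipBytes(byte[] input)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed data
            // into the MemoryStream.
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray() still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Highly redundant input: one sentence repeated 1000 times.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 1000; i++)
        {
            text.AppendLine("The quick brown fox jumps over the lazy dog.");
        }
        byte[] repetitive = Encoding.UTF8.GetBytes(text.ToString());

        // Input of the same length with almost no redundancy.
        byte[] random = new byte[repetitive.Length];
        new Random(12345).NextBytes(random);

        Console.WriteLine("Original size:    {0} bytes", repetitive.Length);
        Console.WriteLine("Repetitive, GZIP: {0} bytes", GZipBytes(repetitive).Length);
        Console.WriteLine("Random, GZIP:     {0} bytes", GZipBytes(random).Length);
    }
}
```

The repetitive input should shrink dramatically, while the random input should stay close to its original size or even grow slightly.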
As you will see in Example 3, using the DeflateStream is very similar to using the GZipStream class.
static void Main(string[] args)
{
    // Build 1000 lines of dummy text.
    StringBuilder data = new StringBuilder();
    for (int i = 0; i < 1000; i++)
    {
        data.AppendLine("The quick brown fox jumps over the lazy dog.");
    }

    try
    {
        FileStream stream = new FileStream(@"C:\compressedFile.txt",
            FileMode.Create, FileAccess.Write);
        // Wrap the file stream so that everything written to it is Deflate-compressed.
        DeflateStream deflate = new DeflateStream(stream, CompressionMode.Compress);
        // Disposing the writer also closes the DeflateStream and the FileStream.
        using (StreamWriter writer = new StreamWriter(deflate))
        {
            writer.Write(data.ToString());
        }
        Console.WriteLine("A compressed file was created!");
    }
    catch (IOException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
Example 3
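The compressed text file created this way is 640 bytes, which is just a little smaller than the GZIP-compressed file. The difference comes from the GZIP format, which wraps Deflate-compressed data in a small header and a checksum trailer.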
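The size difference between the two formats can be checked without touching the disk. The sketch below compresses the same text once with each class into a MemoryStream and prints both sizes. The class name FormatComparison and the helper CompressedSize are invented for this example, and the numbers you get may differ from the 660 and 640 bytes quoted above depending on your .NET version.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class FormatComparison
{
    // Hypothetical helper: compresses the input with GZipStream when useGZip
    // is true, or with DeflateStream otherwise, and returns the compressed size.
    public static long CompressedSize(byte[] input, bool useGZip)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // The third argument (leaveOpen: true) keeps the MemoryStream
            // open after the compression stream is disposed.
            Stream compressor = useGZip
                ? (Stream)new GZipStream(output, CompressionMode.Compress, true)
                : new DeflateStream(output, CompressionMode.Compress, true);
            using (compressor)
            {
                compressor.Write(input, 0, input.Length);
            }
            return output.Length;
        }
    }

    public static void Main()
    {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 1000; i++)
        {
            text.AppendLine("The quick brown fox jumps over the lazy dog.");
        }
        byte[] data = Encoding.UTF8.GetBytes(text.ToString());

        long gzipSize = CompressedSize(data, true);
        long deflateSize = CompressedSize(data, false);

        Console.WriteLine("GZIP:       {0} bytes", gzipSize);
        Console.WriteLine("Deflate:    {0} bytes", deflateSize);
        Console.WriteLine("Difference: {0} bytes", gzipSize - deflateSize);
    }
}
```

The few extra bytes are the price of the GZIP header and checksum, which in return make the file recognizable to common tools such as gzip.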
Decompressing a Compressed File
You need to decompress compressed files before you can read them. This is also an easy task thanks to the GZipStream and DeflateStream classes. Let’s first take a look at using the GZipStream class for decompression. Note that you need the matching class to decompress a compressed file. For example, if the file was compressed using Deflate, then you must use the DeflateStream class instead, or an error will occur.
static void Main(string[] args)
{
    try
    {
        FileStream stream = new FileStream(@"C:\compressedFile.txt",
            FileMode.Open, FileAccess.Read);
        // Wrap the file stream so that everything read from it is decompressed.
        GZipStream gzip = new GZipStream(stream, CompressionMode.Decompress);
        // Disposing the reader also closes the GZipStream and the FileStream.
        using (StreamReader reader = new StreamReader(gzip))
        {
            string contents = reader.ReadToEnd();
            Console.WriteLine(contents);
        }
    }
    catch (IOException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
Example 4
Be sure the file you are decompressing was compressed using GZipStream, or else this code will fail. The FileStream constructor call creates a FileStream object which points to the compressed text file. We provide the FileMode.Open and FileAccess.Read values in the constructor because we are simply reading the contents of the file. The next statement creates a GZipStream object. The only difference from Example 2, as you can see, is the second argument, which specifies the compression mode. We provide the CompressionMode.Decompress value, which is used to decompress the compressed data of the stream. We then pass the created GZipStream object to a StreamReader object, which is the one that will read the contents of the file. The call to the ReadToEnd() method decompresses the content of the compressed file and returns it as a string. The uncompressed content is then displayed using Console.WriteLine().
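To illustrate what happens when the classes are mismatched, here is a small in-memory sketch that compresses a sentence with DeflateStream and then tries to read it back with GZipStream. The class and method names are hypothetical. It assumes the failure is reported as an InvalidDataException, which is what current .NET implementations do as far as I know; that exception does not derive from IOException, so a catch (IOException) block alone would not handle it.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class MismatchDemo
{
    // Hypothetical helper: returns true if reading Deflate data through a
    // GZipStream is rejected with an InvalidDataException.
    public static bool GZipRejectsDeflateData()
    {
        byte[] deflated;
        using (MemoryStream output = new MemoryStream())
        {
            using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress))
            using (StreamWriter writer = new StreamWriter(deflate))
            {
                writer.Write("The quick brown fox jumps over the lazy dog.");
            }
            deflated = output.ToArray();
        }

        try
        {
            using (MemoryStream input = new MemoryStream(deflated))
            using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
            using (StreamReader reader = new StreamReader(gzip))
            {
                // The data has no GZIP header, so this read is expected to throw.
                reader.ReadToEnd();
            }
            return false;
        }
        catch (InvalidDataException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(GZipRejectsDeflateData()
            ? "GZipStream rejected the Deflate data."
            : "GZipStream read the data without complaint.");
    }
}
```

If you want the earlier examples to report this case instead of crashing, one option is to add a second catch block for InvalidDataException next to the existing one.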
If you compressed your file using the DeflateStream class, then you must use the same class for decompressing it. Example 5 shows you how to use the DeflateStream class to decompress a file. It is very similar to using the GZipStream class.
static void Main(string[] args)
{
    try
    {
        FileStream stream = new FileStream(@"C:\compressedFile.txt",
            FileMode.Open, FileAccess.Read);
        // Wrap the file stream so that everything read from it is decompressed.
        DeflateStream deflate = new DeflateStream(stream, CompressionMode.Decompress);
        // Disposing the reader also closes the DeflateStream and the FileStream.
        using (StreamReader reader = new StreamReader(deflate))
        {
            string contents = reader.ReadToEnd();
            Console.WriteLine(contents);
        }
    }
    catch (IOException ex)
    {
        Console.WriteLine(ex.Message);
    }
}
Example 5
This lesson only shows you how to compress a simple text file. You can use the classes we discussed here on various types of files.
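For files that are not text, you can skip StreamWriter and StreamReader and copy raw bytes through the compression stream instead. The following is one possible sketch using Stream.CopyTo(), which is available from .NET Framework 4 onward. The class FileCompressor, its two methods, and the file names in the temporary directory are all made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class FileCompressor
{
    // Hypothetical helper: GZIP-compresses any file, byte for byte.
    public static void CompressFile(string sourcePath, string targetPath)
    {
        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (FileStream target = new FileStream(targetPath, FileMode.Create, FileAccess.Write))
        using (GZipStream gzip = new GZipStream(target, CompressionMode.Compress))
        {
            source.CopyTo(gzip);
        }
    }

    // Hypothetical helper: restores a file written by CompressFile.
    public static void DecompressFile(string sourcePath, string targetPath)
    {
        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (GZipStream gzip = new GZipStream(source, CompressionMode.Decompress))
        using (FileStream target = new FileStream(targetPath, FileMode.Create, FileAccess.Write))
        {
            gzip.CopyTo(target);
        }
    }

    public static void Main()
    {
        // Illustrative file names in the temp directory, not from the lesson.
        string original = Path.Combine(Path.GetTempPath(), "sample.bin");
        string compressed = original + ".gz";
        string restored = Path.Combine(Path.GetTempPath(), "sample.restored.bin");

        // Create some binary test data with a repeating pattern.
        byte[] payload = new byte[50000];
        for (int i = 0; i < payload.Length; i++)
        {
            payload[i] = (byte)(i % 7);
        }
        File.WriteAllBytes(original, payload);

        CompressFile(original, compressed);
        DecompressFile(compressed, restored);

        Console.WriteLine("Original:   {0} bytes", new FileInfo(original).Length);
        Console.WriteLine("Compressed: {0} bytes", new FileInfo(compressed).Length);
        Console.WriteLine("Restored:   {0} bytes", new FileInfo(restored).Length);
    }
}
```

Swapping GZipStream for DeflateStream in both methods should work the same way, as long as the same class is used for compressing and decompressing.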