Packet classification is a building block of many network services, such as routing, filtering, intrusion detection, accounting, monitoring, load balancing, and policy enforcement. Compression has recently gained attention as a way to cope with the expected growth in classifier size. Typically, compression schemes try to reduce the size of a classifier while keeping it semantically equivalent to its original form. Inspired by the advantages of popular lossy compression schemes (e.g., JPEG and MPEG), we study in this paper the applicability of lossy compression to create packet classifiers that require less memory than optimal semantically equivalent representations. Our objective is to find a limited-size classifier that correctly classifies a large portion of the traffic, so that it can be implemented in commodity switches with classification modules of a given size. We develop optimal dynamic-programming-based algorithms for several versions of the problem and describe how the small amount of traffic that cannot be classified can be easily handled, especially in software-defined networks. We generalize our solutions to a wide range of classifiers with different similarity metrics. We evaluate their performance on real classifiers and traffic traces and show that in some cases we can reduce the size of a classifier by orders of magnitude while still classifying almost all traffic correctly.
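To make the objective concrete, the following minimal sketch states it on a toy instance. It is not the dynamic-programming algorithm developed in the paper: it is a brute-force search, restricted to sub-lists of the original rules, that finds the classifier of at most a given size agreeing with the original on the most traffic. The 3-bit header space, the rule list `ORIGINAL`, the packet counts in `TRAFFIC`, and the function names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical classifier over 3-bit headers: ordered (prefix, action) rules,
# first match wins; the empty prefix is the default rule matching everything.
ORIGINAL = [("000", "drop"), ("01", "fwd"), ("0", "drop"), ("11", "fwd"), ("", "drop")]

# Hypothetical traffic profile: header -> number of packets (100 in total).
TRAFFIC = {"000": 5, "001": 5, "010": 30, "011": 30,
           "100": 2, "101": 3, "110": 15, "111": 10}


def classify(rules, header):
    """Return the action of the first matching rule, or None if no rule matches."""
    for prefix, action in rules:
        if header.startswith(prefix):
            return action
    return None


def correct_packets(candidate, original, traffic):
    """Count packets that the candidate classifies exactly as the original does.

    Packets left unmatched by the candidate (None) count as not classified.
    """
    return sum(count for header, count in traffic.items()
               if classify(candidate, header) == classify(original, header))


def best_subset(original, traffic, k):
    """Brute force: try every order-preserving choice of k original rules."""
    best_score, best_rules = -1, []
    for indices in combinations(range(len(original)), k):
        candidate = [original[i] for i in indices]
        score = correct_packets(candidate, original, traffic)
        if score > best_score:
            best_score, best_rules = score, candidate
    return best_score, best_rules


if __name__ == "__main__":
    for k in (2, 3):
        score, rules = best_subset(ORIGINAL, TRAFFIC, k)
        print(k, score, rules)
```

On this toy instance, three rules suffice for a semantically equivalent classifier, whereas a two-rule classifier still handles 85 of the 100 packets correctly and leaves the rest unmatched. In a software-defined network, such unmatched packets could, for example, be forwarded to the controller.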