As the two images, X and Y, are highly correlated, H(Y) in the equations above contains not only the complete information of Y, but also partial information about X. Equation-2 then clearly suggests that the key to achieving the desired entropy rate is to choose A(X) so that it contains only the information about X that does not overlap with the information about X already contained in Y. This can be achieved by an algorithm that suppresses most or all of the part of A(X) that is already known through, or shared with, Y.
The key to designing an algorithm as described in the previous paragraph is the following observation:
Given a maximum allowable distance, D, representing the absolute same-location, pixel-by-pixel gray-level difference between X and Y:
It should be noted that, given the maximum allowable distance between the individual same-location pixels of two highly correlated images X and Y (this could be extended to any kind of highly correlated data), the first three of the bulleted observations above hold readily, and an algorithm satisfying them can be produced. The fourth bulleted observation, however, leads to different implementations/algorithms for different data types, and thus determines the practical performance of the algorithm.
The prior work by Pradhan explores possible algorithms for two highly correlated data sources with a pre-determined maximum Hamming distance between them. Following the first three observations above (with data chunks now in place of pixels), the data chunks are placed into cosets such that the members of each coset are as distant from one another as possible [2], [3].
As an example, suppose the maximum Hamming distance between two 3-bit data sets, D1 and D2, is 1, that D1 is completely transmitted, and that D2 is to be only partially transmitted. The 8 possible values of D2 are then communicated by placing them into 4 cosets, each formed by grouping the two values that have a Hamming distance of 3 from each other. Since D1 and D2 have a Hamming distance of at most 1, the decoder can easily recover the correct D2 value: it is the member of the transmitted coset that is closest, in Hamming distance, to D1 [2].
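The 3-bit example above can be sketched as follows (a minimal illustration in Python; the function names and structure are ours, not from [2]). Each coset pairs a value with its bitwise complement, so the two members of a coset are at Hamming distance 3; the encoder transmits only the 2-bit coset index of D2, and the decoder picks the coset member nearest to D1:

```python
def hamming(a, b):
    """Hamming distance between two integers, counted over their bits."""
    return bin(a ^ b).count("1")

# The 4 cosets: each pairs a 3-bit value with its bitwise complement,
# so the two members of a coset are at Hamming distance 3.
cosets = [(v, v ^ 0b111) for v in range(4)]  # [(0,7), (1,6), (2,5), (3,4)]

def coset_index(d2):
    """Encoder side: transmit only the 2-bit coset index of D2."""
    for i, members in enumerate(cosets):
        if d2 in members:
            return i

def decode(index, d1):
    """Decoder side: pick the coset member closest (in Hamming distance) to D1."""
    return min(cosets[index], key=lambda v: hamming(v, d1))

# If D1 and D2 differ in at most 1 bit, decoding recovers D2 exactly:
d1, d2 = 0b101, 0b100            # Hamming distance 1
idx = coset_index(d2)            # only 2 bits sent instead of 3
assert decode(idx, d1) == d2
```

Note that the coset construction works because the minimum distance within a coset (3) exceeds twice the maximum allowed distance between D1 and D2 (1), so the nearest-member rule can never pick the wrong coset member.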