
I'm self-studying information theory, and I have a few questions about entropy and encoding as a whole. I'm trying to compress a simple sequence of 16-bit signed integer values as well as I can.

I learned that entropy is a lower bound on the expected length (in bits) of a coded sequence. However, is that true in an absolute sense? For example, whenever I delta encode (store the first value plus the difference between each value and the next), I get smaller numbers (which, by the pigeonhole principle, means more values repeat) and a lower measured entropy. That is, I seem to have exploited a property of the data and losslessly encoded it below its original entropy. A sketch of what I mean by delta encoding is below.
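Concretely, here is a minimal sketch of the delta encoding step (the data and variable names are toy examples for illustration, not my real signal):

```python
import numpy as np

# A toy, slowly varying 16-bit signal (illustrative data only).
t = np.arange(10_000)
samples = (3000 * np.sin(t / 50.0)).astype(np.int16)

# Delta encode: keep the first sample, then successive differences.
deltas = np.empty(samples.shape, dtype=np.int32)  # wider type avoids overflow
deltas[0] = samples[0]
deltas[1:] = np.diff(samples.astype(np.int32))

# Lossless decode: a running sum restores the original sequence exactly.
restored = np.cumsum(deltas).astype(np.int16)
assert np.array_equal(restored, samples)
```

Because neighbouring samples are close, the deltas cluster around zero, and that is where the apparent entropy drop comes from.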

My question is: is there a sure way to determine the absolute minimum-entropy encoding for a source? If not, is there an algorithm that searches for it? Is this a solved problem?

Thank you!


1 Answer


It's not possible to losslessly compress, on average, to fewer bits than the entropy of the source. See https://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem.
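In symbols: for any lossless, uniquely decodable symbol code over a source with symbol distribution $p$, the expected codeword length satisfies

$$\mathbb{E}[\ell(X)] \;\ge\; H(X) \;=\; -\sum_{x} p(x)\,\log_2 p(x).$$

The bound is per symbol and in expectation; Huffman coding gets within one bit of it, and block or arithmetic coding gets arbitrarily close.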

It is more likely that you are not measuring the entropy correctly. Entropy depends on the distribution of the source. If you compute it under the assumption that the input is uniformly distributed, but your data is actually non-uniform, then you will be able to compress the actual data to fewer bits than the uniform-assumption figure predicts.
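As a rough illustration (toy data and the helper function are mine, not from your post): the zeroth-order empirical entropy, computed from the histogram of observed values, drops sharply when you switch from the raw samples to their deltas. The model changed, not the theorem.

```python
import numpy as np

def empirical_entropy(values):
    """Zeroth-order empirical entropy (bits/symbol) of the value histogram."""
    _, counts = np.unique(np.asarray(values), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# A toy slowly varying 16-bit signal, like the sketch in the question.
samples = (3000 * np.sin(np.arange(10_000) / 50.0)).astype(np.int16)
deltas = np.diff(samples.astype(np.int32))

print("uniform 16-bit model:       16.00 bits/sample")
print(f"empirical H of raw samples: {empirical_entropy(samples):5.2f} bits/sample")
print(f"empirical H of deltas:      {empirical_entropy(deltas):5.2f} bits/sample")
```

Each of these numbers is the entropy of a different model of the same data; delta coding just moves you to a model that fits the data better.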

You haven't said how you computed the entropy, so it's not possible to give a more specific diagnosis, but we already know that beating the true entropy of the source is impossible (it would be like claiming to have invented a perpetual motion machine).

