Wednesday, March 18, 2015

[isktejea] Data in pieces

Given some data, encode it into a collection of small pieces such that all (or most) sufficiently large subsets of pieces can recover the entirety of original data.

Unlike in the traditional error correcting code problem, the pieces are not presented to the decoder with any ordering information.  The pieces could contain ordering information, e.g., a sequence number, within each piece, but that is the responsibility of the encoding algorithm. Sequence number may not be the most efficient way to do it.

The decoder also is not directly told how many total pieces the original data had.  Again, this could be encoded in each piece, but this might not be efficient.

Unlike traditional error correcting codes, there is no possibility that a piece has errors in it (flipped bits). Every recovered piece is pristine; the only difficulty is the lost pieces.  Therefore, we (broadly) only need to worry about erasures (missing pieces) and not errors (flipped bits).

Two possible error models: The missing pieces could be assumed selected uniformly randomly (so we would like high probability that the original data will be recovered), or an adversary could select the worst possible subset of pieces to withhold.

Vaguely think something involving graphs might work well.

No comments :