pockmark - corrupt a data stream
Synopsis
Description
Options
Method
Examples
License
Copyright
Authors
pockmark [options...]
pockmark corrupts a data stream, usually to test error correction and other recovery codes. There are three modes of corruption: overwrite, insert, and delete. Only overwrite leaves the size of the data stream unchanged. pockmark reads from stdin and writes to stdout.
pockmark may be obtained as part of the drm_tools package from: http://sourceforge.net/projects/drmtools/
-method <N> Corruption method applied: 0 overwrite, 1 insert, 2 delete [default 0]
-maxgap <N> Largest number of blocks between corruptions [range 1-2147483647, default 64000]
-maxrun <N> Longest run of corrupted blocks [range 1-2147483647, default 1024]
-mingap <N> Smallest number of blocks between corruptions [range 0-maxgap, default 1]
-minrun <N> Shortest run of corrupted blocks [range 0-maxrun, default 1]
-bs <N> Block size in bytes [default 1, i.e. act on single bytes]
-fill <N> Integer inserted or overwritten at corrupted locations [range 0-255, default 0]
-table <N1>,<N2>,<N3>,...<NM> Table of M integers, one of which is inserted or overwritten at random at corrupted locations. May not be combined with -fill.
-seed <N> Large integer seed for the random number generator [default is time in seconds]
-text Treat the input as lines of text. CR and LF bytes are not modified. Only with -bs 1.
-safe SAFE In -text mode, do not modify lines which start with any of the characters in the SAFE string.
-h -help --help -? --?? Print the help message. (Default - do not print help message.)
-hmethod Print an explanation of the method.
-i Emit version, copyright, license and contact information. (Default - do not emit information.)
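The constraints stated in the option list can be collected into a small checker. The following is only a sketch of those documented rules, not pockmark's own argument parser: the names Options and check are hypothetical, and fill=None is an assumed way of recording that -fill was not given.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

INT_MAX = 2147483647  # upper bound quoted for -maxgap and -maxrun


@dataclass
class Options:
    """Hypothetical container mirroring the documented options and defaults."""
    method: int = 0
    maxgap: int = 64000
    maxrun: int = 1024
    mingap: int = 1
    minrun: int = 1
    bs: int = 1
    fill: Optional[int] = None            # None: -fill not given (documented default is 0)
    table: Optional[Sequence[int]] = None  # None: -table not given
    text: bool = False


def check(o: Options) -> List[str]:
    """Return a list of violated rules; an empty list means the options are consistent."""
    errors = []
    if o.method not in (0, 1, 2):
        errors.append("-method must be 0, 1 or 2")
    if not 1 <= o.maxgap <= INT_MAX:
        errors.append("-maxgap out of range")
    if not 1 <= o.maxrun <= INT_MAX:
        errors.append("-maxrun out of range")
    if not 0 <= o.mingap <= o.maxgap:
        errors.append("-mingap must be in 0-maxgap")
    if not 0 <= o.minrun <= o.maxrun:
        errors.append("-minrun must be in 0-maxrun")
    if o.fill is not None and not 0 <= o.fill <= 255:
        errors.append("-fill must be in 0-255")
    if o.fill is not None and o.table is not None:
        errors.append("-table may not be combined with -fill")
    if o.text and o.bs != 1:
        errors.append("-text requires -bs 1")
    return errors
```

With the defaults, check(Options()) returns an empty list. Combining -fill with -table, or -text with a block size other than 1, each produces one message.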
pockmark reads data as blocks from stdin and writes them to stdout. As it does so it alternates between two states called RUN and GAP. The parameters which affect this process are:
bs block size (default value is 1 byte)
maxrun the longest allowed RUN
maxgap the longest allowed GAP
fill a single substitution value
table a table of substitution values
When it finishes a RUN it chooses a gap size which is
    bs * ( mingap + (rand() / (2147483647 / maxgap)))
unless maxgap-mingap < 2, in which case it is
    bs * ( rand() > RAND_MAX/2 ? maxgap : mingap)
When it finishes a GAP it chooses a run size which is
    bs * ( minrun + (rand() / (2147483647 / maxrun)))
unless maxrun-minrun < 2, in which case it is
    bs * ( rand() > RAND_MAX/2 ? maxrun : minrun)
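The size formulas above can be transcribed into a short Python sketch to see what values they produce. It takes the expressions literally as written here and is not taken from the pockmark source. The function name choose_length is hypothetical, and RAND_MAX is assumed to equal 2147483647, the constant that appears in the formula.

```python
import random

RAND_MAX = 2147483647  # assumption: rand() returns 0..2**31-1, matching the constant in the formula


def choose_length(r: int, lo: int, hi: int, bs: int = 1) -> int:
    """Return a gap or run size in bytes for one rand() draw r.

    lo and hi stand for mingap/maxgap (gap size) or minrun/maxrun (run size).
    """
    if hi - lo < 2:
        # Degenerate range: choose one of the two endpoints.
        return bs * (hi if r > RAND_MAX // 2 else lo)
    # C integer division of non-negative ints is floor division in Python.
    return bs * (lo + r // (RAND_MAX // hi))


if __name__ == "__main__":
    draw = random.randint(0, RAND_MAX)  # stand-in for one call to rand()
    print(choose_length(draw, lo=1, hi=64000, bs=1))
```

Read literally, the random term alone ranges from 0 up to hi, so the result can be as large as bs * (lo + hi). Whether the program clamps this is not stated here.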
There are three corruption methods:
Overwrite: Blocks in the RUN state are overwritten, byte by byte, with either the single FILL value or values selected at random from the table. Blocks in the GAP state are passed unchanged.
Insert: A run's worth of blocks is inserted, each byte being either the single FILL value or a value selected at random from the table. Blocks in the GAP state are passed unchanged.
Delete: Blocks in the RUN state are deleted. Blocks in the GAP state are passed unchanged.
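The three methods differ only in what happens during a RUN, which a small model makes concrete. This is a sketch under stated assumptions, not the program's implementation: the function name corrupt is hypothetical, segment sizes are passed in as an explicit list instead of being drawn from rand(), the stream is assumed to start in the GAP state, and only a single fill value is modeled (no table).

```python
def corrupt(data: bytes, method: str, lengths, fill: int = 0) -> bytes:
    """Apply alternating GAP and RUN segments to data.

    lengths lists segment sizes in bytes in the order gap, run, gap, run, ...
    Data left over after the last listed segment is passed unchanged.
    """
    out = bytearray()
    pos = 0
    in_gap = True  # assumption: processing begins with a GAP
    for n in lengths:
        if pos >= len(data):
            break
        if in_gap:
            out += data[pos:pos + n]          # GAP: pass through unchanged
            pos += n
        elif method == "overwrite":
            chunk = data[pos:pos + n]
            out += bytes([fill]) * len(chunk)  # same size, contents replaced
            pos += len(chunk)
        elif method == "insert":
            out += bytes([fill]) * n           # new bytes; no input is consumed
        elif method == "delete":
            pos += n                           # input is consumed; nothing is written
        else:
            raise ValueError("method must be overwrite, insert or delete")
        in_gap = not in_gap
    out += data[pos:]
    return bytes(out)
```

For the input b"abcdefgh" with segments [2, 3, 2, 3] and fill 42 ('*'), overwrite gives b"ab***fg*", insert gives b"ab***cd***efgh", and delete gives b"abfg". Only overwrite preserves the length.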
% pockmark -h
List the command line options.
% cat file | pockmark -maxgap 1000 -maxrun 20 -bs 512 > file.pox
Mimic corruption on a block-oriented device like a CD-ROM. Make a corrupted version of the file with block size 512, runs of up to 20 corrupted blocks, and gaps between corrupted regions of up to 1000 blocks. Corrupted regions are filled with null bytes.
% cat file | pockmark -maxgap 1000 -maxrun 20 -fill 36 > file.pox
Mimic corruption on a serial line or other byte-oriented device. Make a corrupted version of the file with runs of up to 20 corrupted bytes, and gaps between corrupted regions of up to 1000 bytes. Corrupted positions are filled with dollar sign characters (ASCII 36).
% cat DNA.fasta | pockmark -text -safe '>' -maxgap 100 -maxrun 20 -table 65,67,71,84 > DNA_pox.fasta
Make point mutations in a multiline DNA sequence file. The changes will be in runs of up to 20 bases with gaps between the changes of up to 100 bases. (The ASCII codes for the letters ACGT are provided in the -table parameter.) The length of the modified sequences will not be changed. Header lines in FASTA files start with '>' and these will also not be modified. Use multiple piped pockmark operations to also add insertions or deletions. Overwrite and delete runs that are not fully consumed before the end of a line will continue onto the next sequence, if any.
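The behaviour described for -text, -safe and -table in overwrite mode can be modeled in a few lines. This is a sketch under stated assumptions, not pockmark's code: the function name mutate_text is hypothetical, segment sizes are supplied as a list instead of being drawn at random, the stream is assumed to start in a GAP, and CR/LF bytes and protected lines are assumed to pass through without using up any of the current run or gap.

```python
import random


def mutate_text(text: bytes, lengths, table, safe: bytes = b">", seed: int = 0) -> bytes:
    """Overwrite runs of bytes with random table values, leaving CR, LF and safe lines alone.

    lengths lists segment sizes in the order gap, run, gap, run, ...
    """
    rng = random.Random(seed)
    sizes = iter(lengths)
    in_gap = True                 # assumption: processing begins with a GAP
    left = next(sizes, None)      # None: no segments remain, pass the rest unchanged
    out = bytearray()
    for line in text.splitlines(keepends=True):
        protected = line[0] in safe   # line starts with one of the SAFE characters
        for b in line:
            if protected or b in b"\r\n":
                out.append(b)         # assumption: does not count against the segment
                continue
            while left == 0:          # segment used up: switch state
                in_gap, left = not in_gap, next(sizes, None)
            if left is None or in_gap:
                out.append(b)
            else:
                out.append(rng.choice(table))
            if left is not None:
                left -= 1
    return bytes(out)
```

With the input b">seq1\nAAAAAAAA\nAAAA\n", segments [6, 4] and the one-entry table [67] ('C'), the result is b">seq1\nAAAAAACC\nCCAA\n". The header is untouched, the run carries over the line break, and the length is unchanged.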
GNU General Public License 2
Copyright (C) 2015 David Mathog and Caltech.
David Mathog, Biology Division, Caltech <mathog@caltech.edu>
drm_tools | pockmark (1) | 1.0.5 Jul 01 2015 |