Program generator version 0.9.5
Initial revision 2003-01-19; Last revision 2004-05-31
1 Download
2 File readme
3 Usage and options summary
4 Description
5 Project revision history
6 License
1 Download
Sources: src/generator-0.9.5.tgz [31 Kb]
Win9x-EXE (MinGW cross-compiled): mingw/generator.zip [22 Kb]
2 File readme
generator --- random text generator based on a given model text
SUPPORTED ENVIRONMENTS
http://www.gnu.org GNU/Linux
http://www.mingw.org MinGW --- Minimalist GNU For Windows
COMPILATION
Run make (or gmake) in the directory where the sources reside
BRIEF INSTRUCTION
This program generates random text that resembles a given model
text. The call
generator file.txt
writes random text similar to file.txt to standard output.
License conditions are described in the file LICENSE.txt
3 Usage and options summary
user@computer$ ./generator --help
Usage: generator [OPTION]... FILE
-o, --order <num=2> the maximal order for the model
-s, --seed <num=1> seed the model with integer <num>
-c, --catch-eof <num=1> stop output on meeting EOF if 1; no stop if 0
-g, --generator <num=1> select random number generator 1..3
-b, --bytes <num> output <num> bytes (warning: sets -c0)
-k, --kbytes <num> output <num>*1024 bytes (warning: sets -c0)
-r, --randomize the seed is chosen using the current time
-n, --naive-sort use naive sort (decrease memory use but slower)
-q, --quiet do not send any messages to stderr
-h, --help display this help and exit
-m, --man display complete description
-v, --version display version and exit
4 Description
user@computer$ ./generator --man
<Usage information from the previous section is omitted>
This program generates random output using statistics from the file
FILE. It uses a Markov model of order <order>, set by the option
--order (2 by default), to output the next symbol given the current
context of length <order>. If the current context is not present in
FILE, the context is shortened and the program falls back to a Markov
model of lower order. In the limit it arrives at order 0 and outputs a
randomly chosen symbol from FILE. If --catch-eof=0 (-c0), the program
outputs -b bytes or -k kilobytes, or never stops if neither -b nor -k
was specified. If --catch-eof=1, the program stops as soon as it
encounters the context at the end of FILE; for <order>=0, the program
stops with probability 1/(size(FILE)+1) on each output symbol (this
way you can produce outputs of size comparable to size(FILE)). The
initial context is chosen at random.
You can set the initial seed for the random number generator with -s,
or randomize it from the current time with the -r option (in this case
the number used is written to STDERR). Three random number generators
are available through the option -g<num>. All of them were taken from
the book "Numerical Recipes in C, 2nd edition":
-g1 (default) "Minimal" random number generator of Park and Miller
with Bays-Durham shuffle and added safeguards.
-g2 Long-period (>2E18) random number generator of L'Ecuyer with
Bays-Durham shuffle.
-g3 Knuth's random number generator using the subtractive method (see
"Seminumerical Algorithms", 2nd edition, vol. 2 of "The Art of
Computer Programming", sections 3.2-3.3).
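For orientation, the idea behind -g1 can be sketched in a few lines of
Python. This is an illustrative sketch only, not the program's code
and not the routine from the book: the multiplier 16807 and the
modulus 2^31-1 are the standard Park-Miller constants, while the class
name, the table size of 32 and the way the table is initialised are
assumptions made for this example.

```python
M = 2**31 - 1   # modulus of the Park-Miller "minimal standard" generator
A = 16807       # multiplier, 7**5


def park_miller_step(state):
    """Advance the plain Park-Miller generator by one step."""
    return (A * state) % M


class ShuffledParkMiller:
    """Park-Miller generator wrapped in a Bays-Durham shuffle (sketch).

    The class name and the table initialisation are hypothetical choices
    for this example, not taken from the generator sources.
    """

    def __init__(self, seed=1, table_size=32):
        if not 0 < seed < M:
            raise ValueError("seed must lie in 1..M-1")
        self.state = seed
        self.table = []
        # Fill the shuffle table with consecutive outputs.
        for _ in range(table_size):
            self.state = park_miller_step(self.state)
            self.table.append(self.state)
        # One more value selects the first table slot.
        self.state = park_miller_step(self.state)
        self.last = self.state

    def next_int(self):
        """Return the next shuffled value in the range 1..M-1."""
        # The previous output picks which table entry is emitted next.
        j = self.last * len(self.table) // M
        self.last = self.table[j]
        # Refill the used slot with a fresh value from the base generator.
        self.state = park_miller_step(self.state)
        self.table[j] = self.state
        return self.last


if __name__ == "__main__":
    rng = ShuffledParkMiller(seed=1)
    print([rng.next_int() for _ in range(3)])
```

The shuffle does not change which values the base generator produces,
only the order in which they are emitted, which is meant to break up
correlations between consecutive outputs.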
We use the Larsson-Sadakane algorithm for suffix sorting, described in
"Faster Suffix Sorting" by N. Jesper Larsson (jesper@cs.lth.se) and
Kunihiko Sadakane (sada@is.s.u-tokyo.ac.jp). It requires 9*size(FILE)
memory. Memory requirements can be reduced with the option -n, which
selects a naive suffix sort based on the system qsort function. In
that case memory requirements drop to 5*size(FILE), at the cost of a
slowdown by a factor of 4. However, the system qsort may itself
require a lot of memory, in particular on the stack, which might lead
to errors in sorting.
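The back-off rule and the two stopping modes described above can be
illustrated with a short Python sketch. It is a simplified model
written for this page, not the program's implementation: it scans the
text directly instead of using a sorted suffix array, and the names
next_symbol, generate and max_bytes are hypothetical. Treating "the
context at the end of FILE" as one extra end-of-file candidate is an
interpretation; it does reproduce the stated stopping probability
1/(size(FILE)+1) at order 0.

```python
import random

EOF = None  # marker for "this occurrence of the context ends the file"


def next_symbol(text, context, rng, catch_eof):
    """Pick the symbol after a random occurrence of `context` in `text`.

    If the context has no usable occurrence, drop its oldest symbol and
    retry with a lower order. `text` must be non-empty.
    """
    while True:
        k = len(context)
        candidates = []
        for i in range(len(text) - k + 1):
            if text.startswith(context, i):
                if i + k < len(text):
                    candidates.append(text[i + k])
                elif catch_eof:
                    candidates.append(EOF)
        if candidates:
            return rng.choice(candidates)
        context = context[1:]  # fall back to a shorter context


def generate(text, order=2, seed=1, catch_eof=True, max_bytes=None):
    """Generate text from an order-`order` model of `text` (sketch)."""
    if not text:
        raise ValueError("model text must not be empty")
    if not catch_eof and max_bytes is None:
        raise ValueError("without catch_eof a byte limit is required")
    rng = random.Random(seed)
    start = rng.randrange(len(text))      # initial context chosen at random
    context = text[start:start + order]
    out = []
    while max_bytes is None or len(out) < max_bytes:
        sym = next_symbol(text, context, rng, catch_eof)
        if sym is EOF:
            break
        out.append(sym)
        # Keep only the last `order` symbols as the new context.
        context = (context + sym)[-order:] if order > 0 else ""
    return "".join(out)


if __name__ == "__main__":
    model = "the cat sat on the mat and the cat ate the rat"
    print(generate(model, order=2, seed=1, catch_eof=False, max_bytes=60))
```

Unlike the real program, this sketch refuses to run without a byte
limit when catch_eof is off, since it would otherwise never return.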
5 Project revision history
Files of the project were modified on the following dates:
2003-01-19
2003-02-08
2003-05-16
2003-05-18
2003-08-27
2004-05-31
6 License
generator
Available at http://www.math.toronto.edu/dkhmelev/PROGS/tacu/
Author:
Dmitry V. Khmelev
dkhmelev((at))math.toronto.edu
[change ((at)) to @ in order to get proper address - antispam]
University of Toronto,
Department of Mathematics,
100 St George Street,
M5S 3G3 ON,
Canada
LICENSING TERMS
This program is granted free of charge for research and education
purposes. However you must obtain a license from the author to use it
for commercial purposes.
Scientific results produced using the software provided shall
acknowledge the use of generator. The proper reference is:
D. Khmelev, Text Analysis and Conversion Utilities
http://www.math.toronto.edu/dkhmelev/PROGS/tacu/
Moreover, the author of generator shall be informed about the
publication.
The software must not be modified and distributed without prior
permission of the author.
By using generator you agree to the licensing terms.
NO WARRANTY
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT
WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER
PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF
THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO
LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED
OF THE POSSIBILITY OF SUCH DAMAGES.