waffles_plot
A command-line tool for plotting and visualizing datasets. Here's the usage information:
Full Usage Information
[Square brackets] are used to indicate required arguments.
<Angled brackets> are used to indicate optional arguments.
waffles_plot [command]
Visualize data, plot functions, make charts, etc.
bar [dataset] <options>
Make a bar chart using one row of data from the specified dataset. Prints
the chart as an SVG file to stdout.
[dataset]
The filename of a dataset for the bar chart. The dataset should
contain only continuous attributes. It only needs to have one row
since the other rows are ignored.
<options>
-range [min] [max]
Specify the min and max values to show on the chart. (The default
is to compute the range automatically.)
-row [n]
Specify which row in the dataset to use. (The default is 0.)
-pad [d]
Specify how much range to show beyond the min and max values when
the range is determined automatically. This value is ignored if the
range is specified explicitly.
-thickness [t]
Specify the thickness of the bars. (Note that the whole chart will
be stretched to fit the width, so adjusting the width may also
affect bar thickness.)
-spacing [s]
Specify how much space to place between the bars.
-textsize [d]
Specify the size of the font to use for the text labels.
-marks [n]
Specify the maximum number of horizontal lines to use to mark
positions on the vertical axis. (Set to 0 if you do not want any
markings.)
-size [width] [height]
Specify the size of the chart. (The default is 960 540.)
-labels [l0] [l1] [l2] [etc]
Specify label strings to use instead of the attribute names. The
number of labels specified should match the number of columns in
the data.
equation <options> [equations]
Plot an equation (or multiple equations) in 2D. Output is printed to
stdout as an SVG file.
<options>
-size [width] [height]
Specify the size of the chart. (The default is 960 540.)
-margin [size]
Specify the size of the margin for the axis labels. (The default is
100.)
-horizmarks [n]
Specify the maximum number of vertical lines to draw to mark
position along the horizontal axis.
-vertmarks [n]
Specify the maximum number of horizontal lines to draw to mark
position along the vertical axis.
-range [xmin] [ymin] [xmax] [ymax]
Set the range. (The default is: -10 -5 10 5.)
-nohmarks
Do not draw any vertical lines to mark position on the horizontal
axis.
-novmarks
Do not draw any horizontal lines to mark position on the vertical
axis.
-notext
Do not draw any text labels.
-nogrid
Do not draw any horizontal or vertical grid lines.
-aspect
Adjust the range to preserve the aspect ratio. In other words, make
sure that both axes visually have the same scale.
-thickness [size]
Specify the thickness of the lines.
[equations]
A set of equations separated by semicolons. Since '^' is a special
character for many shells, it's usually a good idea to put your
equations inside quotation marks. Here are some examples:
"f1(x)=3*x+2"
"f1(x)=(g(x)+1)/g(x); g(x)=sqrt(x)+pi"
"h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2"
Only functions named 'f' followed by a number are plotted, starting
with 'f1'. Plotting stops at the first number in ascending order that
is not defined. You may define any number of helper functions or
constants with any name you like. Built-in constants include e and pi.
Built-in functions include: +, -, *, /, %, ^, abs, acos, acosh, asin,
asinh, atan, atanh, ceil, cos, cosh, erf, floor, gamma, lgamma, log,
max, min, normal, sin, sinh, sqrt, tan, and tanh. These generally have
the same meaning as in C, except that '^' means exponentiation, "gamma"
is the gamma function, "normal" is the standard normal pdf, and max and
min accept any number (>=1) of parameters. (Some of these functions may
not be available on Windows, but most of them are.) You can override
any built-in constant or function with your own variable or function,
so you don't need to worry much about name collisions. Variable names
must begin with an alphabetic character or an underscore.
Multiplication is never implicit, so you must use a '*' character to
multiply. Whitespace is ignored.
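The rule for which functions get plotted can be illustrated with a short Python sketch. It only mirrors the naming rule described above (f1, f2, ... until the first gap); it does not evaluate anything, and the function name plotted_functions is hypothetical, not part of waffles_plot.

```python
import re


def plotted_functions(equations):
    """Return the names f1, f2, ... that the naming rule would select.

    Illustrative only: this mimics the documented rule, not the tool's parser.
    """
    # Whitespace is ignored, so remove it before splitting.
    compact = re.sub(r"\s+", "", equations)

    defined = set()
    for definition in compact.split(";"):
        left, sep, _ = definition.partition("=")
        if not sep:
            continue  # not a definition (e.g. an empty trailing piece)
        # "f1(x)" defines the name "f1"; "foo" defines the constant "foo".
        defined.add(left.split("(", 1)[0])

    # Collect f1, f2, ... and stop at the first number that is not defined.
    plotted = []
    n = 1
    while "f%d" % n in defined:
        plotted.append("f%d" % n)
        n += 1
    return plotted


print(plotted_functions("f1(x)=x; f2(x)=x^2; f4(x)=x^3"))
```

In this example f4 is defined but never selected, because f3 is missing and the search stops there.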
graph
Open an interactive graphing tool in the web browser.
histogram [dataset] <options>
Make a histogram. Print the plot to stdout in SVG format.
[dataset]
The filename of a dataset for the histogram.
<options>
-size [width] [height]
Specify the size of the chart. (The default is 1024 1024.)
-attr [index]
Specify which attribute is charted. (The default is 0.)
-range [xmin] [xmax] [ymax]
Specify the range of the histogram plot. (Note that ymin is always
0.)
printdecisiontree [model-file] <dataset> <data_opts>
Print a textual representation of a decision tree to stdout.
[model-file]
The filename of a trained decision tree model. (You can make one with
the command "waffles_learn train [dataset] decisiontree >
[filename]".)
<dataset>
An optional filename of the arff file that was used to train the
decision tree. The data in this file is ignored, but the meta-data
will be used to make the printed model richer.
<data_opts>
-labels [attr_list]
Specify which attributes to use as labels. (If not specified, the
default is to use the last attribute for the label.) [attr_list] is
a comma-separated list of zero-indexed columns. A hyphen may be used
to specify a range of columns. A '*' preceding a value means to
index from the right instead of the left. For example, "0,2-5"
refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last
column. "0-*1" refers to all but the last column.
-ignore [attr_list]
Specify attributes to ignore. [attr_list] is a comma-separated list
of zero-indexed columns. A hyphen may be used to specify a range of
columns. A '*' preceding a value means to index from the right
instead of the left. For example, "0,2-5" refers to columns 0, 2,
3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all
but the last column.
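The [attr_list] syntax can be made concrete with a small Python sketch that expands a specification into column indexes. This is an illustration of the rules as described here, not the tool's own parser; the name parse_attr_list and the num_columns parameter are assumptions introduced for the example.

```python
def parse_attr_list(spec, num_columns):
    """Expand an attr_list such as "0,2-5" into a list of column indexes.

    Illustrative only: follows the documented rules, with no error handling.
    """

    def index(token):
        # A leading '*' means "index from the right": "*0" is the last column.
        if token.startswith("*"):
            return num_columns - 1 - int(token[1:])
        return int(token)

    columns = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A hyphen gives an inclusive range of columns.
            first, last = part.split("-", 1)
            columns.extend(range(index(first), index(last) + 1))
        else:
            columns.append(index(part))
    return columns


# With a hypothetical dataset of 6 columns:
print(parse_attr_list("0,2-5", 6))
print(parse_attr_list("*0", 6))
print(parse_attr_list("0-*1", 6))
```

With six columns, "*0" resolves to column 5 and "0-*1" to columns 0 through 4, matching the examples in the text.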
printrandomforest [model-file] <dataset> <data_opts>
Print a textual representation of a random forest to stdout.
[model-file]
The filename of a trained random forest model. (You can make one with
the command "waffles_learn train [dataset] randomforest [trees] >
[filename]".)
<dataset>
An optional filename of the arff file that was used to train the
random forest. The data in this file is ignored, but the meta-data
will be used to make the printed model richer.
<data_opts>
-labels [attr_list]
Specify which attributes to use as labels. (If not specified, the
default is to use the last attribute for the label.) [attr_list] is
a comma-separated list of zero-indexed columns. A hyphen may be used
to specify a range of columns. A '*' preceding a value means to
index from the right instead of the left. For example, "0,2-5"
refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last
column. "0-*1" refers to all but the last column.
-ignore [attr_list]
Specify attributes to ignore. [attr_list] is a comma-separated list
of zero-indexed columns. A hyphen may be used to specify a range of
columns. A '*' preceding a value means to index from the right
instead of the left. For example, "0,2-5" refers to columns 0, 2,
3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all
but the last column.
scatter <globalopts> <color-data-x-y>
Make a scatter plot or line graph. Print the resulting SVG file to
stdout.
<globalopts>
-size [width] [height]
Specify the size of the chart. (The default is 960 540.)
-margin [size]
Specify the size of the margin for the axis labels. (The default is
100.)
-horizmarks [n]
Specify the maximum number of vertical lines to draw to mark
position along the horizontal axis.
-vertmarks [n]
Specify the maximum number of horizontal lines to draw to mark
position along the vertical axis.
-pad [n]
Specify the ratio of extra space to include in the range of the
chart beyond the most-extreme points. (This value is only used if
the range is auto-determined. It is ignored if the range is
specified explicitly.)
-range [xmin] [ymin] [xmax] [ymax]
Set the range for the chart. (The default is to determine the range
automatically.)
-logx
Show the horizontal axis on a logarithmic scale
-logy
Show the vertical axis on a logarithmic scale
-nohmarks
Do not draw any vertical lines to mark position on the horizontal
axis.
-novmarks
Do not draw any horizontal lines to mark position on the vertical
axis.
-nogrid
Do not draw any horizontal or vertical grid lines.
-noserifs
Use a font with no serifs. (This generally makes charts look a
little cleaner.)
-hlabel [string]
Specify a label for the horizontal axis. (The default is to
determine it from the data.)
-vlabel [string]
Specify a label for the vertical axis. (The default is to determine
it from the data.)
-aspect
Adjust the range to preserve the aspect ratio. In other words, make
sure that both axes visually have the same scale.
<color-data-x-y>
[color] [dataset] [attr-x] [attr-y] <options>
[color]
Specify the color to use for this pair of attributes.
row
Use a spectrum color according to the row-index in the data
(starting with red, ending with purple)
#800000
Red.
red
The same as #800000.
pink
The same as #ffc0c0.
peach
The same as #ffc080.
orange
The same as #ff8000.
brown
The same as #a06000.
yellow
The same as #d0d000.
green
The same as #008000.
cyan
The same as #008080.
blue
The same as #000080.
purple
The same as #8000ff.
magenta
The same as #800080.
black
The same as #000000.
gray
The same as #808080.
[dataset]
The filename of a dataset containing the data you want to plot.
(Note that you will need to specify the dataset for each color,
even if they all come from the same dataset.)
[attr-x]
The zero-based index of the attribute to use to specify position
on the horizontal axis. (Alternatively, the special value "row"
may be used to use the row-index instead of an attribute for the
horizontal axis.)
[attr-y]
The zero-based index of the attribute to use to specify position
on the vertical axis. (Alternatively, the special value "row"
may be used to use the row-index instead of an attribute for the
vertical axis.)
<options>
-radius
Specify the radius (in window units) to use for each point.
-thickness
Specify the thickness (in window units) of the lines to use
to connect the points. (Use 0 if you want a scatter plot with
no connecting lines.)
percentsame [dataset1] [dataset2]
Given two data files, count the identical values that occur at the same
position in both datasets, and print the result as a percentage for each
column. The data files must have the same number and type of attributes
as well as the same number of rows.
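As a rough illustration of this comparison, the following Python sketch computes a per-column match percentage for two in-memory tables. The function name percent_same and the list-of-rows representation are assumptions made for the example; the tool itself reads dataset files, and its handling of edge cases (such as missing values) may differ.

```python
def percent_same(rows_a, rows_b):
    """Return, for each column, the percentage of rows whose values match.

    Illustrative only: both tables are lists of equally sized rows.
    """
    if len(rows_a) != len(rows_b):
        raise ValueError("datasets must have the same number of rows")
    if any(len(a) != len(b) for a, b in zip(rows_a, rows_b)):
        raise ValueError("datasets must have the same number of attributes")
    if not rows_a:
        return []

    num_rows = len(rows_a)
    num_cols = len(rows_a[0])
    percents = []
    for col in range(num_cols):
        # Count rows where both datasets hold the same value in this column.
        matches = sum(1 for a, b in zip(rows_a, rows_b) if a[col] == b[col])
        percents.append(100.0 * matches / num_rows)
    return percents


first = [[1, "a"], [2, "b"], [3, "c"], [4, "d"]]
second = [[1, "a"], [2, "x"], [0, "c"], [4, "y"]]
print(percent_same(first, second))
```

Here the first column matches in three of four rows and the second in two of four.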
semanticmap [model-file] [dataset] <options>
Write an SVG file representing a semantic map for the given
self-organizing map processing the given dataset. For each node n, a
semantic map plots, at n's location in the map, one attribute (usually a
class label) of the entry of the input data to which n responds most
strongly.
[model-file]
The self-organizing map output from "waffles_transform som".
[dataset]
Data for the semantic map in .arff format. Any attributes over the
number needed for input to the self-organizing map are ignored in
determining som node responses.
<options>
-out [filename]
Write the SVG file to [filename]. The default is "semantic_map.svg".
-labels [column]
Use the attribute column for labeling. Column is a zero-based
index into the attributes. The default is to use the last column.
-variance
Label the nodes with the variance of the label column values for
their winning dataset entries. If the label column is a variable
being predicted, then its variance is related to the predictive
power of that node. Higher variance means lower predictive power.
stats [dataset] <options>
Prints some basic stats about the dataset to stdout.
[dataset]
The filename of a dataset.
<options>
-all
Print stats for all attributes, even if there are a lot of them.
calcerror [dataset] <options> <col1-col2>
Prints an error metric between two columns.
[dataset]
The filename of a dataset.
<options>
-m [metric]
The error metric to use.
SSE
Sum-squared error metric.
MAPE
Mean absolute percent error metric.
<col1-col2>
[col1] [col2]
[col1]
The zero-based index of the attribute to use as the actual
value.
[col2]
The zero-based index of the attribute to use as the predicted
value.
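The two metrics can be sketched in Python using their usual textbook definitions. That the tool computes exactly these formulas is an assumption: in particular, whether it scales MAPE by 100 and how it treats actual values of zero are not stated here. The function names sse and mape are hypothetical.

```python
def sse(actual, predicted):
    """Sum-squared error: the sum of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percent error, expressed as a percentage.

    Assumes no actual value is zero (the ratio would be undefined).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


actual = [10.0, 20.0, 40.0]      # e.g. the values in column [col1]
predicted = [12.0, 18.0, 40.0]   # e.g. the values in column [col2]
print(sse(actual, predicted))
print(mape(actual, predicted))
```

For these values the squared differences are 4, 4, and 0, and the relative errors are 20%, 10%, and 0%.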
usage
Print usage information.