waffles_plot

A command-line tool for plotting and visualizing datasets. Here's the usage information:

Full Usage Information
[Square brackets] are used to indicate required arguments.
<Angled brackets> are used to indicate optional arguments.

waffles_plot [command]
   Visualize data, plot functions, make charts, etc.
   bar [dataset] <options>
      Make a bar chart using one row of data from the specified dataset. Prints
      the chart as an SVG file to stdout.
      [dataset]
         The filename of a dataset for the bar chart. The dataset should
         contain only continuous attributes. It only needs to have one row
         since the other rows are ignored.
      <options>
         -range [min] [max]
            Specify the min and max values to show on the chart. (The default
            is to compute the range automatically.)
         -row [n]
            Specify which row in the dataset to use. (The default is 0.)
         -pad [d]
            Specify how much range to show beyond the min and max values when
            the range is determined automatically. This value is ignored if the
            range is specified explicitly.
         -thickness [t]
            Specify the thickness of the bars. (Note that the whole chart will
            be stretched to fit the width, so adjusting the width may also
            affect bar thickness.)
         -spacing [s]
            Specify how much space to place between the bars.
         -textsize [d]
            Specify the size of the font to use for the text labels.
         -marks [n]
            Specify the maximum number of horizontal lines to use to mark
            positions on the vertical axis. (Set to 0 if you do not want any
            markings.)
         -size [width] [height]
            Specify the size of the chart. (The default is 960 540.)
         -labels [l0] [l1] [l2] [etc]
            Specify label strings to use instead of the attribute names. The
            number of labels specified should match the number of columns in
            the data.
   equation <options> [equations]
      Plot an equation (or multiple equations) in 2D. Output is printed to
      stdout as an SVG file.
      <options>
         -size [width] [height]
            Specify the size of the chart. (The default is 960 540.)
         -margin [size]
            Specify the size of the margin for the axis labels. (The default is
            100.)
         -horizmarks [n]
            Specify the maximum number of vertical lines to draw to mark
            position along the horizontal axis.
         -vertmarks [n]
            Specify the maximum number of horizontal lines to draw to mark
            position along the vertical axis.
         -range [xmin] [ymin] [xmax] [ymax]
            Set the range. (The default is: -10 -5 10 5.)
         -nohmarks
            Do not draw any vertical lines to mark position on the horizontal
            axis.
         -novmarks
            Do not draw any horizontal lines to mark position on the vertical
            axis.
         -notext
            Do not draw any text labels.
         -nogrid
            Do not draw any horizontal or vertical grid lines.
         -aspect
            Adjust the range to preserve the aspect ratio. In other words, make
            sure that both axes visually have the same scale.
         -thickness [size]
            Specify the thickness of the lines.
      [equations]
         A set of equations separated by semicolons. Since '^' is a special
         character for many shells, it's usually a good idea to put your
         equations inside quotation marks. Here are some examples:
     
         "f1(x)=3*x+2"
         "f1(x)=(g(x)+1)/g(x); g(x)=sqrt(x)+pi"
         "h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2"
  
         Only functions named 'f' followed by a number are plotted, starting
         with 'f1'. Plotting stops at the first number, in ascending order,
         that is not defined. You may define any number of helper functions
         or constants with any name you like. Built-in constants include e
         and pi. Built-in functions include +, -, *, /, %, ^, abs, acos,
         acosh, asin, asinh, atan, atanh, ceil, cos, cosh, erf, floor, gamma,
         lgamma, log, max, min, normal, sin, sinh, sqrt, tan, and tanh. These
         generally have the same meaning as in C, except that '^' means
         exponent, "gamma" is the gamma function, "normal" is the standard
         normal pdf, and max and min accept any number (>=1) of parameters.
         (Some of these functions may not be available on Windows, but most
         of them are.) You can override any built-in constant or function
         with your own variable or function, so you don't need to worry too
         much about name collisions. Variable names must begin with an
         alphabetic character or an underscore. Multiplication is never
         implicit, so you must use a '*' character to multiply. Whitespace is
         ignored.
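
The rule for which functions get plotted can be sketched in Python as follows. This is only an illustration of the behavior described above, not the parser that waffles_plot uses, and the helper name plotted_functions is hypothetical.

```python
def plotted_functions(equations):
    """Return the names f1, f2, ... that would be plotted, in order."""
    defined = set()
    for definition in equations.split(";"):
        # Whitespace is ignored, so remove it before inspecting the name.
        definition = "".join(definition.split())
        if not definition:
            continue
        left_side = definition.split("=", 1)[0]
        # "f1(x)" -> "f1"; a constant such as "foo" has no parameter list.
        defined.add(left_side.split("(", 1)[0])
    plotted = []
    n = 1
    # Stop at the first missing number in ascending order.
    while f"f{n}" in defined:
        plotted.append(f"f{n}")
        n += 1
    return plotted


print(plotted_functions(
    "h(bob)=bob^2;f1(x)=3+bar(x,5)*h(x)-(x/foo);bar(a,b)=a*b-b;foo=3.2"))
print(plotted_functions("f1(x)=x; f2(x)=x^2; f4(x)=x^3"))
```

In the second call, f4 is defined but would not be plotted, because f3 is missing and the search stops there.
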
   graph
      Opens an interactive graphing tool in the web browser.
   histogram [dataset] <options>
      Make a histogram. Print the plot to stdout in SVG format.
      [dataset]
         The filename of a dataset for the histogram.
      <options>
         -size [width] [height]
            Specify the size of the chart. (The default is 1024 1024.)
         -attr [index]
            Specify which attribute is charted. (The default is 0.)
         -range [xmin] [xmax] [ymax]
            Specify the range of the histogram plot. (Note that ymin is always
            0.)
   printdecisiontree [model-file] <dataset> <data_opts>
      Print a textual representation of a decision tree to stdout.
      [model-file]
         The filename of a trained decision tree model. (You can make one with
         the command "waffles_learn train [dataset] decisiontree >
         [filename]".)
      <dataset>
         An optional filename of the arff file that was used to train the
         decision tree. The data in this file is ignored, but the meta-data
         will be used to make the printed model richer.
      <data_opts>
         -labels [attr_list]
            Specify which attributes to use as labels. (If not specified, the
            default is to use the last attribute for the label.) [attr_list] is
            a comma-separated list of zero-indexed columns. A hyphen may be used
            to specify a range of columns. A '*' preceding a value means to
            index from the right instead of the left. For example, "0,2-5"
            refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last
            column. "0-*1" refers to all but the last column.
         -ignore [attr_list]
            Specify attributes to ignore. [attr_list] is a comma-separated list
            of zero-indexed columns. A hyphen may be used to specify a range of
            columns. A '*' preceding a value means to index from the right
            instead of the left. For example, "0,2-5" refers to columns 0, 2,
            3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all
            but the last column.
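
The [attr_list] notation can be illustrated with a short Python sketch that expands a list into column indexes. It follows the three examples given above; the function name parse_attr_list and the column_count parameter are hypothetical, and the real tool may validate its input differently.

```python
def parse_attr_list(attr_list, column_count):
    """Expand an attr_list string into zero-based column indexes."""

    def index(token):
        # A leading '*' counts from the right: "*0" is the last column.
        if token.startswith("*"):
            return column_count - 1 - int(token[1:])
        return int(token)

    columns = []
    for item in attr_list.split(","):
        if "-" in item:
            # A hyphen gives an inclusive range of columns.
            first, last = item.split("-", 1)
            columns.extend(range(index(first), index(last) + 1))
        else:
            columns.append(index(item))
    return columns


# Assume a dataset with 8 columns (indexes 0 through 7).
print(parse_attr_list("0,2-5", 8))
print(parse_attr_list("*0", 8))
print(parse_attr_list("0-*1", 8))
```
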
   printrandomforest [model-file] <dataset> <data_opts>
      Print a textual representation of a random forest to stdout.
      [model-file]
         The filename of a trained random forest model. (You can make one with
         the command "waffles_learn train [dataset] randomforest [trees] >
         [filename]".)
      <dataset>
         An optional filename of the arff file that was used to train the
         random forest. The data in this file is ignored, but the meta-data
         will be used to make the printed model richer.
      <data_opts>
         -labels [attr_list]
            Specify which attributes to use as labels. (If not specified, the
            default is to use the last attribute for the label.) [attr_list] is
            a comma-separated list of zero-indexed columns. A hyphen may be used
            to specify a range of columns. A '*' preceding a value means to
            index from the right instead of the left. For example, "0,2-5"
            refers to columns 0, 2, 3, 4, and 5. "*0" refers to the last
            column. "0-*1" refers to all but the last column.
         -ignore [attr_list]
            Specify attributes to ignore. [attr_list] is a comma-separated list
            of zero-indexed columns. A hyphen may be used to specify a range of
            columns. A '*' preceding a value means to index from the right
            instead of the left. For example, "0,2-5" refers to columns 0, 2,
            3, 4, and 5. "*0" refers to the last column. "0-*1" refers to all
            but the last column.
   scatter <globalopts> <color-data-x-y>
      Make a scatter plot or line graph. Print the resulting SVG file to
      stdout.
      <globalopts>
         -size [width] [height]
            Specify the size of the chart. (The default is 960 540.)
         -margin [size]
            Specify the size of the margin for the axis labels. (The default is
            100.)
         -horizmarks [n]
            Specify the maximum number of vertical lines to draw to mark
            position along the horizontal axis.
         -vertmarks [n]
            Specify the maximum number of horizontal lines to draw to mark
            position along the vertical axis.
         -pad [n]
            Specify the ratio of extra space to include in the range of the
            chart beyond the most-extreme points. (This value is only used if
            the range is auto-determined. It is ignored if the range is
            specified explicitly.)
         -range [xmin] [ymin] [xmax] [ymax]
            Set the range for the chart. (The default is to determine the range
            automatically.)
         -logx
            Show the horizontal axis on a logarithmic scale.
         -logy
            Show the vertical axis on a logarithmic scale.
         -nohmarks
            Do not draw any vertical lines to mark position on the horizontal
            axis.
         -novmarks
            Do not draw any horizontal lines to mark position on the vertical
            axis.
         -nogrid
            Do not draw any horizontal or vertical grid lines.
         -noserifs
            Use a font with no serifs. (This generally makes charts look a
            little cleaner.)
         -hlabel [string]
            Specify a label for the horizontal axis. (The default is to
            determine it from the data.)
         -vlabel [string]
            Specify a label for the vertical axis. (The default is to determine
            it from the data.)
         -aspect
            Adjust the range to preserve the aspect ratio. In other words, make
            sure that both axes visually have the same scale.
      <color-data-x-y>
         [color] [dataset] [attr-x] [attr-y] <options>
            [color]
               Specify the color to use for this pair of attributes.
               row
                  Use a spectrum color according to the row-index in the data
                  (starting with red, ending with purple).
               #800000
                  Red.
               red
                  The same as #800000.
               pink
                  The same as #ffc0c0.
               peach
                  The same as #ffc080.
               orange
                  The same as #ff8000.
               brown
                  The same as #a06000.
               yellow
                  The same as #d0d000.
               green
                  The same as #008000.
               cyan
                  The same as #008080.
               blue
                  The same as #000080.
               purple
                  The same as #8000ff.
               magenta
                  The same as #800080.
               black
                  The same as #000000.
               gray
                  The same as #808080.
            [dataset]
               The filename of a dataset containing the data you want to plot.
               (Note that you will need to specify the dataset for each color,
               even if they all come from the same dataset.)
            [attr-x]
               The zero-based index of the attribute to use to specify position
               on the horizontal axis. (Alternatively, the special value "row"
               may be used to use the row-index instead of an attribute for the
               horizontal axis.)
            [attr-y]
               The zero-based index of the attribute to use to specify position
               on the vertical axis. (Alternatively, the special value "row"
               may be used to use the row-index instead of an attribute for the
               vertical axis.)
            <options>
               -radius
                  Specify the radius (in window units) to use for each point.
               -thickness
                  Specify the thickness (in window units) of the lines to use
                  to connect the points. (Use 0 if you want a scatter plot with
                  no connecting lines.)
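
Because each series repeats the full [color] [dataset] [attr-x] [attr-y] group, a scatter command line can get long. The Python sketch below assembles one from the arguments documented above. It only builds the command string and does not run waffles_plot; the helper name scatter_command and the file name data.arff are hypothetical.

```python
import shlex


def scatter_command(series, global_options=()):
    """Build a 'waffles_plot scatter' command line.

    series is a list of (color, dataset, attr_x, attr_y) tuples.
    """
    args = ["waffles_plot", "scatter", *global_options]
    for color, dataset, attr_x, attr_y in series:
        # The dataset is repeated for every series, even for the same file.
        args += [color, dataset, str(attr_x), str(attr_y)]
    # shlex.join quotes arguments that contain shell-special characters.
    return shlex.join(args)


command = scatter_command(
    [("red", "data.arff", 0, 1), ("blue", "data.arff", "row", 2)],
    global_options=["-logy", "-nogrid"],
)
# Redirect stdout to keep the SVG output.
print(command + " > scatter.svg")
```

A hexadecimal color such as #800000 comes out quoted, since '#' starts a comment in many shells.
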
   percentsame [dataset1] [dataset2]
      Given two data files, counts the number of identical values in the same
      position in each dataset, and prints the result as a percentage for each
      column. The data files must have the same number and type of attributes
      as well as the same number of rows.
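
The comparison can be sketched in Python as follows. This is a minimal illustration of the description above, assuming each dataset has already been loaded as a list of rows; the function name percent_same is hypothetical, and the sketch checks only the shapes, not the attribute types.

```python
def percent_same(rows_a, rows_b):
    """Return, per column, the percentage of rows whose values match."""
    if len(rows_a) != len(rows_b):
        raise ValueError("datasets must have the same number of rows")
    column_count = len(rows_a[0])
    matches = [0] * column_count
    for row_a, row_b in zip(rows_a, rows_b):
        if len(row_a) != column_count or len(row_b) != column_count:
            raise ValueError("datasets must have the same number of attributes")
        for col in range(column_count):
            # Count values that are identical in the same row and column.
            if row_a[col] == row_b[col]:
                matches[col] += 1
    return [100.0 * count / len(rows_a) for count in matches]


a = [[1, "x"], [2, "y"], [3, "z"], [4, "w"]]
b = [[1, "x"], [2, "q"], [0, "q"], [4, "w"]]
print(percent_same(a, b))
```
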
   semanticmap [model-file] [dataset] <options>
      Write an SVG file representing a semantic map for the given
      self-organizing map processing the given dataset.  For each node n, a
      semantic map plots, at n's location in the map, one attribute (usually a
      class label) of the entry of the input data to which n responds most
      strongly.
      [model-file]
         The self-organizing map output from "waffles_transform som".
      [dataset]
         Data for the semantic map in .arff format.  Any attributes over the
         number needed for input to the self-organizing map are ignored in
         determining SOM node responses.
      <options>
         -out [filename]
            Write the SVG file to filename. The default is "semantic_map.svg".
         -labels [column]
            Use the attribute column for labeling.  Column is a zero-based
            index into the attributes.  The default is to use the last column.
         -variance
            Label the nodes with the variance of the label column values for
            their winning dataset entries. If the label column is a variable
            being predicted, then its variance is related to the predictive
            power of that node. Higher variance means lower predictive power.
   stats [dataset] <options>
      Prints some basic stats about the dataset to stdout.
      [dataset]
         The filename of a dataset.
      <options>
         -all
            Print stats for all attributes, even if there are a lot of them.
   calcerror [dataset] <options> <col1-col2>
      Prints an error metric between two columns.
      [dataset]
         The filename of a dataset.
      <options>
         -m [metric]
            The error metric to use.
            SSE
               Sum-squared error metric.
            MAPE
               Mean absolute percent error metric.
      <col1-col2>
         [col1] [col2]
            [col1]
               The zero-based index of the attribute to use as the actual
               value.
            [col2]
               The zero-based index of the attribute to use as the predicted
               value.
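
For reference, the two metrics can be written out in Python using their usual textbook definitions. Whether calcerror scales or normalizes its output in exactly this way is an assumption, and the function names sse and mape are hypothetical.

```python
def sse(actual, predicted):
    """Sum-squared error between two columns of equal length."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percent error; assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    # Scaling by 100 to express a percentage is an assumption here.
    return 100.0 * sum(ratios) / len(ratios)


actual = [2.0, 4.0, 8.0]      # values from [col1]
predicted = [1.0, 5.0, 8.0]   # values from [col2]
print(sse(actual, predicted))
print(mape(actual, predicted))
```
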
   usage
      Print usage information.
