Data formats

One of your first tasks may be to get your data into a format that our tools can operate on.

Converting to ARFF format

Our preferred format for data is ARFF. This format is essentially a text file of comma-separated values, with a small amount of extra meta-data that gives meaning to the data. Here is a simple example of some data in ARFF format:

    @RELATION mydata
    @ATTRIBUTE age continuous
    @ATTRIBUTE gender {male,female}
    @ATTRIBUTE hair {blonde,red,brown,black,none}
    @ATTRIBUTE weight continuous
    @DATA
    18, male, red, 152
    84, male, none, 138
    42, female, blonde, 168
    48, male, black, 341
    5, female, brown, 49
    24, female, red, 140

If your data is not in ARFF format, do not despair. We can work with other formats too. The following command converts a simple text file of comma-separated values to ARFF format, automatically determining the meta-data:

    waffles_transform import mydata.csv > mydata.arff

If your data is separated by tabs instead of commas, we can handle that too:

    waffles_transform import mydata.csv -tabs > mydata.arff

or whitespace:

    waffles_transform import mydata.csv -whitespace > mydata.arff

or semicolons:

    waffles_transform import mydata.csv -semicolon > mydata.arff

and so on.

Octave (or Matlab) example

Suppose you are familiar with Octave (or Matlab), but you want to use Waffles to do something with your data. Here's how you could do it. First, export your data, y, from Octave:

    save -ascii y.txt y

Next, convert it to ARFF format:

    waffles_transform import y.txt -whitespace > y.arff

Now, use Waffles to do something with your data. There are many things you could do. Here is an arbitrary example:

    waffles_dimred breadthfirstunfolding y.arff 18 kdtree 2 -reps 20 > x.arff

Then, convert the results back to a tab-separated text file that Octave can load:

    waffles_transform export x.arff -tab > x.txt

Finally, go back into Octave and load your data:

    load x.txt

Manipulating data

Maybe you'll need to tweak your dataset a little bit.
We provide tools to drop columns, swap columns, fill in missing values, sort by a particular column, shuffle rows, and perform numerous other useful transformations. Here are a few examples. Hopefully, each command is clear enough to describe what it does.

    waffles_transform dropcolumns diabetes.arff 0,2-5,7
    waffles_transform swapcolumns mydata.arff 0 3
    waffles_transform fillmissingvalues mydata.arff
    waffles_transform sortcolumn mydata.arff 2
    waffles_transform shuffle mydata.arff

There are many other possible transformations that you can apply to your data. For a complete list, take a look at the usage information for the waffles_transform tool.
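
Several transformations can be applied one after another by saving each result to a file and feeding it to the next command. The following is a minimal sketch, not a definitive recipe. It assumes that each waffles_transform command writes the transformed dataset to standard output, as the import and export examples on this page do. The function name prepare_dataset and the intermediate file names are hypothetical and are not part of Waffles.

```shell
# Hypothetical helper (not part of Waffles): drop some columns, fill in
# missing values, then shuffle the rows, writing the final result to a file.
# Assumption: every waffles_transform command prints its result to stdout.
prepare_dataset() {
    input="$1"     # ARFF file to start from
    output="$2"    # where the final ARFF file is written
    tmpdir=$(mktemp -d) || return 1

    # Each step reads the file produced by the previous step.
    # The && chain stops at the first command that fails.
    waffles_transform dropcolumns "$input" 0,2-5,7 > "$tmpdir/dropped.arff" &&
    waffles_transform fillmissingvalues "$tmpdir/dropped.arff" > "$tmpdir/filled.arff" &&
    waffles_transform shuffle "$tmpdir/filled.arff" > "$output"
    status=$?

    # Remove the intermediate files whether or not the chain succeeded.
    rm -rf "$tmpdir"
    return $status
}
```

With this function defined, a call such as "prepare_dataset diabetes.arff prepared.arff" would leave the original file untouched and write the transformed data to prepared.arff. Keeping each intermediate result in its own file makes it easy to inspect a step if the final output looks wrong.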