Iterator Tool

The Iterator Tool enables you to:

Launch multiple simulations,
Sweep parameters over specified ranges and step sizes for recursions, or,
Vary parameters randomly over specified ranges for Monte Carlo simulation,
Change parameters without re-building/re-compiling.
Aggregate the results of multiple simulations for listing and/or plotting.
Plot histograms.
Optimize control variables.
Launch and coordinate parallel simulations.

Usage:

Normally the Iterator tool is used through its front-end, the ITERATOR_GUI. However, the following describes how to use the Iterator by itself, without the GUI, with plain files.

List your control attribute(s) in a file called: iteration.control in the following format:
```
		SWEEP:  parameter_name   initial_value  final_value  step_size
	
```
Or,
```
		RANDOMIZE:  parameter_name   min_range  max_range
	
```
(Words in upper case are keywords.)
Multiple attributes may be varied by listing them in the file.
The parameters must be listed one per line.
They will vary together, unless separated by a line containing the keyword:
```
		SEQUENTIAL
	
```
Sequential recursions will multiply the number of cases run.
If a Sweep is used, then the number of recursions can be inferred from the range(s). If Randomize is used with no sweep-ranges, then you must specify a recursion count. Do this with a line containing:
```
		RECURSIONS:	count
	
```
The recursion count is optional when you have a step-size, in which case, the recursions stop when the first limit is reached.
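As a rough illustration of how the number of cases follows from the control file, the sketch below counts runs for groups of sweeps. It assumes that both end points of a sweep are included, that parameters listed together stop at the shortest range (the "first limit"), and that blocks separated by SEQUENTIAL behave like nested loops. The function names are hypothetical and are not part of the Iterator.

```python
import math

def sweep_steps(initial, final, step):
    """Number of values visited from initial to final (both ends assumed included)."""
    # The small tolerance guards against floating-point round-off in the division.
    return int(math.floor((final - initial) / step + 1e-9)) + 1

def total_runs(groups):
    """groups: one list of (initial, final, step) tuples per SEQUENTIAL-separated block."""
    runs = 1
    for group in groups:
        # Parameters in the same block vary together; the shortest range ends the block.
        runs *= min(sweep_steps(*sweep) for sweep in group)
    return runs

print(sweep_steps(0.0, 750.0, 50.0))                           # 16
print(total_runs([[(0.0, 750.0, 50.0)], [(1.0, 3.0, 1.0)]]))   # 48
```

Under these assumptions, the sweep from Example 1 (0.0 to 750.0 in steps of 50.0) gives 16 runs, and adding a second, sequential three-step sweep would give 48.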
Result data from each recursion adds a row to a conceptual results table. Each row may have multiple columns. After all simulation runs, the Iterator aggregates and/or plots the column(s) of results data. You may aggregate result columns by specifying their column number with the keyword:
```
		AGGREGATE	parameter_name	   column_number
	
```
The column numbers are assumed to start at 1. The mean, min, max, and standard deviation are produced for each aggregation. You can request multiple aggregations.
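To make the aggregation concrete, here is a minimal sketch of the four statistics computed over one column of a results table. The function name is hypothetical, the sample rows are made up, and the use of the sample (n-1) standard deviation is an assumption; the Iterator may use the population form.

```python
import statistics

def aggregate_column(rows, column_number):
    """Summarize one column of a results table; column numbers start at 1."""
    values = [row[column_number - 1] for row in rows]
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        # Sample standard deviation is an assumption, not confirmed Iterator behavior.
        "stddev": statistics.stdev(values),
    }

# Made-up rows: each row is one simulation run, each entry one column.
rows = [(0.0, 1.0), (50.0, 2.0), (100.0, 3.0)]
summary = aggregate_column(rows, 2)
print(summary)
```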
You may plot result columns by specifying their column numbers with the keyword:
```
		PLOT (x_col, y_col, color),  opt_title,  opt_x_label,  opt_y_label
	
```
You can list multiple x-y column pairs and colors, (x,y,c), for each plot. Note: The commas are significant. Colors may be red, blue, green, yellow, violet, orange, white, cyan, pink, fuchsia, aqua, navy, gold, light-gray, dark-gray, etc., or numbers 1-140, as defined in XGRAPH. Graph titles, and axes titles are optional, but recommended. You can request multiple plots. The Iterator will instruct you how to view each plot. Additionally, many other variations can be made to the plots after or while viewing. For example, plots may be easily combined or modified later.
Use the control attribute(s) in your models as a global attribute, macro, or variable. See: Attributes.
Each simulation run should save its relevant data result(s) to a file called result.dat. You may need to modify your models to do this.
If you have multiple values, separate them by white-space. These become the columns of one row of the conceptual results table.
Invoke the Iterator on your control file:
```
		iterator  iteration.control
	
```
It will invoke your simulation iteratively, changing the control variables accordingly, while recording the results. After all runs have completed, it will aggregate and/or plot the results as you requested.

Example 1:

  SWEEP:  	velocity  0.0  750.0  50.0
  AGGREGATE:    2  Altitude
  PLOT:    	(1, 2, Red),  Flight Profile, Velocity (kph), Altitude (km)

Note-1: Your simulation model should be set-up to exit automatically at the end of each recursion. A simple way to do this based on a time, such as 500.0, is:

		DELAY( 500.0 );
		exit(0);

Note-2: You can embed comments within the Iterator script (file: iteration.control) using the squiggly brackets, ex. { comment }.

Note-3: You should build your simulation for textual command-line mode operation. No one will be there to press the Run button. To avoid needing to supply the run and quit commands from a script, you can invoke the simulator with the -batch command-line option. This causes the textual simulator to start running immediately and then exit when there are no further events.

Note-4: During the recursions, there is normally much output to the screen. It may be difficult to glean the pertinent items. Fortunately Iterator produces a log file of the input parameter summary and the aggregated results. See iteration.log after running the Iterator to see a condensed summary of the conditions and results. It also shows how to view the graphical results. See iteration.log example for an example output.

Note-5: A GUI interface to Iterator exists, called ITERATOR-GUI. It guides you through generating the iteration.control file by pushing buttons, enables invocation of the Iterator by button press, and allows viewing of results through menus and buttons. Start the Iterator-GUI by clicking Tools/Iterator or by typing igui. See ITERATOR-GUI.

Advanced Options:

Automatic Design Optimization:

You can have the Iterator attempt to optimize your design over a set of parameters subject to your own goal criteria and constraints.

Modify your simulation model(s) to evaluate each run and write a performance metric, or Measure Of Effectiveness value, into a file called fom.dat by the end of each simulation. The goal will be to maximize this value. (This value is called a Figure of Merit (FOM), or Performance Figure of Merit (PFOM).)
Modify your iteration.control file to replace SWEEP keyword(s) with
```
		OPTIMIZE:	parameter_name   initial_value  min_range  max_range
	
```
Do this for each parameter to be optimized.
You must set the RECURSIONS count.

The min_range and max_range are constraints on the value of the optimized variable. You can set optimize on several variables to achieve simultaneous multi-variate optimization.

The optimizer is a steepest descent optimizer. It will run your simulation many times to explore the solution space. The quicker your simulation runs, the faster the optimization will be. The optimizer contains several optimization algorithms, each best suited for different system characteristics. For example, some systems exhibit a smooth continuously differentiable performance surface and are completely deterministic. Others have noise and require many observations to observe a trend. Still others are non-differentiable and have disjoint performance functions. Remember that the optimizer knows nothing about your system, other than its response to control input via your PFOM; i.e. your system is a black box to the optimizer. The optimizer will characterize your system and switch to the best suited algorithm. The number of variables will also affect the convergence rate.

Designing a proper PFOM evaluation function is crucial. It should reflect your design goals, constraints, and relative trade-offs for your system. It may take some experimentation, varying your PFOM function based on feedback from the results of the optimizer. One hint is that if you have linear performance trade-off surfaces, it is usually useful to use sum-of-squares PFOM functions (i.e. least-squares optimization). This avoids an infinite number of equivalent, and often trivial, solutions by reducing the importance of a criterion as it gets close to satisfaction, and shifts attention to other criteria that may be further from their desired values.
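One possible shape for a sum-of-squares PFOM is sketched below. Because the Iterator maximizes the value, the weighted squared errors are negated so that a design closer to its targets scores higher. The criterion names, targets, and weights are hypothetical, and writing a single number on one line of fom.dat is an assumption about the expected file layout.

```python
def pfom(measured, targets, weights=None):
    """Negated weighted sum of squared errors, so that larger is better."""
    if weights is None:
        weights = {name: 1.0 for name in targets}
    penalty = sum(
        weights[name] * (measured[name] - targets[name]) ** 2
        for name in targets
    )
    return -penalty

def write_fom(value, path="fom.dat"):
    """Write the figure of merit as a single value (assumed file layout)."""
    with open(path, "w") as handle:
        handle.write(f"{value}\n")

# Hypothetical criteria for illustration only.
targets = {"latency": 10.0, "throughput": 100.0}
measured = {"latency": 12.0, "throughput": 90.0}
print(pfom(measured, targets))  # -104.0
```

A criterion that is nearly satisfied contributes very little to the squared penalty, so the remaining penalty is dominated by the criteria that are still far from their targets.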

In Optimize mode, the Iterator also produces an optimization log file called optimization.log. You can plot the data in it to see the history of the optimizer's convergence. The first column is the recursion-number. The second column is the PFOM value that resulted at that recursion. The remaining column(s) is (are) the value(s) of the control variable(s) at each recursion. For example, you can plot the PFOM versus recursion-number by: xgraph -pl optimization.log.
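Besides plotting, the log can be inspected with a short script. The sketch below assumes whitespace-separated numeric columns in the order described above (recursion number, PFOM, control values) and skips any line that is not purely numeric. The sample text is made up for illustration and is not real Iterator output.

```python
def parse_optimization_log(text):
    """Return (recursion_number, pfom, control_values) for each numeric line."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            numbers = [float(field) for field in fields]
        except ValueError:
            continue  # skip lines that contain non-numeric text
        records.append((int(numbers[0]), numbers[1], numbers[2:]))
    return records

def best_record(records):
    """The recursion with the largest PFOM, since the optimizer maximizes it."""
    return max(records, key=lambda record: record[1])

# Made-up sample: recursion number, PFOM, one control variable.
sample = "1 -4.0 100.0\n2 -1.0 150.0\n3 -2.5 175.0\n"
records = parse_optimization_log(sample)
print(best_record(records))  # (2, -1.0, [150.0])
```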

If you run the optimizer in high verbosity mode (-v on the command-line), you will get much more detailed comments about the optimization, though the extra text renders the log file unplottable.

Reviewing Previously Captured Data:

When the Iterator runs, it produces a results table called results_table.dat. You can direct the Iterator to re-aggregate or plot a previously captured results_table.dat file by including only PLOT and AGGREGATE commands in your iteration.control file.

Plotting Histograms:

To plot histograms, use the HIST keyword in place of the PLOT keyword. A histogram is a frequency versus quantity graph.

The relevant parameters are:

results table column
histogram range (min,max)
number of histogram bins over the range.

The format is: HIST column min max nbins, opt_title, opt_x_label, opt_y_label
Example:

	HIST  2,  0.0,  250.0,  50,  Histogram

Note that you will get better averaging with fewer bins, because there will be higher counts within each bin. However, the histogram resolution will drop. Generally the number of histogram bins should be much less than the number of cases run, but it depends on the data and what you are looking to see.
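The trade-off between bin count and counts per bin can be explored with a small binning sketch. It uses equal-width bins over (min, max), and it assumes that values outside the range are dropped and that a value equal to max falls in the last bin; the Iterator's own handling of these edge cases may differ.

```python
def histogram(values, lo, hi, nbins):
    """Count values into nbins equal-width bins spanning lo..hi."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for value in values:
        if value < lo or value > hi:
            continue  # out-of-range values are dropped (assumption)
        # A value exactly at hi would index one past the end, so clamp it.
        index = min(int((value - lo) / width), nbins - 1)
        counts[index] += 1
    return counts

# Made-up data, binned over 0.0..250.0 with 5 bins of width 50.
print(histogram([0, 10, 60, 249, 250, 300], 0.0, 250.0, 5))  # [2, 1, 0, 0, 2]
```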

Random Seed:

By default, for repeatability of random parameter sequences, Iterator will generate the same sequence each time it is invoked. In other words, it initializes its random number seed to the same default value. However, you can alter the random seed to generate new sequences by setting a different seed value. Use the SEED keyword, as in:

SEED: seed_value

The seed_value must be an integer. (Relatively prime integers tend to generate better random sequences.)
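The effect of a fixed seed can be illustrated with Python's own generator. This only demonstrates the general principle; it does not reproduce the Iterator's random number generator, and drawing uniformly between min_range and max_range is an assumption.

```python
import random

def randomized_values(seed, min_range, max_range, count):
    """Draw count values between min_range and max_range from a seeded generator."""
    rng = random.Random(seed)  # a private generator, so the seed fully fixes the sequence
    return [rng.uniform(min_range, max_range) for _ in range(count)]

first = randomized_values(7, 0.0, 750.0, 5)
repeat = randomized_values(7, 0.0, 750.0, 5)
other = randomized_values(11, 0.0, 750.0, 5)
print(first == repeat)  # True: same seed, same sequence
print(first == other)   # False: a different seed gives a new sequence
```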

Parameter Files for Explicit Parameter Settings:

If you need to evaluate your models at specific parameter settings which are not conveniently described by sweep-loops, you can specify the explicit values in a parameter file and use the PARAMETER_FILE: command in your iteration.control file. The parameter file should have attribute name-value pairs with the Run keyword separating each run.
For example, in your iteration.control file, have the line:
Parameter_File: testruns.params
(Instead of Sweep, Randomize, Optimize, etc.)

Then in file testruns.params, have something like:
Velocity = 75.2
Altitude = 10,000
Run
Velocity = 62.9
Altitude = 8,753
Run
Velocity = 109.4
Altitude = 30,801
Run

You can set as many parameters as you wish for each run. The iterator will perform the runs with your specified values.
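To check a parameter file before handing it to the Iterator, a small reader such as the following can list the runs it describes. It is a sketch that assumes one name = value pair per line and a line containing only Run to close each run. Values are kept as strings, so entries such as 10,000 are passed through untouched. The function name is hypothetical.

```python
def parse_parameter_file(text):
    """Split a parameter file into one {name: value} dictionary per run."""
    runs, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.lower() == "run":
            # The Run keyword closes the settings gathered so far.
            runs.append(current)
            current = {}
        elif "=" in line:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return runs

sample = """Velocity = 75.2
Altitude = 10,000
Run
Velocity = 62.9
Altitude = 8,753
Run
"""
runs = parse_parameter_file(sample)
print(len(runs), runs[0])
```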

Alternate Execution Commands:

By default, the assumed execution command is:

	sim.exe << || \n run \n quit \n ||

(The << || \n run \n quit \n || directs the run and quit commands to start and exit the simulation, as input to the simulation prompt. The \n specify carriage returns.)

However, an alternate method is to invoke your simulation with the -batch option as mentioned in Note-3 above. The execution command then becomes:

   sim.exe -batch

Also, you may add command-line options, or specify other commands, by the RUN: keyword.
Example:

    RUN:  sim.exe  app.dfg  netinfo << || \n run \n quit \n ||

If you want to have additional commands run prior and/or after simulation, a clever idea is to wrap the simulation invocation into a script file that also calls the other files. Then have Iterator call your script file via the RUN: keyword. This option actually makes the Iterator general purpose. You can use Iterator to control any program, tool, or utility; not just CSIM simulations!

To use on non-CSIM programs, you should understand a little about the inner workings of Iterator. Iterator works by setting the parameter values in a file called top_tab.dat. Specifically, it looks for, and modifies lines of the following format:

	number <parameter_name> <value>

Example:

	1 <velocity> <150.0>

(The first number is not important to Iterator, but can be used to uniquely enumerate or identify your variables.)

If you structure your program to get its variable values from this format, then you can use Iterator on any program.
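For a non-CSIM program, the reading side might look like the sketch below. It assumes the angle brackets in the example line are literal characters in top_tab.dat, and it formats a result row as whitespace-separated values, as result.dat expects. The function names and the sample values are hypothetical.

```python
import re

# Matches lines such as: 1 <velocity> <150.0>
PARAM_LINE = re.compile(r"^\s*(\d+)\s+<([^>]+)>\s+<([^>]*)>\s*$")

def read_parameters(text):
    """Return {parameter_name: value_string} for every matching line."""
    params = {}
    for line in text.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            params[match.group(2)] = match.group(3)
    return params

def format_result_row(values):
    """One row of the results table: values separated by white-space."""
    return " ".join(str(value) for value in values) + "\n"

params = read_parameters("1 <velocity> <150.0>\n2 <mass> <3>\n")
velocity = float(params["velocity"])
print(params)
print(format_result_row([velocity, 11.25]), end="")
```

In a real program, the text would come from reading top_tab.dat and the formatted row would be written to result.dat before the program exits.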

Parallel Simulations:

An advanced option of the Iterator is the ability to launch parallel simulations on multiple computers and to coordinate the collection of their results. For example, if you have five computers on a shared file system, and you need to run a simulation 100 times while sweeping or randomizing parameters, then Iterator can automatically launch parallel simulations on the five computers such that, each does a unique iteration, results are collected and sorted appropriately, and all computers stay busy until no more iterations are left to run. In this example, each computer would run approximately 20 simulations, and would complete in about 1/5th the time required to run all the iterations on a single computer.

Parallel mode is activated by supplying the Hosts: option with a list of available computers to use. Iterator then launches simulations on the computers, as they become available, as follows:

1. Iterator starts a simulation script on a remote machine via "rsh". The simulation script copies the needed model/data files to a unique temporary directory (on the remote machine's local /tmp disk), cd's to the temporary directory, and invokes the simulation there. The script is set up to copy the result.dat file back to a unique file-name on the shared file-system on completion.
2. Iterator continues starting remote simulations until either (A) the host list is exhausted, at which time it begins polling for the completion files, or (B) no more iterations are needed.
3. When Iterator sees a completed result.dat file, it is able to launch a new job on that host (repeat from step 1 above).
4. And so on, until all iterations are completed.
5. On completion of all iterations, the Iterator combines all result.dat files in order, and processes them as normal.
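The dispatch policy described above (give the next iteration to whichever host frees up first) can be simulated in a few lines. This is a toy model under stated assumptions: run times are known in advance and launch overhead is ignored. It is not the Iterator's implementation, and the host names and durations are made up.

```python
import heapq

def simulate_dispatch(hosts, durations):
    """Assign each iteration to whichever host becomes free first."""
    # Heap entries are (time the host becomes free, tie-break order, host name).
    free_at = [(0.0, order, host) for order, host in enumerate(hosts)]
    heapq.heapify(free_at)
    assignment = {host: [] for host in hosts}
    finish = 0.0
    for iteration, duration in enumerate(durations, start=1):
        start, order, host = heapq.heappop(free_at)
        assignment[host].append(iteration)
        done = start + duration
        finish = max(finish, done)
        heapq.heappush(free_at, (done, order, host))
    return assignment, finish

hosts = ["pc1", "pc2", "pc3", "pc4", "pc5"]
assignment, finish = simulate_dispatch(hosts, [1.0] * 100)
print({host: len(jobs) for host, jobs in assignment.items()})
print(finish)  # 20.0
```

With 100 equal-length iterations on five hosts, each host ends up with 20 iterations and the whole batch finishes in one fifth of the serial time, matching the example above. Unequal run times would shift more iterations to the hosts that finish sooner.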

Parallel execution requires that all needed simulation files be tar'd into a tar-file whose name is supplied with the SimPackage: option. General-purpose simulations do not need a sim-package if the simulation does not require any local files and the run-command is specified with full absolute path. Otherwise, the SimPackage tar-file can be created for example for core-models by:
tar cf simpackage.tar sim.exe *.prog netinfo netinfo.rte
This provides you an opportunity to include whatever local files your simulation needs, such as other data, scripts, program files, or subdirectories.

You can also supply a build command with the Build: option. A typical build command is: csim -nongraphical mymodel.sim
This tells the Iterator the name of the simulation model to use, and provides the opportunity to supply additional, or arbitrary, build options or Make-commands. Other commands could include running the DFG Scheduler or Router, but normally their output files would be copied within the tar-file. (Remember to supply either the full absolute paths to the source files, or pack all source files in the SimPackage, because the build will occur in a temporary directory.)

The Iterator will generate launch scripts of the form:

mkdir /tmp/runzzzzxxxx
cd /tmp/runzzzzxxxx
tar xf {path}/simpackage.tar
run_command
mv -f result.dat {path}/resultxxxx.dat
cd /tmp
rm -r -f /tmp/runzzzzxxxx

... Where {path} is the absolute path on the shared file-system where the Iterator is being run from, and xxxx is the iteration number. The zzzz is a unique alpha-numeric string to further ensure unique temporary directory names. Iterator generates the {path}, zzzz, and xxxx strings automatically. Note that Iterator is smart about the SimPackage names, in that it knows to not prepend the pwd if an absolute path was given, and it knows to first gunzip in the case of .tgz or .tar.gz suffix. Note also that Iterator cleans up after itself, by removing the temporary files from the remote system after sending results. Note that "rsh" capability must be available on the host systems to enable remote job invocations. If not, see note below on using ssh. The launch scripts will be launched by: rsh {host} {path}launchxxx.com &.
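For readers who want to see how the placeholders fit together, the following sketch fills in the launch-script template shown above. It is illustrative only: the Iterator generates these scripts itself, and the function name, the four-digit zero-padded iteration number, and the sample path and unique string are assumptions.

```python
def launch_script(path, iteration, unique, run_command, package="simpackage.tar"):
    """Fill in the launch-script template for one iteration."""
    tag = f"{iteration:04d}"            # xxxx: iteration number (zero-padding assumed)
    tmp_dir = f"/tmp/run{unique}{tag}"  # zzzz: unique string, then xxxx
    lines = [
        f"mkdir {tmp_dir}",
        f"cd {tmp_dir}",
        f"tar xf {path}/{package}",
        run_command,
        f"mv -f result.dat {path}/result{tag}.dat",
        "cd /tmp",
        f"rm -r -f {tmp_dir}",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical path and unique string, for illustration only.
script = launch_script("/home/user/study", 7, "ab12", "sim.exe -batch > /dev/null")
print(script, end="")
```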

For many cases where parallel simulation is useful, the simulation build-time is much less than the individual run-times, so it is sometimes safer or simpler to copy and re-build the simulation on each machine. However, re-building, or even copying the model/data, are not necessary if appropriate options are used for the run commands. Specifically, the run command must reference the absolute path of the pre-compiled simulation, unless you are doing alternative commands, such as tar'ing a local executable or sourcing a local script.

It is important that your run-command not end with an ampersand (&). Otherwise, the iteration coordination and clean-up would begin before the simulation finishes. You may want to redirect standard output to /dev/null or a log file to avoid the confusing mixture of output that could come back to your launching window.

Below is an example iteration.control file for parallel executions.

Example:

	SWEEP:        velocity  0.0  750.0  50.0
	AGGREGATE:    2  Altitude
	PLOT:         (1, 2, Red),  Flight Profile, Velocity (kph), Altitude (km)
	SimPackage:  simpackage.tar.gz
	HOSTS:	     pc1, pc2, pc3, pc4, pc5
	RUN:	     sim.exe -batch > /dev/null

While the iterations proceed, the Iterator displays the overall progress.

Note that optimization mode is not supported for parallel simulation, due to the inherently sequential evaluation/optimization process.

Note that RSH has been deprecated on many systems due to security issues and may be turned off at your site. Instead, use SSH by adding USE_SSH to your iteration.control file. SSH also requires setting up authentication keys to avoid interactive prompting for passwords. Use the following process to generate and install the private and public parts of your authentication keys:

Run:
ssh-keygen -t rsa
This creates:
$HOME/.ssh/id_rsa - Your private key(s)
$HOME/.ssh/id_rsa.pub - Your public key(s)
Copy $HOME/.ssh/id_rsa.pub to $HOME/.ssh/authorized_keys
cp $HOME/.ssh/id_rsa.pub $HOME/.ssh/authorized_keys
(Or move onto remote machine if not on shared file system.)

Command-Line Options:

The following command-line options can be used with Iterator. From the ITERATOR_GUI, these can be set on the Run Controls menu before invoking the Iterator.

-E - This option is intended for use with performance models from the Core-Models library which produce EventHist.dat files. This option saves the EventHist.dat file from each iteration by re-naming them to EventHist_X.dat, where X is the iteration number.
-S - This option is intended for use with performance models from the Core-Models library which produce *.dat statistics files. This option saves the *.dat files from each iteration by re-naming them to *_X.dat, where X is the iteration number. The following files are saved: EventHist_X.dat, IQtrace_X.dat, LatencyTrace_X.dat, LinkTline_X.dat, Mtrace_X.dat, OQtrace_X.dat, PEQtrace_X.dat, ProcTline_X.dat, Spider_X.data, summaries_X.dat and top_tab_X.dat.
-icount - This option adds the iteration count to the first column of the "results.dat" file. It is useful for plotting results versus the iteration.