Friday, August 23, 2013

Advanced Topic: Fully Automated Linear Regression


In this Advanced Topic post, I discuss how you can both create data and run a statistical analysis all from within a single Myrtle script.  Teachers and course instructors may wish to do something like this when coming up with a class example for their lectures or even for generating problems and answer keys for their quizzes or exams.

Note: Requires Myrtle Version >= 1.8.13

Our goal will be to create a synthetic data set containing two variables that are linearly related according to the equation Y = 1.3 + 2*X, but also contaminated by random measurement error. We will want to not only generate the data, but also run a statistical analysis of the data.

Create a new blank procedure by clicking the new procedure button ("Create a new procedure.") on Myrtle's procedure toolbar -- it looks like a blank sheet of paper.


Next, right-click on the new procedure (Untitled) you just created and select Rename...  Rename the procedure to something more informative than Untitled like "AutoRegression" as shown in the above image.

Next, edit the procedure by double clicking on it as shown below.


Let's begin writing our script.  You will need to copy and paste (e.g. Ctrl+c and Ctrl+v) or simply type directly into the script editor the script lines shown below.  First, we need to let the compiler know about some of the packages we will be using with a few import statements.

import com.mockturtlesolutions.snifflib.datatypes.DblMatrix;
import com.mockturtlesolutions.snifflib.stats.NormalDistribution;



Then, we create some linear data in order to mimic real data.  We'll assume for now that our data set has N = 10 observations.  The underlying linear relationship is Y = 1.3 + 2*X.  But, in order to add some realism to this "real" data, we will also perturb the Y-values with deviates from a normal distribution. We utilize DblMatrix class methods plus and times.

normdist = new NormalDistribution();
X = DblMatrix.span(0,10,10);
Y = X.times(2).plus(1.3);
deviates = normdist.random(X.getN());
Y = Y.plus(deviates);

Next, we will paste these "real" data into the current spreadsheet.

ParentPanel.pasteDblMatrixAt(X,0,0);
ParentPanel.pasteDblMatrixAt(Y,0,1);

When this script runs, the Myrtle function pasteDblMatrixAt() pastes the X data into the first column starting at the first row (Java indices start at 0), and the Y data into the second column.  Then, we assign some bookmarks to those spreadsheet data ranges.

ParentPanel.addBookmark("Xdata","Sheet1!A1:A10",true);
ParentPanel.addBookmark("Ydata","Sheet1!B1:B10",true);

Lastly, we load and run Myrtle's standard linear regression script on these data.

String proc = "com.mockturtlesolutions.LinearRegression";
Script script = ParentPanel.loadArchivedProcedure(proc);
 
Binding bind = script.getBinding();
bind.setVariable("XDATADefault","#Xdata");
bind.setVariable("YDATADefault","#Ydata");
 
script.run();

Be sure to save your edits to your AutoRegression script (Save or Ctrl+s).  Your session should now look like the following:


Finally, click on the "Run & update selected procedures" button (it has a green arrow on it).  Running the script will now produce a detailed regression analysis.  Notice that the estimated slope and intercept are close, but not identical, to the "true" values in the underlying linear relationship.

Instructors may wish to experiment with different values for the sample size (N) and the magnitude of the random deviates to see how they affect the accuracy of the resulting parameter estimates.
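If you would like to try that kind of experiment before touching your Myrtle session, here is a minimal standalone Java sketch.  It is not a Myrtle script and does not use the snifflib classes.  The class name and the helpers evenlySpaced, simulateY, and fit are hypothetical names I made up for this illustration, and the fit is a plain textbook least-squares calculation rather than Myrtle's own regression procedure.

```java
import java.util.Random;

// Standalone sketch (assumption: plain Java, no Myrtle or snifflib classes).
class SyntheticRegressionSketch {

    // Hypothetical helper: n evenly spaced values from lo to hi inclusive (n >= 2).
    static double[] evenlySpaced(double lo, double hi, int n) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = lo + (hi - lo) * i / (n - 1);
        }
        return x;
    }

    // Y = 1.3 + 2*X plus normal deviates with standard deviation sigma.
    static double[] simulateY(double[] x, double sigma, Random rng) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 1.3 + 2.0 * x[i] + sigma * rng.nextGaussian();
        }
        return y;
    }

    // Ordinary least squares with one predictor; returns {intercept, slope}.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] { meanY - slope * meanX, slope };
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so repeated runs match
        int[] sizes = { 10, 100, 1000 };
        double[] sigmas = { 0.5, 1.0, 5.0 };
        for (int n : sizes) {
            for (double sigma : sigmas) {
                double[] x = evenlySpaced(0, 10, n);
                double[] est = fit(x, simulateY(x, sigma, rng));
                System.out.printf("N=%d sigma=%.1f intercept=%.3f slope=%.3f%n",
                        n, sigma, est[0], est[1]);
            }
        }
    }
}
```

Each printed line shows one simulated data set.  Comparing the rows should give a feel for how far the estimates wander from 1.3 and 2 as N shrinks or the deviates grow.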


That's it!  Before you leave, however, you should consider archiving your AutoRegression script.  Why?  Well, if you think you might ever want to tweak or fine-tune this script or use it in the future (e.g. for generating exam or quiz problems) you should archive it.  To do this, right-click on the script's icon and select the Archive... option.  Edit the fields as you see fit and then finally click the upload button (cloud icon) as shown below.





For your convenience, the complete AutoRegression script described above is reproduced below.


import com.mockturtlesolutions.snifflib.datatypes.DblMatrix;
import com.mockturtlesolutions.snifflib.stats.NormalDistribution;


////////////////////////////////////////////////////////////////
// First, we create some synthetic linear data...
////////////////////////////////////////////////////////////////


normdist = new NormalDistribution();
X = DblMatrix.span(0,10,10);
Y = X.times(2).plus(1.3);

deviates = normdist.random(X.getN());

Y = Y.plus(deviates);

////////////////////////////////////////////////////////////////
// Next, paste the data into the current spreadsheet.
////////////////////////////////////////////////////////////////

ParentPanel.pasteDblMatrixAt(X,0,0);
ParentPanel.pasteDblMatrixAt(Y,0,1);

////////////////////////////////////////////////////////////////
// Then, assign some bookmarks to the data ranges just created.
////////////////////////////////////////////////////////////////

ParentPanel.addBookmark("Xdata","Sheet1!A1:A10",true);
ParentPanel.addBookmark("Ydata","Sheet1!B1:B10",true);


////////////////////////////////////////////////////////////////
// Lastly, run Myrtle's standard linear regression script on
// these data.
////////////////////////////////////////////////////////////////

String proc = "com.mockturtlesolutions.LinearRegression";
Script script = ParentPanel.loadArchivedProcedure(proc);
Binding bind = script.getBinding();
bind.setVariable("XDATADefault","#Xdata");
bind.setVariable("YDATADefault","#Ydata");

script.run();

Secrets to Clicking Revealed...

A quick post today to expose something so essential about Myrtle you probably encounter it every time you use Myrtle. You may not have even noticed that it's happening. Specifically, I want to discuss how clicking works in Myrtle.

Myrtle's user interface was designed to be both simple and portable. That is, it should be simple to use and usable on a variety of different devices including mobile devices and tablets. To make browsing results and editing sheets and scripts easy on those kinds of devices that may not have a hand-held mouse, Myrtle uses 2-clicks. 

Unlike the standard double-click you may be used to, Myrtle's 2-click requires 2 separate clicks, where the first click makes the item active. You will notice the active state of the item by it being highlighted. If the item is clicked again within a certain amount of time, it will be opened for editing or viewing, as appropriate.

Controlling the 2-Click Sensitivity

 

If you go to Help->Preferences from Myrtle's main window you will see a preference called activecelltime. This activecelltime is the length of time in milliseconds (e.g. 1500 = 1.5 seconds) that a first click remains active before a second click will result in editing mode.

If, for example, you notice your spreadsheet cells are getting entered into edit mode more often than you'd want because of stray/unintended clicks, try increasing activecelltime. Conversely, if the action is too slow for you, try decreasing activecelltime. Realize that the same user on different devices may find different "sweet spots" for the value of activecelltime.

Hint: If you frequently use a number of different devices, you might consider creating a different preference set for each device you use.  Then in each of those preference sets, you can set the optimal activecelltime for that device.  Preferences can also be used to set the colors, font sizes, etc. for your different devices.

Hold On, What About Standard Double-Click?

 

Don't despair.  If you are on a device that has a hand-held mouse and your system recognises standard double-click events you are in luck.  In Myrtle, a standard double-click is equivalent to a 2-click and will allow you to browse or edit just as easily.  Also, such systems typically have accessibility preferences that will allow you to control the sensitivity of the double-click in a manner similar to Myrtle's activecelltime.



Friday, August 16, 2013

Launching Myrtle's .jar File On Linux/Unix Systems

Windows users are lucky since the Myrtle .exe file will launch Myrtle from a desktop or a flash drive with no problems.  However, if you happen to be running a Linux/Unix-based OS like Ubuntu, RedHat, Fedora, CentOS, etc. you may have had some difficulty.  You may already know that you can launch the Myrtle spreadsheet executable jar file from a terminal (e.g. xterm) or shell using a command like:

java -jar Myrtle-1.8.11.jar

However, you may have encountered difficulty running Myrtle from within a graphical file browser like Nautilus.  Launching from the file browser is handy if, for example, you want to quickly run Myrtle off of a USB flash drive and don't want to bother opening an xterm.  Unfortunately, the default Nautilus behavior is to treat a .jar file as a file archive, so it will open the file and start browsing its contents for you...  Often not what you want to see.

For those on Linux/Unix running a Gnome-based desktop, try the following trick.  Save the following in a text file named execjar.desktop and copy that file to your ~/.local/share/applications folder, where '~' represents your home directory path.

[Desktop Entry]
Encoding=UTF-8
Type=Application
Exec=java -jar %f
#You may need to modify icon tag
#to suit your system.
Icon=java-1.7.0
Name=execjar
Comment=Run an executable jar file
From Nautilus you will now be able to "right-click" on the jar file (e.g. Myrtle-1.8.11.jar) that you want to run and then select Open With-->Other Application...  Select execjar from the list and Myrtle will launch. That's it!

Using the above method you are in fact able to launch any executable jar file when needed, but still be able to browse .jar file contents according to the default behavior of Nautilus when wanted.


Monday, August 12, 2013

Merging Bookmarks

In this post we'll see how to create, use, and merge bookmark references in the Myrtle spreadsheet.  You may notice that I am using Myrtle's Crumbach theme, so don't be worried if your spreadsheet looks different from mine.



Begin by dragging your mouse over (selecting) the first 10 rows in column A. Next, right-click and select "Fill" (or use Ctrl+f) to bring up the "Fill Dialog."  By default, Myrtle's fill will start at 1.0 and increment by 1 up to the number of selected cells.
 

Go ahead and click "Ok".  Notice that Myrtle has filled the range A1:A10 with an ascending sequence starting at 1 and ending at 10. Next, select the first 10 rows of column B and then bring up the "Fill Dialog" again. This time, just to be different, we will fill with a descending sequence of numbers.  Enter "10" for the "Start" and enter "1" for the "Stop."




Next, let's bookmark the ranges we have just created so we can refer to them by hashtag reference in a cell formula later on.  Select the range A1:A10 and bring up the "Bookmark Dialog" (or use Ctrl+r).  Give this range a name like "ValuesA" (or whatever other name you choose).




We'll do the same for the values in B1:B10.  I'll call mine "ValuesB", but you can call yours whatever you'd prefer.  Once you have done that, open the bookmark manager.  Click on the book icon (or use Ctrl+Shift+r).  You should see something like the following.




Let's get to merging!  Select the bookmarks you want to merge with your mouse.  Then right-click (command+click on Macs) to bring up the pop-up menu.  Select "Merge".



Give the new merged range a nice name like "MergedData" and click the "Ok" button.






You should now see in your bookmark manager something like the following.



Notice that the merged range is actually a comma-separated list of cell ranges. Comma-separated lists of cell ranges can be long and a pain to type into a cell formula.  Instead, let's use a hashtag reference to the merged bookmark.  For this simple example let's suppose we want the sum of all the values in the two ranges.  Using our bookmark merge, we simply use the formula 


=sum(#MergedData)




You might remember from one of your math classes that the sum of the integers from 1 up to N is N*(N+1)/2.  This is commonly called Gauss' Rule.  For our example N=10, and the merged range contains that sequence twice (once in column A and once in column B), so we should be able to check the result by 2*(10*(10+1)/2)=10*11=110.  This is indeed what Myrtle reports.
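If you would rather let a computer do the checking, the same arithmetic can be reproduced in a few lines of standalone Java.  This is only a sketch of the calculation, not a Myrtle script, and the class and method names are hypothetical.

```java
// Standalone arithmetic check (assumption: plain Java, not a Myrtle script).
class GaussSumCheck {

    // Sum of the integers 1..n using the closed form n*(n+1)/2.
    static int closedForm(int n) {
        return n * (n + 1) / 2;
    }

    // The same sum computed by adding the integers one at a time.
    static int bruteForce(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10;
        // Columns A and B hold the same ten values, just in opposite order.
        int merged = bruteForce(n) + bruteForce(n);
        System.out.println(merged);            // prints 110
        System.out.println(2 * closedForm(n)); // prints 110
    }
}
```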


You will find that many of the functions in Myrtle can operate on merged references. These include functions like:

abs, acos, asin, atan, ceil, cos, exp, expm1, floor, getExponent, max, mean, median, min, sum, rint, round

Shifting and Moving Selections

One of the convenient features of Myrtle is the ability to move data selections where you want them.  In this post, we introduce the shift operations and their keyboard shortcuts.

First, let's create some data to play with.  Select the 3x3 region of cells starting at A1.


Next, fill the region.  To do this, bring up the table menu by right-clicking on the table (command+click on Macs) and then selecting "Fill", or simply use the keyboard shortcut (Ctrl+f).


Select "Ok" and you should get the following...


Now let's see how easy it is to move this block of data around in Myrtle's spreadsheet.  To move the block one column to the right, select "Shift Right" from the table menu or use the keyboard shortcut Ctrl+0.  On most keyboards the "0" key is also the right parenthesis key, which is a handy way to remember it as the "Shift Right" operation.



Now let's shift the same block of data down.  Select "Shift Down" from the table menu or use the keyboard shortcut Ctrl+k.  You should now see your block of data has moved down one row.  


You may have noticed how Myrtle's shifting keyboard shortcuts are arranged like the cardinal directions found on a compass.


  N          i
W + E      9 + 0
  S          k

For example, to shift the block of data to the left you may either select "Shift Left" from the table menu or simply use the keyboard shortcut Ctrl+9.


Finally, you can bring this block of data all the way back to its home by shifting it up (Ctrl+i).
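To see why those four shifts bring the block back home, here is a small standalone Java sketch that tracks only the top-left cell of the block.  It is an illustration of the bookkeeping, not Myrtle's implementation: the class and method names are hypothetical, and I am assuming each shift moves the block by exactly one row or one column.

```java
// Standalone sketch (assumption: each shift moves the block one row or column).
class ShiftSketch {

    // Returns {rowDelta, colDelta} for a shortcut key from the compass layout.
    static int[] delta(char key) {
        switch (key) {
            case 'i': return new int[] { -1, 0 }; // Shift Up
            case 'k': return new int[] { 1, 0 };  // Shift Down
            case '9': return new int[] { 0, -1 }; // Shift Left
            case '0': return new int[] { 0, 1 };  // Shift Right
            default:
                throw new IllegalArgumentException("Not a shift key: " + key);
        }
    }

    // Applies a sequence of shortcut keys to the block's top-left {row, col}.
    static int[] apply(int[] topLeft, String keys) {
        int row = topLeft[0];
        int col = topLeft[1];
        for (char key : keys.toCharArray()) {
            int[] d = delta(key);
            row += d[0];
            col += d[1];
        }
        return new int[] { row, col };
    }

    public static void main(String[] args) {
        int[] home = { 0, 0 };             // cell A1 with zero-based indices
        int[] end = apply(home, "0k9i");   // right, down, left, up
        System.out.println(end[0] + "," + end[1]); // prints 0,0
    }
}
```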


Now be sure to practice Myrtle's shifting operations until you feel comfortable using them.