Generate a Random Subset from a Spreadsheet
This menu can be used to create new spreadsheets based on a random subset/sample of rows from a spreadsheet.

Number of Samples

Provides a space to specify the number of random samples to be used. Alternatively, you can give the sample size as a percentage of the number of rows by selecting the % option. Note that if the Sample with Replacement option is not selected, the number of samples must not exceed the number of rows in the spreadsheet (or 100%), because each row can then be selected only once.

Sample with Replacement

When selected, sampling with replacement will be used when forming the subset. That is, at each random selection of a row, all rows in the spreadsheet are eligible for selection, so the same row may be selected more than once. If this option is not selected, only the rows that have not previously been selected are eligible for selection.
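
The difference between the two modes can be illustrated outside the menu. The following is a minimal Python sketch, not the menu's own implementation; the row numbers, sample sizes, and seed are made up for illustration.

```python
import random

rows = list(range(1, 11))   # hypothetical row numbers 1..10
rng = random.Random(42)     # arbitrary seed, used only for repeatability

# Without replacement: each row can be selected at most once,
# so the sample size cannot exceed the number of rows.
without = rng.sample(rows, k=5)

# With replacement: every row is eligible at every draw, so repeats
# can occur and more draws than rows are allowed.
with_repl = rng.choices(rows, k=15)
```

With 15 draws from only 10 rows, the second sample necessarily contains at least one repeated row.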

Weighting

If a column in the drop-down list is selected, the values in that column will be used as weights in a weighted random sample. The default is the <Equal> setting, where all rows have an equal chance of being selected. Rows with a weight value ≤ 0 will not be included in the random sample.
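
As a rough illustration of weighted sampling, the Python sketch below drops rows with non-positive weights and then draws rows with probability proportional to their weight. Proportional weighting and sampling with replacement are assumptions made for this example; the data and seed are hypothetical.

```python
import random

# Hypothetical data: row numbers and a weight column.
rows = [1, 2, 3, 4, 5]
weights = [2.0, 0.0, 1.0, -1.0, 3.0]

# Rows whose weight is <= 0 are excluded before sampling.
eligible = [(r, w) for r, w in zip(rows, weights) if w > 0]
elig_rows = [r for r, _ in eligible]
elig_weights = [w for _, w in eligible]

rng = random.Random(1)  # arbitrary seed
# Weighted sampling with replacement (an assumption of this sketch):
# the chance of selecting a row is proportional to its weight.
sample = rng.choices(elig_rows, weights=elig_weights, k=4)
```

Here rows 2 and 4 can never appear in the sample, and row 5 is three times as likely to be drawn as row 3.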

Seed

The Seed option is used to specify an integer value that will be used to start the randomization. If a value of * is given for the seed, a value from the computer's clock will be used.
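
The usual reason for fixing a seed is repeatability: the same seed gives the same random sample. A minimal Python sketch of that behaviour follows; the helper name make_rng is hypothetical, and treating * as "seed from the clock" mirrors the description above rather than any documented implementation.

```python
import random
import time

def make_rng(seed):
    """Return a random generator; '*' means take the seed from the clock."""
    if seed == "*":
        seed = time.time_ns()
    return random.Random(int(seed))

rows = list(range(1, 101))  # hypothetical row numbers 1..100

# The same integer seed reproduces the same subset.
first = make_rng(123).sample(rows, k=10)
second = make_rng(123).sample(rows, k=10)

# A clock-based seed gives a subset that generally differs between runs.
clock_based = make_rng("*").sample(rows, k=10)
```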

Create Unique column names

When selected, new names will be generated for the columns in the new spreadsheet so that they are unique. Otherwise, the columns will have the same names as in the original spreadsheet.

Randomize Rows

When selected, rows in the resulting spreadsheet will be sorted into a random order.

OK

Generate a random subset into a new spreadsheet and close the dialog.

Cancel

Close the dialog without creating any new spreadsheets.

See Also

Split or Subset a Spreadsheet
Randomize Rows in a Spreadsheet
Duplicate a Spreadsheet
Spreadsheet Manipulate Menu
Spreadsheet Calculate Menu

The SUBSET procedure can be used in conjunction with the GRUNIFORM function of the CALCULATE command in the command language to sample with or without replacement.
To sample P rows out of N without replacement, the following commands could be used:

FSORT [INDEX=GRUNIFORM(N;0;1)] !(1...N); Pos
SUBSET [Pos <= P] X,Y; Sample_X,Sample_Y
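
The idea behind these commands is to give every row a random position from 1 to N and keep the rows whose position is at most P. The Python sketch below mirrors that logic under stated assumptions: the values of N, P, X, Y and the seed are hypothetical, and the code is an illustration rather than a translation of the Genstat internals.

```python
import random

N, P = 10, 4
X = [10 * i for i in range(1, N + 1)]   # hypothetical column values
Y = [i * i for i in range(1, N + 1)]

rng = random.Random(7)                   # arbitrary seed
keys = [rng.random() for _ in range(N)]  # one uniform random key per row

# Sorting 1..N by the random keys yields a random permutation of 1..N.
pos = [p for _, p in sorted(zip(keys, range(1, N + 1)))]

# Keep the rows whose permuted position is at most P.
keep = [i for i in range(N) if pos[i] <= P]
sample_x = [X[i] for i in keep]
sample_y = [Y[i] for i in keep]
```

Because exactly P of the values 1 to N are at most P, the subset always has P distinct rows, and they stay in their original order.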

To sample P rows out of N with replacement, the following commands could be used:

CALC Row = INT(N*GRUNIFORM(P;0;1)) + 1
VARIATE Sample_X,Sample_Y; (X,Y)$[Row]
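
The same indexing idea can be sketched in Python: scale a uniform random number in [0, 1) by N, truncate it, and add 1 to get a row number from 1 to N, then look up each column at those rows. The values of N, P, X, Y and the seed are hypothetical.

```python
import random

N, P = 10, 15
X = [10 * i for i in range(1, N + 1)]   # hypothetical column values
Y = [i * i for i in range(1, N + 1)]

rng = random.Random(7)                   # arbitrary seed

# int(N * u) + 1 maps a uniform u in [0, 1) to a row number in 1..N.
row = [int(N * rng.random()) + 1 for _ in range(P)]

# Row numbers are 1-based, Python lists are 0-based.
sample_x = [X[r - 1] for r in row]
sample_y = [Y[r - 1] for r in row]
```

Each draw is independent of the others, so a row can appear several times and P may be larger than N.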