Probability Distribution Plot
To assess how well empirical data approximate a particular theoretical distribution, the sorted values (order statistics, X(i)) are plotted against the expected values of the order statistics, E(i), from the given distribution. Usually, however, the parameters of the distribution are not known, and they have to be estimated first to obtain the expected values.

If the distribution has a cumulative distribution function F(x), and the inverse of this function is G(p) (i.e. G(F(x)) = x), then the expected values of the order statistics are approximately E(i) = G((i-0.5)/n), where i = 1...n, and n is the number of values in the sample. A plot of X(i) vs E(i) is known as a Quantile-Quantile (or Q-Q) plot. The data can also be plotted on the probability scale by plotting the cumulative probabilities of the data under the assumed distribution against their expected probabilities, i.e. F(X(i)) vs (i-0.5)/n. This is known as a Probability-Probability (or P-P) plot.
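The plotting positions described above can be sketched in a few lines of Python. This is an illustration only, not GenStat's implementation: it assumes a Normal distribution whose mean and standard deviation are estimated from the sample, and the function name qq_pp_points is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_pp_points(data):
    """Return Q-Q and P-P plotting coordinates for a fitted Normal distribution.

    Assumption: the distribution is Normal, with parameters estimated
    from the sample by the sample mean and standard deviation.
    """
    x = sorted(data)                                   # order statistics X(i)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))             # estimated parameters
    probs = [(i - 0.5) / n for i in range(1, n + 1)]   # expected probabilities
    expected = [fitted.inv_cdf(p) for p in probs]      # E(i) = G((i-0.5)/n)
    observed_p = [fitted.cdf(v) for v in x]            # F(X(i))
    qq = list(zip(expected, x))          # Q-Q: (E(i), X(i)) pairs
    pp = list(zip(probs, observed_p))    # P-P: ((i-0.5)/n, F(X(i))) pairs
    return qq, pp

qq, pp = qq_pp_points([2.1, 3.4, 1.9, 5.0, 4.2])
```

If the data follow the assumed distribution, both sets of points should lie close to the 1-1 line.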

A third plot, the stabilized probability (SP) plot, was introduced by Michael (1983). It rescales the probabilities using the transformation sp = (2/pi)*arcsin(sqrt(p)), so that the variances of the plotted points are approximately equal over the range of probability values. In the SP plot the scaled values sp are plotted rather than the unscaled p values.
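As a rough illustration, the SP transformation can be applied to both axes of a P-P plot as follows. This is a sketch under the assumption of a Normal distribution fitted by the sample mean and standard deviation; the names stabilize and sp_points are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def stabilize(p):
    """Variance-stabilizing transformation sp = (2/pi)*arcsin(sqrt(p))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(p))

def sp_points(data):
    """Return SP plot coordinates, assuming a fitted Normal distribution."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))   # assumed: Normal, estimated parameters
    points = []
    for i, value in enumerate(x, start=1):
        expected = stabilize((i - 0.5) / n)        # rescaled expected probability
        observed = stabilize(fitted.cdf(value))    # rescaled F(X(i))
        points.append((expected, observed))
    return points

points = sp_points([2.1, 3.4, 1.9, 5.0, 4.2])
```

The transformation maps 0 to 0, 0.5 to 0.5 and 1 to 1, so the 1-1 reference line is unchanged; only the spacing of the points along it is altered.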

The following graph shows a Normal Q-Q plot with 95% simultaneous confidence bands and a 1-1 reference line.

Available Data

This lists the variates that are available for analysis. Double-click on a name to copy it to the Data Values field; alternatively, you can type the name in directly.

Data Values

This specifies the name of the variate that will be used in the probability distribution plot.

Distribution

This provides a drop-down list of the continuous distributions that the observed data can be plotted against.

Degrees of Freedom

Some of the distributions (Chi-square, t and F) cannot have their parameters estimated by the usual distribution fitting facilities, so these fields are provided for you to specify the degrees of freedom that define these distributions.

Box Cox Transform

If this is selected, a Box-Cox transform is applied to the data before plotting. The Box-Cox transform for a variate X is defined as:
Y = (X**lambda - 1)/lambda    if lambda is not equal to 0, and
Y = LOG(X)                    if lambda = 0.
The power, lambda, is specified in the field provided.

If X does not have a normal distribution, a value of lambda can often be found such that Y is normally distributed.
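The definition above translates directly into code. The following Python sketch is illustrative only. The function name box_cox is hypothetical, and the check for strictly positive data is an assumption of this sketch, added because the logarithm and fractional powers are not defined for zero or negative values.

```python
import math

def box_cox(values, lam):
    """Apply the Box-Cox transform with power lam to a list of values."""
    # Assumption of this sketch: reject non-positive data up front.
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox transform requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]          # Y = LOG(X)
    return [(v ** lam - 1.0) / lam for v in values]   # Y = (X**lambda - 1)/lambda

y_log = box_cox([1.0, 2.0, 4.0], 0)      # natural log transform
y_sqrt = box_cox([1.0, 4.0, 9.0], 0.5)   # shifted and scaled square root
```

With lambda = 1 the transform only shifts the data by 1, so the shape of the distribution is unchanged.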

For a Normal distribution, the Estimate button uses the YTRANSFORM command to calculate the optimal value of lambda (to the nearest 0.1 between -4 and 4) for transforming the X values to a Normal distribution. The optimal value of lambda is normally placed in the field above. If the server is busy with other calculations, you will instead need to copy and paste the value of lambda from the Output window once the server has completed the calculation.
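One common way to choose lambda is a grid search that maximizes the Box-Cox profile log-likelihood under a Normal model. The Python sketch below searches the same grid (steps of 0.1 between -4 and 4), but whether YTRANSFORM uses this criterion is an assumption, not something stated here. The function names are hypothetical, and the data are assumed to be strictly positive and not all equal.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform; data are assumed to be strictly positive."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def profile_loglik(values, lam):
    """Box-Cox profile log-likelihood for lam (up to an additive constant)."""
    n = len(values)
    y = box_cox(values, lam)
    ybar = sum(y) / n
    var = sum((v - ybar) ** 2 for v in y) / n    # ML variance estimate
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(var) + jacobian

def estimate_lambda(values):
    """Grid search over -4.0, -3.9, ..., 4.0 (assumed criterion: max likelihood)."""
    grid = [k / 10 for k in range(-40, 41)]
    return max(grid, key=lambda lam: profile_loglik(values, lam))

# Data whose logarithms are symmetric about zero.
sample = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
best = estimate_lambda(sample)
```

Building the grid from integers (k / 10) avoids accumulated floating-point error and ensures that lambda = 0 is hit exactly, so the logarithm branch is used there.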

Plotting Scale

The graph can be plotted on three scales: quantile (Q-Q), probability (P-P) and stabilized probability (SP), as described above.

Confidence Bands

This drop-down list allows one of two forms of confidence band to be displayed in the graph.

Action Buttons

Run: Run the analysis.
Cancel: Close the menu without further changes.
Options: Opens a dialog where additional options and settings can be specified for the analysis.
Defaults: Reset the menu to its default settings. Right-clicking this button produces a shortcut menu where you can choose to set the menu using either the currently stored defaults or the GenStat default settings.
Store: Opens a dialog to specify names of structures to store the results from the analysis. The names of the structures should be supplied before running the analysis.
