| Author |
Message |
< ASReml ~ Handling missing values
|
When a data set has a very large number of missing values, what can you do to estimate heritability?
| Keep all records and estimate directly without data editing |
| [ 0 ] 0% |
| |
| Keep only animals with recorded data (remove all animals with missing values) |
| [ 0 ] 0% |
| |
| Edit the data: keep animals with missing values only if they share the same year, section, pen, etc. as recorded animals |
| [ 1 ] 100% |
| |
| Other options? |
| [ 0 ] 0% |
| |
Total Votes : 1 |
|
| ddo |
Posted: Thu Jul 26, 2012 10:49 am |
|
|
|
Joined: 18 May 2012
Posts: 12
Location: Copenhagen
|
Dear All
I have a small question about handling missing values. What percentage of missing values is acceptable in a bivariate analysis in ASReml?
I have tried to estimate the heritability of feed efficiency (FCR) in a bivariate analysis with ADG, using two different data sets. The full data set contains approximately 720 000 records, but only 23 000 of them have FCR recorded. In the reduced data set I removed all animals with missing FCR values, so it contains only 23 000 animals (see the attached files).
I got different heritability estimates for FCR: 0.30 in the full data and 0.26 in the reduced data. I am really confused about which estimate is better. Someone told me that I should not have too many missing values in the model. Someone else said that I could include around 10% missing values (meaning using around 30 000 records).
I have only just begun to learn about animal breeding, and I hope to learn from your experience.
Thank you |
| Description: |
| The reduced data, in which all animals have FCR records |
|
| Filename: |
FCR and ADG bivariate analysis with reduced data.asr |
| Filesize: |
5.82 KB |
| Description: |
| The full data, in which only 23000 of 715156 animals have FCR records. |
|
| Filename: |
Bivariate analysis of FCR and ADG in full data .asr |
| Filesize: |
5.75 KB |
|
|
|
| Arthur |
Posted: Sat Jul 28, 2012 2:40 am |
|
|
|
Joined: 05 Aug 2008
Posts: 277
Location: Orange, NSW
|
The analysis of just 22360 records produces results that relate to those 22360 records. The analysis of 715616 records produces results that relate to this larger population; that is, FCR is predicted for the 693000 records which do not have data, using the pedigree relationships and the trait covariances at the genetic and error levels.
Since the SD of ADG is greater for the full data than for the subset, the model implies that the variance of FCR is also greater for the full set than for the subset (the animals with FCR records).
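A rough way to see why, assuming a simple linear regression of FCR on ADG (the slope $\beta$ and residual variance $\sigma^2_{\mathrm{res}}$ are symbols introduced here for illustration; they ignore the pedigree and the split into genetic and error levels that the actual bivariate model uses):

```latex
\operatorname{Var}(\mathrm{FCR})
  = \beta^{2}\,\operatorname{Var}(\mathrm{ADG}) + \sigma^{2}_{\mathrm{res}},
\qquad
\beta = \frac{\operatorname{Cov}(\mathrm{FCR},\mathrm{ADG})}{\operatorname{Var}(\mathrm{ADG})}
```

If $\beta$ and $\sigma^2_{\mathrm{res}}$ are taken to be the same in the subset and in the whole population, a larger $\operatorname{Var}(\mathrm{ADG})$ in the full data gives a larger implied $\operatorname{Var}(\mathrm{FCR})$.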
Basically, it is assumed that FCR is measured on a selected subset, but you want to predict parameter values for the whole population.
It speaks well of REML that the results are so consistent given the low proportion of records with FCR data. |
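The selection effect described in this reply can be illustrated with a small simulation. This is only a sketch under loud assumptions: it is not what ASReml does (no pedigree, no REML, phenotypic variances only), the correlation of -0.6 and the rule "FCR is recorded only when standardised ADG exceeds 0.5" are hypothetical, and all function names are made up for the example. It shows that the FCR variance in a subset selected on ADG is smaller than in the whole population, and that the regression of FCR on ADG fitted in the subset can be used to recover an implied variance for the whole population.

```python
import random


def variance(xs):
    """Sample variance with an n-1 denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Sample covariance with an n-1 denominator."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=50000, rho=-0.6, seed=1):
    """Draw standardised ADG and FCR with correlation rho (hypothetical value)."""
    rng = random.Random(seed)
    adg, fcr = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        # FCR = regression on ADG plus an independent residual
        f = rho * a + rng.gauss(0.0, (1.0 - rho ** 2) ** 0.5)
        adg.append(a)
        fcr.append(f)
    return adg, fcr


adg, fcr = simulate()

# Hypothetical selection rule: FCR is "recorded" only when ADG > 0.5.
subset = [(a, f) for a, f in zip(adg, fcr) if a > 0.5]
sub_adg = [a for a, _ in subset]
sub_fcr = [f for _, f in subset]

# Regression of FCR on ADG estimated from the selected subset only.
slope = covariance(sub_adg, sub_fcr) / variance(sub_adg)
resid_var = variance(sub_fcr) - slope ** 2 * variance(sub_adg)

# FCR variance implied for the whole population, using ADG from all animals.
implied_full_var = slope ** 2 * variance(adg) + resid_var

print("Var(ADG): full =", round(variance(adg), 3), " subset =", round(variance(sub_adg), 3))
print("Var(FCR): subset =", round(variance(sub_fcr), 3))
print("Var(FCR): implied for full data =", round(implied_full_var, 3))
print("Var(FCR): actual in full data =", round(variance(fcr), 3))
```

In the simulation the actual FCR values of the unselected animals are known, so the implied variance can be checked against them; in real data that check is not available, which is why the bivariate model leans on ADG and the pedigree.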
_________________ Arthur Gilmour
Retired Principal Research Scientist (Biometrics) |
|
|
|
|