Confidentialisation

The HILDA datasets released have been confidentialised to reduce the risk that individual sample members can be identified. This has involved:
Top-coding substitutes an average value for all cases that equal or exceed a given threshold. The substituted value is calculated as the weighted average of the cases subject to top-coding. As a result, the cross-sectionally weighted mean of the top-coded variable is the same as that of the original variable. (In earlier releases, the cut-off value itself was substituted, which failed to preserve the weighted means.)

Take, for example, the top-coding of _wscg (current gross wages per week in main job). All cases whose wages equal or exceed $4800 have had their value replaced by the weighted average of all those cases whose wages equal or exceed $4800. Let us say that the weighted average of the 22 cases earning $4800 or more is $8450. $8450 is then substituted as the wages for those 22 cases. This maximises confidentiality and preserves the weighted means of the distribution. If the distribution of wages had simply been cut off at $4800, the weighted mean would be too low.

The top-coding thresholds are adjusted over time to offset the tendency of income and wealth measures to inflate. Without adjustment, increasing numbers of cases would exceed the threshold and be top-coded. If you need to know the threshold values that have been used at a particular release, please contact hilda-inquiries@unimelb.edu.au.
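The top-coding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to produce the HILDA files: the function names `top_code` and `weighted_mean`, the five wage values and their weights are all made up for the example, and only the $4800 threshold is taken from the text.

```python
def weighted_mean(values, weights):
    """Weighted mean of values, using the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def top_code(values, weights, threshold):
    """Replace every value at or above threshold with the weighted
    average of those values (hypothetical helper, for illustration)."""
    flagged = [i for i, v in enumerate(values) if v >= threshold]
    if not flagged:
        return list(values)
    substitute = weighted_mean(
        [values[i] for i in flagged], [weights[i] for i in flagged]
    )
    coded = list(values)
    for i in flagged:
        coded[i] = substitute
    return coded


# Invented weekly wages and cross-sectional weights (not HILDA data).
wages = [900, 1500, 4800, 6000, 12000]
weights = [100, 200, 50, 30, 20]
threshold = 4800  # threshold quoted in the example above

top_coded = top_code(wages, weights, threshold)
capped = [min(v, threshold) for v in wages]  # the earlier cut-off approach

mean_original = weighted_mean(wages, weights)
mean_top_coded = weighted_mean(top_coded, weights)
mean_capped = weighted_mean(capped, weights)
```

With these invented figures the three top cases all receive the same substituted value, and `mean_top_coded` matches `mean_original`, whereas `mean_capped` is lower. That is the difference between weighted-average substitution and a simple cut-off.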
Date Created: 30 January 2005
The University of Melbourne ABN: 84 002 705 224