Documentation

Documentation Choices

Before you get lost in the array of documentation, it is worth pausing to consider how you work and what documentation is available to you. You will not need to look at all pieces of documentation that have been prepared in order to use the datasets efficiently.

There are four main pathways through the documentation:

  • Marked-up questionnaires and the derived variable coding framework for each wave – you would use these if you were familiar with the questionnaires and wanted to know what extra variables have been included;
  • File-based coding framework for each wave – you would use this if you were roughly aware of what variables were in what files and were interested in a range of different topics;
  • Subject-level coding framework for each wave – you would use this if you were interested in a couple of different topics;
  • Cross-wave variable listing – you would use this if you were frequently using variables across the various waves, and were happy to find out the codes used as you use the variables.

The coding frameworks have been provided on the DVD (as .pdf documents) as well as via an on–line dictionary.

You should also consider which files you want to print out and which you are happy to look at electronically. We have found that the marked-up questionnaires are best printed. The rest are best looked at on screen where there are search functions available.

While frequencies of the variables have been provided, it is expected that you might only refer to these files for some simple queries with the variable name in mind (for example, how many employed people do we have in the sample, or what are the codes used for question R3).

Also, as you may have already seen, the previous chapter of this manual provides an overview of the topics covered in the questionnaires and the derived variables created.

These tools are described in more detail below.

Marked-Up Questionnaires

Beside each question in the questionnaires, the associated variable name has been added. Derived variables are not included, only the variables that relate directly to the question asked. See Figure 23 for an example.

Figure 23: Example of the marked-up questionnaires

Variable Listings

Derived Variable Listing

The derived variable listing contains all the extra variables created from those collected in the questionnaires. This listing shows the following:

  • on which file the variable can be found;
  • the variable name;
  • the label describing the variable;
  • what values the variable can take; and
  • to which population the variable relates.

Figure 24 shows the derived variable associated with the variables listed on the questionnaire in Figure 23.

Figure 24: Example of the derived variable listing

File–Based Listing

For each file provided (except for the Combined File), there is a file–based variable listing. This listing contains:

  • the questionnaire and question number;
  • the variable name and label describing the variable;
  • the values that each variable can take;
  • the population to which the variable relates; and
  • for derived variables, a brief explanation of how the variable was derived.

In this listing, the derived variables are interspersed with the variables directly from the questionnaires. See Figure 25 below:

Figure 25: Example of the file–based listing

Subject Listing

The subject listing is similar to the file-based listing, but includes the variables of all files together in one listing. There is an index at the beginning and the broad subject name is at the top of each page to help you navigate through the 170 to 300 page document. See Figure 26 below:

Figure 26: Example of the subject listing

Cross-Wave Variable Listing

The cross-wave variable listing is probably the most useful tool of all the documentation options. For each variable, it shows the file in which the variable can be found, its label, and the waves in which it was asked. For the particular example provided in Figure 27, we can see that these questions have changed from section H in wave 1 to section G in later waves, and that the question numbering has changed slightly between waves 2 and 3.

Figure 27: Example of the cross–wave variable listing
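
As a rough illustration of how a cross-wave listing can be used, the sketch below stores, for one variable, the file and question reference recorded in each wave, and then reports where the question reference changes from one wave to the next. The variable name, file name and question references are hypothetical and chosen only to mirror the kind of change described above; they are not taken from the actual listing.

```python
# Hypothetical cross-wave listing: variable -> {wave: (file, question reference)}.
# All names and references below are invented for illustration.
cross_wave = {
    "example_var": {
        1: ("person file", "H5"),
        2: ("person file", "G5"),
        3: ("person file", "G6"),
    },
}


def waves_asked(listing, name):
    """Return the sorted list of waves in which the variable appears."""
    return sorted(listing.get(name, {}))


def question_changes(listing, name):
    """Return (previous wave, wave, old reference, new reference) for each
    pair of consecutive waves where the question reference differs."""
    entries = listing.get(name, {})
    waves = sorted(entries)
    changes = []
    for prev, cur in zip(waves, waves[1:]):
        if entries[prev][1] != entries[cur][1]:
            changes.append((prev, cur, entries[prev][1], entries[cur][1]))
    return changes


print(waves_asked(cross_wave, "example_var"))
for change in question_changes(cross_wave, "example_var"):
    print(change)
```

Checking for such changes before pooling waves helps confirm that the same question is being compared across years.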

Frequencies

The frequencies are a simple listing of the categories for each question and the number of cases falling into each category. Figure 28 provides an example of the listing.

Figure 28: Example of the frequencies
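
If you prefer to produce a similar listing yourself, a frequency table is simply a count of cases per category code. The following is a minimal sketch in Python, assuming the responses to one question have already been read into a list; the codes and the variable name are made up for illustration and do not correspond to any real HILDA variable.

```python
from collections import Counter


def frequencies(responses):
    """Return (category code, number of cases) pairs, sorted by code."""
    counts = Counter(responses)
    return sorted(counts.items())


# Hypothetical coded responses for a single question (illustrative only).
example_responses = [1, 1, 2, 3, 1, 2, 1, 3, 3, 1]

# Print one line per category: the code, then the number of cases.
for code, n_cases in frequencies(example_responses):
    print(f"{code:>4} {n_cases:>6}")
```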

On–line Data Dictionary

For Release 6, we have added the On–line Data Dictionary to our HILDA documentation. As this is the first time the On–line Data Dictionary has been made available, we would welcome your feedback and suggestions.

The On–line Data Dictionary can be accessed via the HILDA website:

This on–line system is designed to provide easy access to HILDA metadata. The database essentially provides the user with the information available in the HILDA coding frameworks (.pdf).

The On–line Data Dictionary allows users to search HILDA metadata in four different ways:

  • by keyword;
  • by subject area;
  • by variable name; and
  • by derived variable name.

A help page (accessed by clicking on the help icon at the bottom right of the page) provides instructions on how to use the system along with example screen shots.

This system is still a work in progress, so you can expect additions to it during 2008. Note that the questionnaire text is currently available only for wave 6 (the text for other waves will be added in due course).

 
