This clinical trial used a number of forms to collect information for
each participant. Some forms, such as blood collection forms and return
visit forms, were used as often as needed. These forms are referred to
as 'multiple forms'. Other forms, such as the 6-month interview response
form, were used only once during the participant's relationship with the
trial. These forms are referred to as 'unique forms'.

Data from multiple forms were aggregated into variables that have an 'x'
as a prefix. Not all data from these forms were automatically included
in the VCT database. Instead, pertinent information that could be
collected from these forms was identified and subsequently calculated.
Note that results of the STD tests done at UCSF are treated as data from
multiple forms, whereas results of STD tests done at the sites are
treated as data from a unique form. See other documentation ("created
variables.txt") for a list of these 'x' variables and the information
they contain.

For the most part, all data from unique forms were included in the VCT
database. Most questions without a question number are included.
Variable names for these questions are indicated in other documentation
("created variables.txt"). All questions with a question number are
included in the VCT database. The variable name associated with these
questions starts with a prefix, followed by a 'q' and then the question
number. The prefix is specific to the unique form under consideration:

    Prefix  Form
    ------  --------------------------
    A       enrollment eligibility form
    B       baseline interview response form
    C       initial baseline visit form
    D       6-month interview response form
    E       6-month STD site test
    F       6-month counselor's questionnaire
    G       12-month interview response form
    H       12-month counselor's questionnaire

What follows the question number in the variable name depends on the
question itself.

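As a rough illustration, the prefix table above can be written as a
lookup in Python. This is only a sketch: the names FORM_BY_PREFIX and
form_for_variable are hypothetical and are not part of the VCT database,
the variable name "dq5" is made up, and the sketch assumes that variable
names use the lower-case form of the prefix letter.

```python
# Form prefixes from the table above, lower-cased on the assumption that
# variable names use the lower-case prefix letter.
FORM_BY_PREFIX = {
    "a": "enrollment eligibility form",
    "b": "baseline interview response form",
    "c": "initial baseline visit form",
    "d": "6-month interview response form",
    "e": "6-month STD site test",
    "f": "6-month counselor's questionnaire",
    "g": "12-month interview response form",
    "h": "12-month counselor's questionnaire",
}


def form_for_variable(name):
    """Return the unique form a question variable belongs to, else None."""
    name = name.lower()
    # A question variable is a form prefix followed immediately by 'q'.
    if len(name) >= 2 and name[0] in FORM_BY_PREFIX and name[1] == "q":
        return FORM_BY_PREFIX[name[0]]
    return None


# "dq5" is a made-up name used only for illustration.
print(form_for_variable("dq5"))
```

A name that does not have 'q' as its second character returns None here,
since such a variable is not tied to a numbered question on a form.
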
Usually, nothing more is needed for the variable name, since questions
typically have mutually exclusive responses ("choose only one"). (Note
that all of the following questions are merely examples and do not
actually exist in any form.) For example, say hypothetically that the
6-month counselor's questionnaire contains this question:

    Question 24b: In the last 48 hours, have you been at work?
        1  yes
        2  no

For this question, variable fq24b is created.

However, some questions allow more than one choice ("check all that
apply"). For these questions, an underscore ("_") follows the question
number in the variable name, which is in turn followed by the value
assigned to each choice. For such questions, one variable is created for
each choice, and each will have a value of missing, zero, or one. For
example, say hypothetically that the baseline interview response form
contains this question:

    Question 3: Where have you been in the last 48 hours?
                (check all that apply)
        1  at home
        2  at work
        3  at school

For this question there are three variables: bq3_1, bq3_2, and bq3_3.

If a question has a text field ("please specify"), an additional
variable is created with a 'tx' after the question number (no underscore
is used). Note that the text field itself might be an option, in which
case this option has its own variable. This is done because participants
may have chosen the option that directs them to fill in the text field
but, for whatever reason, left the field blank. Making the previous
example more complex:

    Question 3: Where have you been in the last 48 hours?
                (check all that apply)
        1  at home
        2  at work
        3  at school
        4  other (please specify:_______)

For this question there are five variables: bq3_1, bq3_2, bq3_3, bq3_4,
and bq3tx.

Sometimes questions that have a text field also have options that have
their own sub-question number, since each option itself has a range of
possible responses.
In such situations the text field is associated with the question number
and not with the sub-question number of the option that contains the
text field. For example, modifying the previous example:

    Question 3: Where have you been in the last 48 hours?
                (check all that apply)

                                             Yes   No   Don't Know
                                             ---   --   ----------
        a  at home                            1    2        88
        b  at work                            1    2        88
        c  at school                          1    2        88
        d  other (please specify:_______)     1    2        88

For this question there are five variables: bq3a, bq3b, bq3c, bq3d, and
bq3tx. There is no variable named bq3dtx.

Sometimes questions have mutually exclusive options, each with a text
field. In such situations all the text fields are represented by a
single variable. Since the options are mutually exclusive, at most one
text field will be used at any one time, so there is no need to make a
text field variable for each option. For example, say hypothetically
that the 12-month counselor's questionnaire contains this question:

    Question 17: Where were you 48 hours ago?
        1  at home (explain why not at work: _______________)
        2  at work (explain why not at home: _______________)

For this question two variables are created: hq17 and hq17tx.

Questions occasionally ask for a length of time in days, weeks, months,
and years. For these questions, the four time measurements were each
converted to months and added together to create a single variable. For
example, say hypothetically that the enrollment eligibility form
contains this question:

    Question 9: If unemployed, how long has it been since you last
                worked?
        ___ days   ___ weeks   ___ months   ___ years

For this example one variable is created: aq9. This variable contains
the number of months since the participant last worked.

Some questions (typically ones that require a continuous value or a
measurement of time) might include "decline" and/or "don't know"
options.
Rather than storing the "decline" or "don't know" code in the response
variable (where the value of the code might be misleading), separate
variables were usually created for each of these options. Each of these
separate variables will have a value of missing, zero, or one. If
separate variables are not available, then the value of the code is
inserted in the response variable. For example, say hypothetically that
the enrollment eligibility form contains this question:

    Question 12b: How many people do you have working for you at work?
        _______ people
        don't know   88
        decline      99

For this example three variables are usually created: aq12b, aq12b_88,
and aq12b_99. The variable aq12b contains the number of people. The
variables aq12b_88 and aq12b_99 are set to one if "don't know" or
"decline" is chosen, respectively.

Note that for all questions where an underscore is warranted, the
underscore was sometimes skipped, due to space restrictions, in order to
keep the variable name to no more than eight characters in length. So a
hypothetical variable dq27b4_99 is shortened to dq27b499.

Also note that not all variables in the VCT database are associated
directly with a specific question in a form. Some variables are derived
from a combination of questions across the multiple and/or unique forms.
Such variables will have either a prefix of 'x' or no prefix at all
(even though the first character in the name might be an "a" through
"h", such as "age" and "gender"). The fact that "q" is not the second
character in the variable name should serve as a reminder that the
variable is not associated with a specific question.
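
The naming rules described in this document can be summarized in a short
Python sketch. It is illustrative only: the function question_variable
and its parameters are hypothetical, and the rule used here for dropping
the underscore (drop it only when the name would exceed eight
characters) is an assumption, since the document says only that the
underscore was sometimes skipped.

```python
MAX_NAME_LENGTH = 8  # limit stated in the document


def question_variable(prefix, question, suffix=None, text_field=False):
    """Build a question variable name such as 'bq3_1' or 'bq3tx'.

    prefix     -- form letter, e.g. 'b' for the baseline interview form
    question   -- question number as text, e.g. '3' or '12b'
    suffix     -- choice value or special code (e.g. 1, 88, 99), if any
    text_field -- True for the 'please specify' text variable
    """
    base = f"{prefix.lower()}q{question}"
    if text_field:
        # Text fields take 'tx' directly after the question number.
        return base + "tx"
    if suffix is None:
        # Mutually exclusive responses need nothing more.
        return base
    name = f"{base}_{suffix}"
    if len(name) > MAX_NAME_LENGTH:
        # Assumed rule: skip the underscore when the name is too long.
        name = f"{base}{suffix}"
    return name


# The hypothetical examples used in this document:
print(question_variable("f", "24b"))                # fq24b
print(question_variable("b", "3", 1))               # bq3_1
print(question_variable("b", "3", text_field=True)) # bq3tx
print(question_variable("a", "12b", 88))            # aq12b_88
print(question_variable("d", "27b4", 99))           # dq27b499
```

Because the underscore was not dropped by a fixed rule, a name produced
this way should be checked against the actual variable list in "created
variables.txt" and the database itself.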