A.4.1 Understand the document format
You’ll be given a table. Each row has 6 fields already filled (see the section “List of given fields” below), which contain information from a tweet. After those come the fields that you have to fill in, using the information contained in the first 6 fields of that same row.
Appendix A. Expert annotator guidelines for annotating first-hand experience tweets.
Figure A.1: Flowchart describing the annotation sequence used in first-hand experience tweets.
A.4.2 Understand each field
To help you fill in the document, we’ve created a table that explains each field. Table 3 (at the end of this document) lists the fields that you will be asked to annotate. It has 3 columns: the first gives the name of the field as it’s listed in the document you have to fill in, the second briefly explains what the field means, and the third lists the values that you can enter. Apart from the explanations within the table, a few clarifications are needed:
• Don’t guess or make assumptions. Obtain the information only from the data available in the first 6 fields.
• In some fields there’s the “empty” value listed among the available values that can be entered. If you are not sure about what to enter on that field, please leave it blank.
• Please note that in most of the cases there’s a closed list of values, but for some fields there isn’t a closed list of values. Also, more than one value can be entered within some fields.
• The fields “Symptoms causing the use” and “Symptoms after the use” have to use strings from a controlled vocabulary. To see how to fill in these fields, please read the section “Special fields”.
A.4.3 Fill the fields
On each row, use all the given information (see the section “List of given fields” below) to fill in the fields. As stated above, the first fields contain the information that you will use to fill in the rest of the fields (see the section “List of fields to be filled” below) on the same row.
• Keep in mind that each row is independent: use only the information in the first fields of the current row, and never information obtained from other rows.
• The fields “Symptoms causing the use”, “Symptoms after the use” and “Country” have to be filled in a particular way, explained in the following section: “Special fields”.
Excel sheet label    Synonyms
Adderall             Amphetamine mixed salts, amphetamine and dextroamphetamine, amphetamine salt
Ritalin              Concerta, Methylphenidate, Methylin, Metadate, Equasym XL, Daytrana, Phenida, Attenta, Hynidate, Focalin, Attenade, Quillivant, methyl phenyl(piperidin-2-yl)acetate
Modafinil            Modafinilo, Modafinilum, Moderateafinil, Modiodal, Provigil, Sparlon, Alertec, Modavigil, Modalert, (±)-2-(benzhydrylsulfinyl)acetamide
Adrafinil            CRL-40028, Olmifon, CRL 40028, (RS)-2-benzhydrylsulfinylethanehydroxamic acid
Armodafinil          Nuvigil
Citalopram           Celexa
Escitalopram         Lexapro, Cipralex
Paroxetine           Paxil, Seroxat
Fluoxetine           Prozac
Fluvoxamine          Luvox
Sertraline           Zoloft, Lustral
Table A.1: List of drug names along with the synonyms.
A.4.4 Special fields
About the drug
The excel sheets to be annotated are “Adderall”, “Ritalin”, “Modafinil”, “Adrafinil”, “Armodafinil”, “Citalopram”, “Escitalopram”, “Paroxetine”, “Fluoxetine”, “Fluvoxamine” and “Sertraline”.
On each excel sheet we only care about one drug and all of its synonyms.
This means that the field “About the drug?” is set to “1” only if the drug or one of its synonyms is mentioned in the tweet text; otherwise it is left empty. Table A.1 lists each drug along with its synonyms.
As an example, if we are annotating tweets on the excel sheet named “Adderall”, a tweet is considered to be “About the drug” (and its “About the drug?” field is annotated as “1”) if its text mentions any of the following drugs:
• “Adderall” or
• “Amphetamine mixed salts” or
• “amphetamine and dextroamphetamine” or
• “amphetamine salt”
As you can see, these are the drugs on “Excel sheet label” and on “Synonyms” on the table above. If none of these drugs appears on the tweet text, the field “About the drug?” is left empty and the annotator doesn’t have to continue annotating that tweet.
It’s important to keep in mind which drugs are considered on each excel sheet. For example, suppose the annotator is annotating a tweet on the “Adderall” sheet and the tweet only mentions another drug of this study, such as “Concerta” (which is a synonym of “Ritalin”), and none of the drugs listed above (“Adderall”, “Amphetamine mixed salts”, “amphetamine and dextroamphetamine”, “amphetamine salt”). This tweet wouldn’t be considered to be “About the drug”, so the “About the drug?” field would be left blank. On the other hand, if the annotator were annotating the same tweet in the excel sheet named “Ritalin”, the field “About the drug?” would be annotated as “1”, because in this case “Concerta” is a synonym of “Ritalin”.
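The rule above can be sketched in a few lines of Python. This is only an illustration, not part of the annotation procedure: the names `SYNONYMS` and `about_the_drug` are hypothetical, only two sheets from Table A.1 are included, and the case-insensitive whole-phrase matching is an assumption (the guidelines leave the matching to the annotator's judgment).

```python
import re

# Synonyms copied from Table A.1 for two of the sheets; the remaining
# sheets would be added in the same way.
SYNONYMS = {
    "Adderall": [
        "Adderall",
        "Amphetamine mixed salts",
        "amphetamine and dextroamphetamine",
        "amphetamine salt",
    ],
    "Ritalin": [
        "Ritalin", "Concerta", "Methylphenidate", "Methylin", "Metadate",
        "Equasym XL", "Daytrana", "Phenida", "Attenta", "Hynidate",
        "Focalin", "Attenade", "Quillivant",
        "methyl phenyl(piperidin-2-yl)acetate",
    ],
}


def about_the_drug(sheet: str, tweet_text: str) -> str:
    """Return "1" if the tweet mentions the sheet's drug or a synonym, else ""."""
    text = tweet_text.lower()
    for name in SYNONYMS[sheet]:
        # Assumption: match the whole phrase, ignoring case, and do not
        # match inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            return "1"
    # Empty value: the annotator does not continue with this tweet.
    return ""
```

With this sketch, a tweet that only mentions “Concerta” yields an empty value on the “Adderall” sheet and “1” on the “Ritalin” sheet, which mirrors the example in the text.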
CUI Identifier
To fill in “Symptoms causing the use” and “Symptoms after the use” you have to find all the terms related to symptoms before and after the drug usage. Changes in behaviour, or in how the user feels before and after taking the drug, will reveal these symptoms. Once the symptoms have been found, they have to be entered in a normalized way. To do so, use the Consumer Health Vocabulary’s website: http://consumerhealthvocab.chpc.utah.edu/CHVwiki/index.jsp?orgDitchnetTabPaneId=searchPane
There, in the “CHV Entry Search” search box, mark the option “Term” (“Search by” option) in case it’s not selected. Then enter the term in the search box (see footnote 1). Finally, click on the “Search” button.
Figure A.2 shows an example where we entered the term “fatigue”.
Once you get the results you’ll have to use the “CUI” value for that symptom.
Figure A.3 shows the “CUI” value for “fatigue”, which is “C0015672”.
In some cases the word may not appear if it’s not searched in this way. Searching for “fatigued” did return results (and also showed that the Consumer Health Vocabulary preferred name is the noun form of the word, “fatigue”). On the other hand, Figure A.4 shows that searching for the term “motivation” doesn’t return any results.
Figure A.2: CHV Entry Search Box.
Figure A.3: CHV results for the term fatigue.
Figure A.4: No results messages in CHV.
1 The Consumer Health Vocabulary “Preferred Name” is the noun form of the word, so instead of using adjectives (e.g. “motivated” or “fatigued”) you’ll have to enter the noun corresponding to the adjective (“motivation” or “fatigue”).
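As a rough illustration of the normalization step, the following Python sketch converts adjectives to their noun form and then maps each term to a CUI. It does not query the CHV website: `CUI_CACHE` is a hypothetical local record of lookups already done by hand, and only the “fatigue” entry comes from these guidelines. Skipping terms without a known CUI is also an assumption.

```python
# Adjective -> noun pairs taken from the footnote; extend as needed.
NOUN_FORMS = {"fatigued": "fatigue", "motivated": "motivation"}

# Hypothetical record of manual CHV lookups; only the "fatigue" entry
# is taken from the guidelines (Figure A.3).
CUI_CACHE = {"fatigue": "C0015672"}


def normalize_symptoms(terms):
    """Map symptom terms to CUIs, skipping terms with no known CUI."""
    cuis = []
    for term in terms:
        key = term.strip().lower()
        # Use the noun form, as the CHV "Preferred Name" is a noun.
        key = NOUN_FORMS.get(key, key)
        cui = CUI_CACHE.get(key)
        # Assumption: unknown terms are skipped and duplicates are dropped.
        if cui is not None and cui not in cuis:
            cuis.append(cui)
    return cuis
```

For instance, both “Fatigued” and “fatigue” resolve to the single CUI “C0015672”, while a term that has no entry in the local record contributes nothing.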
Country Code
To get the country code, the annotator must use only the information from the “location” field (3rd column on the excel sheet).
The country codes that can be entered are “ISO 3166-1 alpha-2” codes, which are two-letter codes that represent countries, dependent territories, and special areas of geographical interest.
Figure A.5: Country code selection.
Tag Explanation
Tweet date Date when the tweet was published.
Username Name of the user.
Location Location where the user is based.
Tweet text Text for the tweet.
Hashtags (#) List of hashtags (if any) within the tweet.
Mentions (@) List of mentions (if any) within the tweet.
Table A.2: List of fields that are provided with data to the annotators.
You can get a list of the available “ISO 3166-1 alpha-2” codes at this URL: http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#Officially_assigned_code_elements. The “ISO 3166-1 alpha-2” code that has to be used is the one in the column “Code”. To get the code, the annotator has to look for the country in the column named “Country name” and then take the corresponding two-letter “Code” for that country.
For a tweet from the United States of America, the country code that has to be entered in the “Country” cell is the two-letter code “US”, as shown in Figure A.5. If the country is unknown, please leave this field empty.
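The lookup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `COUNTRY_CODES` is a hypothetical excerpt of the “Country name” to “Code” table (only the “US” entry is taken from these guidelines), and the function expects a country name that the annotator has already identified from the “location” field.

```python
import re

# Hypothetical excerpt of the "Country name" -> "Code" table; only the
# "US" entry is taken from the guidelines.
COUNTRY_CODES = {"united states of america": "US"}


def country_code(country_name):
    """Return the ISO 3166-1 alpha-2 code, or "" when the country is unknown."""
    if not country_name:
        return ""
    code = COUNTRY_CODES.get(country_name.strip().lower(), "")
    # Sanity check: an alpha-2 code is exactly two uppercase letters.
    if code and not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError(f"not an alpha-2 code: {code!r}")
    return code
```

Returning an empty string for an unknown country matches the instruction to leave the field empty in that case.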