From: Precision information extraction for rare disease epidemiology at scale
Entity Class | Label | Definition | Example |
---|---|---|---|
Disease terms | DIS | Rare and non-rare disease names and synonyms, including those that have a unique ID or code (ICD, GARD, UMLS). Includes pathogenic diseases, but not pathogens. Does not include symptoms, features of diseases, phenotypes, or abbreviations of disease names | “Wegener's granulomatosis”, “Metachromatic leukodystrophy”, “Krabbe disease” |
Disease abbreviations | ABRV | Abbreviations of the disease names or synonyms described above | “MPS” (Mucopolysaccharidoses), “FSHD” (Facioscapulohumeral muscular dystrophy) |
Epidemiology Type | EPI | The epidemiologic metric being reported | “Annualized incidence”, “point prevalence”, “estimated occurrence rate” |
Epidemiology Rate | STAT | The number of people afflicted. Usually expressed as a fraction (rate), a percentage of the (sub)population, or an integer estimation/count of persons with the disease | “Approximately 1 in 40,000 live births”, “50,000 people affected” |
Location | LOC | Locations, including geopolitical entities, which indicate where the study took place | “North-Central Africa”, “Salla region of northern Finland”, “the United States” |
Dates | DATE | When the study took place or when data was gathered | “Between 1985 and 2006”, “January 21, 1999” |
Biological Sex | SEX | Terms that are likely to indicate the biological sex of the persons mentioned in the study | “Men”, “women”, “intersex” |
Ethnicity/Nationality/Race | ETHN | Terms that are likely to indicate nationality, race, or ethnicity of the persons afflicted by the disease | “Italian”, “Ashkenazi Jew”, “Marshallese” |
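
As an illustration of how the eight labels in the table could be applied at the token level, the following is a minimal sketch in Python. It assumes a BIO (begin/inside/outside) tagging scheme, which the table itself does not specify. The helper names `bio_tagset` and `tag_tokens` are hypothetical, and the example sentence is invented from the table's own example phrases.

```python
# Label codes and class names copied from the table above.
ENTITY_LABELS = {
    "DIS": "Disease terms",
    "ABRV": "Disease abbreviations",
    "EPI": "Epidemiology Type",
    "STAT": "Epidemiology Rate",
    "LOC": "Location",
    "DATE": "Dates",
    "SEX": "Biological Sex",
    "ETHN": "Ethnicity/Nationality/Race",
}


def bio_tagset(labels):
    """Build a BIO tag list: "O" plus a B- and an I- tag per label.

    Assumption: BIO is used here only as a common convention for
    token-level entity annotation; the table does not state the scheme.
    """
    tags = ["O"]
    for label in labels:
        tags.append(f"B-{label}")
        tags.append(f"I-{label}")
    return tags


def tag_tokens(tokens, spans):
    """Assign BIO tags to tokens.

    spans: list of (start, end_exclusive, label) token-index triples.
    Tokens outside every span receive "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if label not in ENTITY_LABELS:
            raise ValueError(f"unknown label: {label}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical sentence assembled from the table's example phrases.
tokens = ["Krabbe", "disease", "affects", "approximately",
          "1", "in", "40,000", "live", "births"]
spans = [(0, 2, "DIS"), (3, 9, "STAT")]

for token, tag in zip(tokens, tag_tokens(tokens, spans)):
    print(f"{token}\t{tag}")
```

Under this assumed scheme, a disease name such as “Krabbe disease” would carry `B-DIS` and `I-DIS`, while its abbreviation would be tagged separately as `ABRV`, consistent with the table's rule that `DIS` excludes abbreviations.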