Title: | Epidemiology data dictionaries and random data generators |
---|---|
Description: | The 'R4EPIs' project <https://R4epis.netlify.com> seeks to provide a set of standardized tools for analysis of outbreak and survey data in humanitarian aid settings. This package currently provides standardized data dictionaries from MSF OCA for four outbreak scenarios (Acute Jaundice Syndrome, Cholera, Measles, Meningitis) and three surveys (Retrospective mortality and access to care, Malnutrition, and Vaccination coverage). In addition, a data generator from these dictionaries is provided. |
Authors: | Alexander Spina [aut, cre], Zhian N. Kamvar [aut], Lukas Richter [aut], Patrick Keating [aut], Annick Lenglet [ctb] |
Maintainer: | Alexander Spina <[email protected]> |
License: | GPL-3 |
Version: | 0.0.0.9001 |
Built: | 2025-01-15 05:41:23 UTC |
Source: | https://github.com/r4epi/epidict |
Based on a dictionary generator like msf_dict() or msf_dict_survey(), this function will generate a randomized data set based on values defined in the dictionaries. The randomized data set produced should mimic an Excel export from DHIS2 for outbreaks and a Kobo export for surveys.
gen_data( dictionary, varnames = "data_element_shortname", numcases = 300, org = "MSF" )
dictionary |
Specify which dictionary you would like to use. |
varnames |
Specify name of column that contains variable names.
If |
numcases |
Specify the number of cases you want (default is 300) |
org |
The organization the dictionary belongs to. Currently, only MSF exists. In the future, dictionaries from WHO and other organizations may become available. |
A data frame with cases in rows and variables in columns. The number of columns will vary from dictionary to dictionary, so please use the dictionary functions to generate a corresponding dictionary.
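As a rough illustration of what dictionary-driven generation means, the base R sketch below samples option codes for each variable in a toy long-format dictionary. It is not the implementation of gen_data(): the helper `gen_toy_data()`, the toy variables, and their codes are invented for this example. Only the column names `data_element_shortname` and `option_code` are taken from the dictionaries described here.

```r
# Toy long-format dictionary: one row per (variable, option) pair.
# The variables and codes are invented for illustration.
toy_dict <- data.frame(
  data_element_shortname = c("sex", "sex", "exit_status", "exit_status", "exit_status"),
  option_code            = c("M", "F", "DD", "DC", "TR"),
  stringsAsFactors       = FALSE
)

# Hypothetical helper: draw `numcases` random codes for every variable.
gen_toy_data <- function(dict, numcases = 300) {
  vars <- unique(dict$data_element_shortname)
  cols <- lapply(vars, function(v) {
    codes <- dict$option_code[dict$data_element_shortname == v]
    sample(codes, size = numcases, replace = TRUE)
  })
  names(cols) <- vars
  as.data.frame(cols, stringsAsFactors = FALSE)
}

set.seed(42)  # make the random draw reproducible
toy_dat <- gen_toy_data(toy_dict, numcases = 20)
str(toy_dat)
```

Each column of the result contains only codes that the dictionary allows for that variable, which is the property that makes generated data useful for testing analysis templates.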
if (require("dplyr") && require("matchmaker")) {
  withAutoprint({
    # You will often want to use MSF dictionaries to translate codes to
    # human-readable variables. Here, we generate a data set of 20 cases:
    dat <- gen_data(
      dictionary = "Cholera",
      varnames = "data_element_shortname",
      numcases = 20,
      org = "MSF"
    )
    print(dat)

    # We want the expanded dictionary, so we will select `compact = FALSE`
    dict <- msf_dict(disease = "Cholera", long = TRUE, compact = FALSE, tibble = TRUE)
    print(dict)

    # Now we can use matchmaker to filter the data:
    dat_clean <- matchmaker::match_df(dat, dict,
      from = "option_code",
      to = "option_name",
      by = "data_element_shortname",
      order = "option_order_in_set"
    )
    print(dat_clean)
  })
}
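The translation step in the example above can be pictured with a small base R sketch. This is not how matchmaker::match_df() is implemented. The helper `translate_codes()` and all toy values are hypothetical, and returning factors ordered by `option_order_in_set` is a choice made for this sketch. The dictionary column names are the ones used in the example.

```r
# Toy long-format dictionary (values invented for illustration).
dict <- data.frame(
  data_element_shortname = c("sex", "sex", "outcome", "outcome"),
  option_code            = c("M", "F", "DD", "DC"),
  option_name            = c("Male", "Female", "Dead", "Discharged"),
  option_order_in_set    = c(1, 2, 1, 2),
  stringsAsFactors       = FALSE
)

# Toy coded data set; `age` has no dictionary entry and is left untouched.
dat <- data.frame(
  sex     = c("F", "M", "F"),
  outcome = c("DC", "DD", NA),
  age     = c(34, 5, 61),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace codes with labels for every dictionary variable.
translate_codes <- function(dat, dict) {
  vars <- intersect(names(dat), unique(dict$data_element_shortname))
  for (v in vars) {
    d <- dict[dict$data_element_shortname == v, ]
    d <- d[order(d$option_order_in_set), ]
    labels <- d$option_name[match(dat[[v]], d$option_code)]
    dat[[v]] <- factor(labels, levels = d$option_name)
  }
  dat
}

dat_clean <- translate_codes(dat, dict)
print(dat_clean)
```

Codes that do not appear in the dictionary, including missing values, become `NA` in this sketch.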
These functions produce MSF OCA dictionaries based on DHIS2 (for outbreaks) and Kobo (for surveys) data sets, defining the data element name, code, short names, types, and key/value pairs for translating the codes into a human-readable format.
msf_dict(
  disease,
  name = "MSF-outbreak-dict.xlsx",
  tibble = TRUE,
  compact = TRUE,
  long = TRUE
)

msf_dict_survey(
  disease,
  name = "MSF-survey-dict.xlsx",
  tibble = TRUE,
  compact = TRUE,
  long = TRUE,
  template = TRUE
)
disease |
Specify which disease you would like to use.
|
name |
The name of the dictionary stored in the package.
|
tibble |
Return data dictionary as a tidyverse tibble (default is TRUE) |
compact |
if |
long |
If @param template Only used for |
template |
(for survey dictionaries): if |
matchmaker::match_df()
gen_data()
msf_dict_survey()
if (require("dplyr") && require("matchmaker")) {
  withAutoprint({
    # You will often want to use MSF dictionaries to translate codes to
    # human-readable variables. Here, we generate a data set of 20 cases:
    dat <- gen_data(
      dictionary = "Cholera",
      varnames = "data_element_shortname",
      numcases = 20,
      org = "MSF"
    )
    print(dat)

    # We want the expanded dictionary, so we will select `compact = FALSE`
    dict <- msf_dict(disease = "Cholera", long = TRUE, compact = FALSE, tibble = TRUE)
    print(dict)

    # Now we can use matchmaker to filter the data:
    dat_clean <- matchmaker::match_df(dat, dict,
      from = "option_code",
      to = "option_name",
      by = "data_element_shortname",
      order = "option_order_in_set"
    )
    print(dat_clean)
  })
}
Helper for aligning your data to a standardised dictionary or your own dictionary.
msf_dict_rename_helper( disease, name, varnames = "data_element_shortname", varnames_type, rmd, template = TRUE, copy_to_clipboard = TRUE )
disease |
Specify which disease you would like to use. Currently supports "Cholera", "Measles", "Meningitis", "AJS", "Mortality", "Nutrition", "Vaccination_short" and "Vaccination_long". |
name |
The name of the dictionary stored in the package. By default, the
dictionaries from the package are used. However, you can also use
dictionaries not stored within this package; to use these,
specify |
varnames |
The name of the column that contains variable names. The
default is "data_element_shortname".
If |
varnames_type |
The name of the column that contains the variable type.
The default will use "data_element_valuetype" for DHIS2 and "type"
for Kobo dictionaries. If you specify your own dictionary then this needs to
be the same length as |
rmd |
The R Markdown template that you would like to compare against. By
default, the templates included in the package are used. However, you can also
use R Markdown files not stored within this package; to use these,
specify |
template |
If |
copy_to_clipboard |
if |
A dplyr command used to rename columns in your data frame according to the dictionary.
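To show the kind of renaming such a command performs, here is a minimal sketch that maps the columns of a raw data frame onto dictionary variable names. The data frame, the raw column names, the target names `age_years` and `sex`, and the helper `rename_columns()` are all invented for illustration. The actual output of msf_dict_rename_helper() depends on the chosen dictionary and template and is not reproduced here.

```r
# Hypothetical raw data with non-standard column names.
linelist_raw <- data.frame(
  Patient.Age    = c(34, 5),
  Sex.of.patient = c("F", "M"),
  stringsAsFactors = FALSE
)

# Mapping written as new_name = "old_name", the convention of dplyr::rename().
rename_map <- c(age_years = "Patient.Age", sex = "Sex.of.patient")

# Hypothetical base R helper that applies the mapping.
rename_columns <- function(dat, map) {
  missing_cols <- setdiff(map, names(dat))
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  names(dat)[match(map, names(dat))] <- names(map)
  dat
}

linelist_std <- rename_columns(linelist_raw, rename_map)
print(names(linelist_std))

# With dplyr installed, the same mapping can be applied via rename().
if (requireNamespace("dplyr", quietly = TRUE)) {
  via_dplyr <- dplyr::rename(linelist_raw, dplyr::all_of(rename_map))
  stopifnot(identical(names(via_dplyr), names(linelist_std)))
}
```

Failing loudly when a source column is absent is a deliberate choice in this sketch, since a silently skipped rename would leave the data misaligned with the dictionary.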