| Title: | Data Quality in Epidemiological Research |
|---|---|
| Description: | Data quality assessments guided by a 'data quality framework introduced by Schmidt and colleagues, 2021' <doi:10.1186/s12874-021-01252-7> target the data quality dimensions integrity, completeness, consistency, and accuracy. The scope of applicable functions rests on the availability of extensive metadata which can be provided in spreadsheet tables. Either standardized (e.g. as 'html5' reports) or individually tailored reports can be generated. For an introduction into the specification of corresponding metadata, please refer to the 'package website' <https://dataquality.qihs.uni-greifswald.de/VIN_Annotation_of_Metadata.html>. |
| Authors: | University Medicine Greifswald [cph], Elisa Kasbohm [aut] (ORCID: <https://orcid.org/0000-0001-5261-538X>), Elena Salogni [aut] (ORCID: <https://orcid.org/0009-0007-3767-7145>), Joany Marino [aut] (ORCID: <https://orcid.org/0000-0002-4657-3758>), Adrian Richter [aut] (ORCID: <https://orcid.org/0000-0002-3372-2021>), Carsten Oliver Schmidt [aut] (ORCID: <https://orcid.org/0000-0001-5266-9396>), Stephan Struckmann [aut, cre] (ORCID: <https://orcid.org/0000-0002-8565-7962>), German Research Foundation (DFG SCHM 2744/3-1, SCHM 2744/9-1, SCHM 2744/3-4) [fnd], National Research Data Infrastructure for Personal Health Data: (NFDI 13/1) [fnd], European Union’s Horizon 2020 programme (euCanSHare, grant agreement No. 825903) [fnd] |
| Maintainer: | Stephan Struckmann <[email protected]> |
| License: | BSD_2_clause + file LICENSE |
| Version: | 2.8.9 |
| Built: | 2026-05-11 22:09:21 UTC |
| Source: | https://gitlab.com/libreumg/dataquier |
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 - e2
e1 |
first argument |
e2 |
second argument |
result
Get a subset of a dataquieR dq_report2 report
## S3 method for class 'dataquieR_resultset2'
x[row, col, res, drop = FALSE, els = row, as_raw = FALSE]
x |
the report |
row |
the variable names, must be unique |
col |
the function-call-names, must be unique |
res |
the result slot, must be unique |
drop |
drop, if length is 1 |
els |
used, if in list-mode with named argument |
as_raw |
retrieve the result as stored, possibly still compressed |
a list with results. Depending on drop and the number of results,
the list may contain all requested results in sub-lists. The order
of the results follows the order of the row/column/result names given.
Get a single result from a dataquieR 2 report
## S3 method for class 'dataquieR_resultset2'
x[[el]]
x |
the report |
el |
the index |
the dataquieR result object
Set a single result in a dataquieR 2 report
## S3 replacement method for class 'dataquieR_resultset2'
x[[el]] <- value
x |
the report |
el |
the index |
value |
the single result |
the dataquieR result object
Overwriting of elements is only supported list-wise
## S3 replacement method for class 'dataquieR_resultset2'
x[...] <- value
x |
a dataquieR_resultset2 |
... |
if this contains only one entry and this entry is not named
or its name is |
value |
new value to write |
nothing, stops
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 * e2
e1 |
first argument |
e2 |
second argument |
result
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 / e2
e1 |
first argument |
e2 |
second argument |
result
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 %/% e2
e1 |
first argument |
e2 |
second argument |
result
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 %% e2
e1 |
first argument |
e2 |
second argument |
result
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 ^ e2
e1 |
first argument |
e2 |
second argument |
result
Operator caring for units
## S3 method for class 'numeric_with_unit'
e1 + e2
e1 |
first argument |
e2 |
second argument |
result
Access single results from a dataquieR_resultset2 report
## S3 method for class 'dataquieR_resultset2'
x$el
x |
the report |
el |
the index |
the dataquieR result object
Write single results from a dataquieR_resultset2 report
## S3 replacement method for class 'dataquieR_resultset2'
x$el <- value
x |
the report |
el |
the index |
value |
the single result |
the dataquieR result object
This function creates distribution plots for categorical variables.
acc_cat_distributions(
  resp_vars = NULL,
  group_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  n_cat_max = getOption("dataquieR.max_cat_resp_var_levels_in_plot",
    dataquieR.max_cat_resp_var_levels_in_plot_default),
  n_group_max = getOption("dataquieR.max_group_var_levels_in_plot",
    dataquieR.max_group_var_levels_in_plot_default),
  n_data_min = getOption("dataquieR.min_time_points_for_cat_resp_var",
    dataquieR.min_time_points_for_cat_resp_var_default)
)
resp_vars |
variable the name of the measurement variable |
group_vars |
variable the name of the observer, device or reader variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
n_cat_max |
maximum number of categories to be displayed individually
for the categorical variable ( |
n_group_max |
maximum number of categories to be displayed individually
for the grouping variable ( |
n_data_min |
minimum number of data points to create a time course plot
for an individual category of the |
To complete
A list with:
SummaryPlot: ggplot2::ggplot for the response variable in
resp_vars.
Data quality indicator checks "Unexpected location" and "Unexpected proportion" with histograms.
acc_distributions(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  check_param = c("any", "location", "proportion"),
  plot_ranges = TRUE,
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the names of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
check_param |
enum any | location | proportion. Which type of check should be conducted (if possible): a check on the location of the mean or median value of the study data, a check on proportions of categories, or either of them if the necessary metadata is available. |
plot_ranges |
logical Should the plot show ranges and results from the data quality checks? (default: TRUE) |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
A list with:
SummaryTable: data.frame containing data quality checks for
"Unexpected location" (FLG_acc_ud_loc) and "Unexpected
proportion" (FLG_acc_ud_prop) for each response
variable in resp_vars.
SummaryData: a data.frame containing data quality checks for
"Unexpected location" and / or "Unexpected proportion"
for a report
SummaryPlotList: list of ggplot2::ggplots for each response variable in
resp_vars.
If no response variable is defined, select all variables of type float or integer in the study data.
Remove missing codes from the study data (if defined in the metadata).
Remove measurements deviating from (hard) limits defined in the metadata (if defined).
Exclude variables containing only NA or only one unique value (excluding
NAs).
Perform check for "Unexpected location" if defined in the metadata (needs a LOCATION_METRIC (mean or median) and LOCATION_RANGE (range of expected values for the mean and median, respectively)).
Perform check for "Unexpected proportion" if defined in the metadata (needs PROPORTION_RANGE (range of expected values for the proportions of the categories)).
Plot histogram(s).
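The preparation and location-check steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helpers `prepare_values()` and `check_location()` are hypothetical, the measurements are made up, and the expected range is passed as a plain numeric vector rather than read from the LOCATION_RANGE metadata column.

```r
# Hypothetical helper (not part of dataquieR): replace missing codes and
# values outside the hard limits by NA.
prepare_values <- function(x, missing_codes = numeric(0),
                           hard_limits = c(-Inf, Inf)) {
  x[x %in% missing_codes] <- NA
  out_of_limits <- !is.na(x) & (x < hard_limits[1] | x > hard_limits[2])
  x[out_of_limits] <- NA
  x
}

# Hypothetical helper (not part of dataquieR): TRUE if the location
# (mean or median) lies outside the expected range, NA if the variable
# is empty or constant and therefore cannot be checked.
check_location <- function(x, metric = c("mean", "median"), range) {
  metric <- match.arg(metric)
  x <- x[!is.na(x)]
  if (length(unique(x)) < 2) {
    return(NA)
  }
  loc <- if (metric == "mean") mean(x) else stats::median(x)
  loc < range[1] || loc > range[2]
}

# Made-up systolic blood pressure values; 99980 is an assumed missing code
# and 400 violates the assumed hard limits of 50 to 300.
sbp <- c(118, 122, 131, 127, 99980, 140, 135, 400)
sbp_clean <- prepare_values(sbp, missing_codes = 99980,
                            hard_limits = c(50, 300))

flag_mean   <- check_location(sbp_clean, "mean",   range = c(120, 135))
flag_median <- check_location(sbp_clean, "median", range = c(100, 120))
```

Here `flag_mean` is `FALSE` because the mean of the cleaned values lies inside the assumed range, while `flag_median` is `TRUE` because the median lies above it.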
Data quality indicator checks "Unexpected location" and "Unexpected proportion" if a grouping variable is included: Plots of empirical cumulative distributions for the subgroups.
acc_distributions_ecdf(
  resp_vars = NULL,
  group_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  n_group_max = getOption("dataquieR.max_group_var_levels_in_plot",
    dataquieR.max_group_var_levels_in_plot_default),
  n_obs_per_group_min = getOption("dataquieR.min_obs_per_group_var_in_plot",
    dataquieR.min_obs_per_group_var_in_plot_default)
)
resp_vars |
variable list the names of the measurement variables |
group_vars |
variable list the name of the observer, device or reader variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
n_group_max |
maximum number of categories to be displayed individually
for the grouping variable ( |
n_obs_per_group_min |
minimum number of data points per group to create
a graph for an individual category of the |
A list with:
SummaryPlotList: list of ggplot2::ggplots for each response variable in
resp_vars.
Data quality indicator checks "Unexpected location" and "Unexpected proportion" with histograms.
acc_distributions_loc(
  resp_vars = NULL,
  study_data,
  label_col = VAR_NAMES,
  item_level = "item_level",
  check_param = "location",
  plot_ranges = TRUE,
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the names of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
check_param |
enum any | location | proportion. Which type of check should be conducted (if possible): a check on the location of the mean or median value of the study data, a check on proportions of categories, or either of them if the necessary metadata is available. |
plot_ranges |
logical Should the plot show ranges and results from the data quality checks? (default: TRUE) |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
A list with:
SummaryTable: data.frame containing data quality checks for
"Unexpected location" (FLG_acc_ud_loc) and "Unexpected
proportion" (FLG_acc_ud_prop) for each response
variable in resp_vars.
SummaryData: a data.frame containing data quality checks for
"Unexpected location" and / or "Unexpected proportion"
for a report
SummaryPlotList: list of ggplot2::ggplots for each response variable in
resp_vars.
If no response variable is defined, select all variables of type float or integer in the study data.
Remove missing codes from the study data (if defined in the metadata).
Remove measurements deviating from (hard) limits defined in the metadata (if defined).
Exclude variables containing only NA or only one unique value (excluding
NAs).
Perform check for "Unexpected location" if defined in the metadata (needs a LOCATION_METRIC (mean or median) and LOCATION_RANGE (range of expected values for the mean and median, respectively)).
Perform check for "Unexpected proportion" if defined in the metadata (needs PROPORTION_RANGE (range of expected values for the proportions of the categories)).
Plot histogram(s).
acc_distributions_only(
  resp_vars = NULL,
  study_data,
  label_col = VAR_NAMES,
  item_level = "item_level",
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the names of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
A list with:
SummaryTable: data.frame containing data quality checks for
"Unexpected location" (FLG_acc_ud_loc) and "Unexpected
proportion" (FLG_acc_ud_prop) for each response
variable in resp_vars.
SummaryData: a data.frame containing data quality checks for
"Unexpected location" and / or "Unexpected proportion"
for a report
SummaryPlotList: list of ggplot2::ggplots for each response variable in
resp_vars.
If no response variable is defined, select all variables of type float or integer in the study data.
Remove missing codes from the study data (if defined in the metadata).
Remove measurements deviating from (hard) limits defined in the metadata (if defined).
Exclude variables containing only NA or only one unique value (excluding
NAs).
Perform check for "Unexpected location" if defined in the metadata (needs a LOCATION_METRIC (mean or median) and LOCATION_RANGE (range of expected values for the mean and median, respectively)).
Perform check for "Unexpected proportion" if defined in the metadata (needs PROPORTION_RANGE (range of expected values for the proportions of the categories)).
Plot histogram(s).
Data quality indicator checks "Unexpected location" and "Unexpected proportion" with histograms.
acc_distributions_prop(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  check_param = "proportion",
  plot_ranges = TRUE,
  flip_mode = "noflip",
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the names of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
check_param |
enum any | location | proportion. Which type of check should be conducted (if possible): a check on the location of the mean or median value of the study data, a check on proportions of categories, or either of them if the necessary metadata is available. |
plot_ranges |
logical Should the plot show ranges and results from the data quality checks? (default: TRUE) |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
A list with:
SummaryTable: data.frame containing data quality checks for
"Unexpected location" (FLG_acc_ud_loc) and "Unexpected
proportion" (FLG_acc_ud_prop) for each response
variable in resp_vars.
SummaryData: a data.frame containing data quality checks for
"Unexpected location" and / or "Unexpected proportion"
for a report
SummaryPlotList: list of ggplot2::ggplots for each response variable in
resp_vars.
If no response variable is defined, select all variables of type float or integer in the study data.
Remove missing codes from the study data (if defined in the metadata).
Remove measurements deviating from (hard) limits defined in the metadata (if defined).
Exclude variables containing only NA or only one unique value (excluding
NAs).
Perform check for "Unexpected location" if defined in the metadata (needs a LOCATION_METRIC (mean or median) and LOCATION_RANGE (range of expected values for the mean and median, respectively)).
Perform check for "Unexpected proportion" if defined in the metadata (needs PROPORTION_RANGE (range of expected values for the proportions of the categories)).
Plot histogram(s).
This implementation contrasts the empirical distribution of a measurement variable against assumed distributions. The approach is adapted from the idea of rootograms (Tukey (1977)), which is also applicable to count data (Kleiber and Zeileis (2016)).
acc_end_digits(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable the names of the measurement variables, mandatory |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
a list with:
SummaryTable: data.frame with the columns Variables and FLG_acc_ud_shape
SummaryPlot: ggplot2 distribution plot comparing expected
with observed distribution
This implementation is restricted to data of type float or integer.
Missing codes are removed from resp_vars (if defined in the metadata)
The user must specify the column of the metadata containing probability distribution (currently only: normal, uniform, gamma)
Parameters of each distribution can be estimated from the data or are specified by the user
A histogram-like plot contrasts the empirical vs. the technical distribution
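The comparison of an empirical with an assumed distribution can be sketched in base R as follows. This is a minimal illustration, not the package's code: the data are simulated, a normal distribution is assumed, its parameters are estimated from the data, and the column names of `comparison` are chosen for this example only.

```r
set.seed(1)
x <- rnorm(500, mean = 50, sd = 10)      # simulated measurements

# Empirical distribution: bin counts of a histogram (not plotted here).
h <- hist(x, breaks = 20, plot = FALSE)

# Assumed distribution: normal, with parameters estimated from the data.
mu    <- mean(x)
sigma <- sd(x)

# Expected count per bin = n * probability mass of the bin under the
# assumed distribution.
expected <- length(x) * diff(pnorm(h$breaks, mean = mu, sd = sigma))

comparison <- data.frame(
  mid      = h$mids,
  observed = h$counts,
  expected = expected
)

# Rootogram idea: compare on the square-root scale, which stabilises the
# variance of counts and makes deviations in sparse bins visible.
comparison$sqrt_diff <- sqrt(comparison$observed) - sqrt(comparison$expected)
```

Large absolute values of `sqrt_diff` point to bins where the observed counts deviate from what the assumed distribution predicts.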
The following R implementation executes calculations for the quality indicator "Unexpected location". Local regression (LOESS) is a versatile statistical method for exploring the averaged course of time series measurements (Cleveland, Devlin, and Grosse 1988). In the context of epidemiological data, repeated measurements using the same measurement device or by the same examiner can be considered a time series. LOESS makes it possible to explore changes in these measurements over time.
acc_loess(
  resp_vars,
  group_vars = NULL,
  time_vars,
  co_vars = NULL,
  study_data,
  label_col = VAR_NAMES,
  item_level = "item_level",
  min_obs_in_subgroup = getOption("dataquieR.acc_loess.min_obs_in_subgroup",
    dataquieR.acc_loess.min_obs_in_subgroup_default),
  resolution = 80,
  comparison_lines = list(type = c("mean/sd", "quartiles"), color = "grey30",
    linetype = 2, sd_factor = 0.5),
  mark_time_points = getOption("dataquieR.acc_loess.mark_time_points",
    dataquieR.acc_loess.mark_time_points_default),
  plot_observations = getOption("dataquieR.acc_loess.plot_observations",
    dataquieR.acc_loess.plot_observations_default),
  plot_format = getOption("dataquieR.acc_loess.plot_format",
    dataquieR.acc_loess.plot_format_default),
  meta_data = item_level,
  meta_data_v2,
  n_group_max = getOption("dataquieR.max_group_var_levels_in_plot",
    dataquieR.max_group_var_levels_in_plot_default),
  enable_GAM = getOption("dataquieR.GAM_for_LOESS",
    dataquieR.GAM_for_LOESS_default),
  exclude_constant_subgroups = getOption("dataquieR.acc_loess.exclude_constant_subgroups",
    dataquieR.acc_loess.exclude_constant_subgroups_default),
  min_bandwidth = getOption("dataquieR.acc_loess.min_bw",
    dataquieR.acc_loess.min_bw_default),
  min_proportion = getOption("dataquieR.acc_loess.min_proportion",
    dataquieR.acc_loess.min_proportion_default)
)
resp_vars |
variable the name of the continuous measurement variable |
group_vars |
variable the name of the observer, device or reader variable |
time_vars |
variable the name of the variable giving the time of measurement |
co_vars |
variable list a vector of covariables for adjustment, for example age and sex. Can be NULL (default) for no adjustment. |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
min_obs_in_subgroup |
integer (optional argument) If |
resolution |
numeric the maximum number of time points used for plotting the trend lines |
comparison_lines |
list type and style of lines with which trend
lines are to be compared. Can be mean +/- 0.5
standard deviation (the factor can be specified
differently in |
mark_time_points |
logical mark time points with observations (caution, there may be many marks) |
plot_observations |
logical show observations as scatter plot in the
background. If there are |
plot_format |
enum AUTO | COMBINED | FACETS | BOTH. Return the plot
as one combined plot for all groups or as
facet plots (one figure per group). |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
n_group_max |
integer maximum number of categories to be displayed
individually for the grouping variable ( |
enable_GAM |
logical Can LOESS computations be replaced by generalized additive models to reduce memory consumption for large datasets? |
exclude_constant_subgroups |
logical Should subgroups with constant values be excluded? |
min_bandwidth |
numeric lower limit for the LOESS bandwidth, should be greater than 0 and less than or equal to 1. In general, increasing the bandwidth leads to a smoother trend line. |
min_proportion |
numeric lower limit for the proportion of the smaller group (cases or controls) for creating a LOESS figure, should be greater than 0 and less than 0.4. |
If mark_time_points or plot_observations is selected, but would result in
plotting more than 400 points, only a sample of the data will be displayed.
Limitations
The application of LOESS requires model fitting, i.e. the smoothness
of a model is subject to a smoothing parameter (span).
Particularly in the presence of interval-based missing data, high
variability of measurements combined with a low number of
observations in one level of the group_vars may distort the fit.
Since our approach handles data without knowledge
of such underlying characteristics, finding the best fit is complicated if
computational costs should be minimal. The default of
LOESS in R uses a span of 0.75, which provides in most cases reasonable fits.
The function acc_loess adapts the span for each level of the group_vars
(with at least as many observations as specified in min_obs_in_subgroup
and with at least three time points) based on the respective
number of observations.
LOESS consumes a lot of memory for larger datasets. That is why acc_loess
switches to a generalized additive model with integrated smoothness
estimation (gam by mgcv) if there are 1000 observations or more for
at least one level of the group_vars (similar to geom_smooth
from ggplot2).
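The idea of one smoothed trend per level of the grouping variable can be sketched with `stats::loess()`. This is a simplified illustration on simulated data, not what `acc_loess` does internally: it uses the fixed default span of 0.75 instead of an adapted span, it does not switch to a generalized additive model, and the data frame `dat`, its columns, and the threshold `min_obs` are assumptions made for this example.

```r
set.seed(42)

# Simulated repeated measurements of two examiners over 60 time points;
# examiner "B" drifts upwards over time.
dat <- data.frame(
  time     = rep(1:60, times = 2),
  examiner = rep(c("A", "B"), each = 60)
)
drift <- ifelse(dat$examiner == "B", 0.15, 0.05)
dat$value <- 100 + drift * dat$time + rnorm(nrow(dat), sd = 2)

min_obs <- 30   # assumed minimum number of observations per subgroup

# One LOESS fit per examiner, evaluated on a grid of at most 80 time points.
fits <- lapply(split(dat, dat$examiner), function(d) {
  if (nrow(d) < min_obs || length(unique(d$time)) < 3) {
    return(NULL)                      # too little data for a stable fit
  }
  fit  <- stats::loess(value ~ time, data = d, span = 0.75)
  grid <- data.frame(time = seq(min(d$time), max(d$time), length.out = 80))
  grid$fitted <- predict(fit, newdata = grid)
  grid
})
```

Plotting `fitted` against `time` for each element of `fits` would show the two trend lines diverging, which is the kind of examiner effect the indicator is meant to reveal.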
a list with:
SummaryPlotList: list with two plots if plot_format = "BOTH",
otherwise one of the two figures described below:
Loess_fits_facets: The plot contains LOESS-smoothed curves
for each level of the group_vars in a separate panel. Added trend
lines represent mean and standard deviation or quartiles (specified
in comparison_lines) for moving windows over the whole data.
Loess_fits_combined: This plot combines all curves into one
panel. Given a low number of levels in the group_vars, this plot
eases comparisons. However, if the number increases this plot may
be too crowded and unclear.
A standard tool to calculate Mahalanobis distances.
In this approach, the squared Mahalanobis distance is calculated for ordinal
variables (treated as continuous) to identify inattentive responses.
It calculates the distance of each observational unit from the sample mean.
The greater the distance, the more atypical the responses.
acc_mahalanobis(
  variable_group = NULL,
  study_data,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_cross_item = "cross-item_level",
  label_col = VAR_NAMES,
  meta_data_v2,
  cross_item_level,
  `cross-item_level`,
  mahalanobis_threshold = suppressWarnings(as.numeric(getOption("dataquieR.MAHALANOBIS_THRESHOLD",
    dataquieR.MAHALANOBIS_THRESHOLD_default)))
)
variable_group |
variable list the names of the variables used to
calculate the |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_cross_item |
data.frame – Cross-item level metadata |
label_col |
variable attribute the name of the column in the metadata containing the labels of the variables |
meta_data_v2 |
character path or file name of the workbook like
metadata file, see
|
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
mahalanobis_threshold |
numeric the confidence level to use to define
|
a list with:
SummaryTable: data.frame underlying the plot
SummaryData: data.frame underlying the plot with speaking column labels
SummaryPlot: ggplot2::ggplot Q-Q plot of squared Mahalanobis
distances vs. a theoretical
chi-squared distribution showing outliers.
FlaggedStudyData: data.frame contains the original data frame of the
variables used to calculate
the squared Mahalanobis distances
with the additional column,
containing the squared
Mahalanobis distance, and a column
called MD_outliers, that contains
1 if the observational unit is considered
a multivariate outlier.
Implementation is restricted to variables of type integer
Remove missing codes from the study data (if defined in the metadata)
The covariance matrix is estimated for all variables from variable_group
The Mahalanobis distance of each observation is calculated
The default to consider a value an outlier is to use the 0.975 quantile
of a theoretical chi-square distribution with degrees of freedom
equal to the number of variables used to calculate the
Mahalanobis distance (Mayrhofer and Filzmoser, 2023)
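The calculation described above can be sketched with `stats::mahalanobis()` and `stats::qchisq()`. This is an illustration on simulated questionnaire items, not the package's implementation: the item names and the inserted inattentive response pattern are made up, and only the column name `MD_outliers` is taken from the description of the return value.

```r
set.seed(7)
n <- 200

# Four correlated ordinal items on a 1 to 5 scale, driven by one latent trait.
latent <- rnorm(n)
items <- sapply(1:4, function(j) {
  pmin(5, pmax(1, round(3 + latent + rnorm(n, sd = 0.5))))
})

# Prepend one inattentive-looking response that alternates between extremes.
items <- rbind(c(1, 5, 1, 5), items)
colnames(items) <- paste0("item_", 1:4)   # hypothetical variable names

# Squared Mahalanobis distance of each observation from the sample mean,
# using the covariance matrix estimated from all variables.
md2 <- stats::mahalanobis(items, center = colMeans(items),
                          cov = stats::cov(items))

# Default cut-off: 0.975 quantile of a chi-square distribution with as many
# degrees of freedom as there are variables.
threshold <- stats::qchisq(0.975, df = ncol(items))

MD_outliers <- as.integer(md2 > threshold)
```

With four variables the threshold is about 11.14. The alternating response in the first row contradicts the strong positive correlation between the items and is therefore flagged.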
Please use instead the function acc_mahalanobis()
acc_mahalanobis_ratio(
  resp_vars = NULL,
  study_data,
  label_col = VAR_NAMES,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  meta_data_cross_item = "cross-item_level",
  cross_item_level,
  `cross-item_level`
)
resp_vars |
variable the names of the computed variable
containing |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata containing the labels of the variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path or file name of the workbook like
metadata file, see
|
meta_data_cross_item |
data.frame – Cross-item level metadata |
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
a list with:
SummaryData: data.frame underlying the plot with user friendly caption
SummaryTable: data.frame underlying the plot
SummaryPlot: ggplot2::ggplot Q-Q plot of squared Mahalanobis
distances vs. a theoretical
chi-squared distribution showing outliers.
FlaggedStudyData: data.frame containing the original data of the
variables used to calculate
the squared Mahalanobis distances,
with an additional column indicating,
for a group of variables, whether the
observational unit is a
multivariate outlier.
Implementation is restricted to variables of type integer
Remove missing codes from the study data (if defined in the metadata)
The covariance matrix is estimated for all variables of resp_vars
The Mahalanobis distance of each observation is calculated
By default, a value is considered an outlier if it exceeds the 0.975 quantile
of a theoretical chi-square distribution with degrees of freedom
equal to the number of variables used to calculate the
Mahalanobis distance (Mayrhofer and Filzmoser, 2023)
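The steps above can be sketched with base R alone. This is a minimal illustration under stated assumptions, not the package's implementation: the data, the variable names v1 to v3 and the planted observation are made up, and missing codes are assumed to have been removed already.

```r
# Illustrative data: three numeric variables (hypothetical names)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
dat <- data.frame(
  v1 = x1,
  v2 = 0.6 * x1 + rnorm(n, sd = 0.8),
  v3 = rnorm(n)
)
# Plant one observation that is implausible given the correlation of v1 and v2
dat[1, ] <- c(4, -4, 0)

# Covariance matrix and squared Mahalanobis distance of each observation
centre <- colMeans(dat)
sigma  <- cov(dat)
d2     <- mahalanobis(dat, center = centre, cov = sigma)

# 0.975 quantile of a chi-square distribution, df = number of variables
cutoff  <- qchisq(0.975, df = ncol(dat))
flagged <- d2 > cutoff

sum(flagged)   # number of observations above the cutoff
flagged[1]     # was the planted observation detected?
```

Note that `stats::mahalanobis()` already returns the squared distance, which is the quantity compared against the chi-square quantile.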
This function examines the impact of so-called process variables on a measurement variable. This implementation combines a descriptive and a model-based approach. Process variables that can be considered in this implementation must be categorical. It is currently not possible to consider more than one process variable within one function call. The measurement variable can be adjusted for (multiple) covariables, such as age or sex, for example.
Marginal means rest on model-based results, i.e. whether a marginal mean differs significantly depends on sample size. Particularly in large studies, small and irrelevant differences may become significant. The contrary holds if sample size is low.
acc_margins( resp_vars = NULL, group_vars = NULL, co_vars = NULL, study_data, label_col, item_level = "item_level", threshold_type = "empirical", threshold_value, min_obs_in_subgroup = 5, min_obs_in_cat = 5, dichotomize_categorical_resp = TRUE, cut_off_linear_model_for_ord = 10, meta_data = item_level, meta_data_v2, sort_group_var_levels = getOption("dataquieR.acc_margins_sort", dataquieR.acc_margins_sort_default), include_numbers_in_figures = getOption("dataquieR.acc_margins_num", dataquieR.acc_margins_num_default), n_violin_max = getOption("dataquieR.max_group_var_levels_with_violins", dataquieR.max_group_var_levels_with_violins_default), no_overall_in_bin = getOption("dataquieR.no_overall_in_bin", dataquieR.no_overall_in_bin_default), no_geom_count_in_bin = getOption("dataquieR.no_geom_count_in_bin", dataquieR.no_geom_count_in_bin_default) )
resp_vars |
variable the name of the measurement variable |
group_vars |
variable list len=1-1. the name of the observer, device or reader variable |
co_vars |
variable list a vector of covariables, e.g. age and sex for adjustment |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
threshold_type |
enum empirical | user | none. In case |
threshold_value |
numeric a multiplier or absolute value (see
|
min_obs_in_subgroup |
integer from=0. This optional argument specifies
the minimum number of observations that is required to
include a subgroup (level) of the |
min_obs_in_cat |
integer This optional argument specifies the minimum
number of observations that is required to include
a category (level) of the outcome ( |
dichotomize_categorical_resp |
logical Should nominal response variables always be transformed to binary variables? |
cut_off_linear_model_for_ord |
integer from=0. This optional argument
specifies the minimum number of observations for
individual levels of an ordinal outcome ( |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
sort_group_var_levels |
logical Should the levels of the grouping variable be sorted descending by the number of observations? Note that ordinal grouping variables will not be reordered. |
include_numbers_in_figures |
logical Should the figure report the number of observations for each level of the grouping variable? |
n_violin_max |
integer from=0. This optional argument specifies
the maximum number of levels of the |
no_overall_in_bin |
logical Suppress overall distribution in 'margins' figures for binary outcomes |
no_geom_count_in_bin |
logical Suppress counts in 'margins' figures for binary outcomes, so that they do not always include 0 and 1. |
Limitations
Selecting the appropriate distribution is complex. Dozens of continuous,
discrete or mixed distributions are conceivable in the context of
epidemiological data. Their exact exploration is beyond the scope of this
data quality approach. The present function uses the helper function
util_dist_selection, the assigned SCALE_LEVEL and the DATA_TYPE
to discriminate the following cases:
continuous data
binary data
count data with <= 20 distinct values
count data with > 20 distinct values (treated as continuous)
nominal data
ordinal data
Continuous data and count data with more than 20 distinct values are analyzed
by linear models. Count data with up to 20 distinct values are modeled by a
Poisson regression. For binary data, the implementation uses logistic
regression.
Nominal response variables will either be transformed to binary variables or
analyzed by multinomial logistic regression models. The latter option is only
available if the argument dichotomize_categorical_resp is set to FALSE
and if the package nnet is installed. The transformation to a binary
variable can be user-specified using the metadata columns RECODE_CASES
and/or RECODE_CONTROL. Otherwise, the most frequent category will be
assigned to cases and the remaining categories to control.
For ordinal response variables, the argument cut_off_linear_model_for_ord
controls whether the data is analyzed in the same way as continuous data:
If every level of the variable has at least as many observations as specified
in the argument, the data will be analyzed by a linear model. Otherwise,
the data will be modeled by an ordered regression, if the package ordinal
is installed.
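The mapping from outcome type to regression model described above can be sketched with base R. The helper `fit_by_kind()`, its `kind` argument and the simulated data are assumptions for illustration only; the package derives the case from the metadata and util_dist_selection, which is not reproduced here.

```r
# Hypothetical helper: choose a regression model from the kind of outcome.
# `kind` mimics three of the cases listed above; it is not a dataquieR argument.
fit_by_kind <- function(formula, data, kind) {
  switch(
    kind,
    continuous = lm(formula, data = data),
    count      = glm(formula, data = data, family = poisson()),
    binary     = glm(formula, data = data, family = binomial()),
    stop("unsupported kind: ", kind)
  )
}

# Simulated study data with an examiner as process variable and age as covariable
set.seed(42)
n <- 300
d <- data.frame(
  examiner = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  age      = round(runif(n, 20, 80))
)
d$sbp      <- 110 + 0.4 * d$age + 5 * (d$examiner == "C") + rnorm(n, sd = 10)
d$n_visits <- rpois(n, lambda = 2)
d$smoker   <- rbinom(n, size = 1, prob = 0.3)

m_cont  <- fit_by_kind(sbp ~ examiner + age, d, "continuous")
m_count <- fit_by_kind(n_visits ~ examiner + age, d, "count")
m_bin   <- fit_by_kind(smoker ~ examiner + age, d, "binary")

# Examiner effects adjusted for age, on the scale of the linear model
coef(m_cont)[c("examinerB", "examinerC")]
```

Multinomial and ordered regression are left out of the sketch because they need the additional packages nnet and ordinal.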
a list with:
SummaryTable: data.frame underlying the plot
ResultData: data.frame
SummaryPlot: ggplot2::ggplot() margins plot
A standard tool to detect multivariate outliers is the Mahalanobis distance. This approach is very helpful for the interpretation of the plausibility of a measurement given the value of another. In this approach the Mahalanobis distance is used as a univariate measure itself. We apply the same rules for the identification of outliers as for univariate outliers:
the classical approach from Tukey: 1.5 × IQR from the
1st (Q25) or 3rd (Q75) quartile.
the 3SD approach, i.e. any measurement of the Mahalanobis
distance not in the interval of mean ± 3 × SD is considered an
outlier.
the approach from Hubert for skewed distributions which is embedded in the R package robustbase
a completely heuristic approach named sigma-gap.
For further details, please see the vignette for univariate outliers.
acc_multivariate_outlier( variable_group = NULL, id_vars = NULL, label_col = VAR_NAMES, study_data, item_level = "item_level", n_rules = 4, max_non_outliers_plot = 10000, criteria = c("tukey", "3sd", "hubert", "sigmagap"), meta_data = item_level, meta_data_v2, scale = getOption("dataquieR.acc_multivariate_outlier.scale", dataquieR.acc_multivariate_outlier.scale_default), multivariate_outlier_check = TRUE )
variable_group |
variable list the names of the continuous measurement variables forming a group for which multivariate outliers make sense. |
id_vars |
variable optional, an ID variable of the study data. If not specified row numbers are used. |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
n_rules |
numeric from=1 to=4. the no. of rules that must be violated to classify as outlier |
max_non_outliers_plot |
integer from=0. Maximum number of non-outlier points to be plotted. If more points exist, only a subsample will be plotted. Note that sampling is not deterministic. |
criteria |
set tukey | 3SD | hubert | sigmagap. a vector with methods to be used for detecting outliers. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
scale |
logical Should min-max-scaling be applied per variable? |
multivariate_outlier_check |
logical Should the check actually be performed? For pipeline use only. |
a list with:
SummaryTable: data.frame underlying the plot
SummaryPlot: ggplot2::ggplot2 outlier plot
FlaggedStudyData: data.frame that contains the original data frame with the additional columns tukey, 3SD, hubert, and sigmagap. Every observation is coded 0 if no outlier was detected in the respective column and 1 if an outlier was detected. This can be used to exclude observations with outliers.
Implementation is restricted to variables of type float
Remove missing codes from the study data (if defined in the metadata)
The covariance matrix is estimated for all variables from variable_group
The Mahalanobis distance of each observation is calculated
The four rules mentioned above are applied on this distance for each observation in the study data
An output data frame is generated that flags each outlier
A parallel coordinate plot indicates respective outliers
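A reduced version of this algorithm can be sketched in base R. Only two of the four rules (Tukey and 3SD) are shown, the data and the variable names a and b are made up, and `n_rules` is set to 2 here because only two rules are computed. It is an illustration under these assumptions, not the package's code.

```r
set.seed(7)
n   <- 150
a   <- rnorm(n, mean = 50, sd = 10)
grp <- data.frame(a = a, b = 0.8 * a + rnorm(n, sd = 4))
grp[n, ] <- c(80, 10)   # planted multivariate outlier

# Mahalanobis distance of each observation from the group centroid
md <- sqrt(mahalanobis(grp, colMeans(grp), cov(grp)))

# Tukey rule applied to the distance, treated as a univariate measure
q     <- quantile(md, c(0.25, 0.75))
iqr   <- q[[2]] - q[[1]]
tukey <- md < q[[1]] - 1.5 * iqr | md > q[[2]] + 1.5 * iqr

# 3SD rule applied to the distance
sd3 <- abs(md - mean(md)) > 3 * sd(md)

# Flag table in the spirit of FlaggedStudyData (0 = no outlier, 1 = outlier)
flags <- data.frame(tukey = as.integer(tukey), `3SD` = as.integer(sd3),
                    check.names = FALSE)

n_rules    <- 2   # both illustrated rules must be violated
is_outlier <- rowSums(flags) >= n_rules
which(is_outlier)
```

Requiring several rules to agree makes the flag more conservative; lowering `n_rules` flags more observations.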
List function.
A classical but still popular approach to detect univariate outliers is the
boxplot method introduced by Tukey 1977. The boxplot is a simple graphical
tool to display information about continuous univariate data (e.g., median,
lower and upper quartile). Outliers are defined as values deviating more
than 1.5 × IQR from the 1st (Q25) or 3rd (Q75) quartile. The
strength of Tukey's method is that it makes no distributional assumptions
and thus is also applicable to skewed or non mound-shaped data
(Marsh and Seo, 2006). Nevertheless, this method tends to identify frequent
measurements which are falsely interpreted as true outliers.
A somewhat more conservative approach in terms of symmetric and/or normal
distributions is the 3SD approach, i.e. any measurement not in
the interval of mean ± 3 × SD is considered an outlier.
Both methods mentioned above are not ideally suited to skewed distributions.
As many biomarkers such as laboratory measurements follow skewed
distributions, the methods above may be insufficient. The approach of Hubert
and Vandervieren 2008 adjusts the boxplot for the skewness of the
distribution. This approach is implemented in several R packages such as
robustbase::mc which is used in this implementation of dataquieR.
Another completely heuristic approach is also included to identify outliers. The approach is based on the assumption that the distances between measurements of the same underlying distribution should be homogeneous. For comprehension of this approach:
consider an ordered sequence of all measurements.
between these measurements all distances are calculated.
the occurrence of larger distances between two neighboring measurements
may then indicate a distortion of the data. For the heuristic definition of a
large distance, 1 × σ (one standard deviation) has been chosen.
Note that the plots are not deterministic, because they use ggplot2::geom_jitter.
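The gap heuristic described above can be sketched as follows. The function `sigma_gap()` and its argument `k` are hypothetical, and treating every value beyond a large gap (seen from the median) as an outlier is an assumption of this sketch; the package's own rule may differ in detail.

```r
# Hypothetical sketch of a sigma-gap rule; not the dataquieR implementation.
sigma_gap <- function(x, k = 1) {
  x    <- x[!is.na(x)]
  ord  <- sort(x)
  gaps <- diff(ord)                  # distances between neighboring values
  big  <- which(gaps > k * sd(x))    # gaps larger than k standard deviations
  if (length(big) == 0) return(numeric(0))
  med <- median(x)
  out <- numeric(0)
  for (i in big) {
    if (ord[i + 1] > med) {
      # large gap above the median: everything above the gap is suspicious
      out <- c(out, ord[(i + 1):length(ord)])
    } else {
      # large gap below the median: everything below the gap is suspicious
      out <- c(out, ord[1:i])
    }
  }
  unique(out)
}

x <- c(4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 9.5)
sigma_gap(x)
```

Because the standard deviation is itself inflated by extreme values, such a rule becomes less sensitive when many outliers are present.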
acc_robust_univariate_outlier( resp_vars = NULL, study_data, label_col, item_level = "item_level", exclude_roles, n_rules = length(unique(criteria)), max_non_outliers_plot = 10000, criteria = c("tukey", "3sd", "hubert", "sigmagap"), meta_data = item_level, meta_data_v2 )
resp_vars |
variable list the name of the continuous measurement variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
exclude_roles |
variable roles a character (vector) of variable roles not included |
n_rules |
integer from=1 to=4. the no. of rules that must be violated to flag a variable as containing outliers. The default is 4, i.e. all. |
max_non_outliers_plot |
integer from=0. Maximum number of non-outlier points to be plotted. If more points exist, only a subsample will be plotted. Note that sampling is not deterministic. |
criteria |
set tukey | 3SD | hubert | sigmagap. a vector with methods to be used for detecting outliers. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Hint: The function is designed for unimodal data only.
a list with:
SummaryTable: data.frame with the columns
Variables, Mean, SD, Median, Skewness, Tukey (N),
3SD (N), Hubert (N), Sigma-gap (N), NUM_acc_ud_outlu,
Outliers, low (N), Outliers, high (N), Grading
SummaryData: data.frame with the columns
Variables, Mean, SD, Median, Skewness, Tukey (N),
3SD (N), Hubert (N), Sigma-gap (N), Outliers (N),
Outliers, low (N), Outliers, high (N)
SummaryPlotList: ggplot2::ggplot univariate outlier plots
Select all variables of type float in the study data
Remove missing codes from the study data (if defined in the metadata)
Remove measurements deviating from limits defined in the metadata
Identify outliers according to the approaches of Tukey (Tukey 1977), 3SD (Saleem et al. 2021), Hubert (Hubert and Vandervieren 2008), and SigmaGap (heuristic)
An output data frame is generated which indicates the no. of possible outliers, the direction of deviations (Outliers, low; Outliers, high) for all methods and a summary score which sums up the deviations of the different rules
A scatter plot is generated for all examined variables, flagging observations according to the no. of violated rules (step 5).
This implementation contrasts the empirical distribution of a measurement variable against assumed distributions. The approach is adapted from the idea of rootograms (Tukey 1977), which is also applicable to count data (Kleiber and Zeileis 2016).
acc_shape_or_scale( resp_vars, study_data, label_col, item_level = "item_level", dist_col, guess, par1, par2, end_digits, flip_mode = "noflip", meta_data = item_level, meta_data_v2 )
resp_vars |
variable the name of the continuous measurement variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
dist_col |
variable attribute the name of the variable attribute in meta_data that provides the expected distribution of a study variable |
guess |
logical estimate parameters |
par1 |
numeric first parameter of the distribution if applicable |
par2 |
numeric second parameter of the distribution if applicable |
end_digits |
logical internal use. check for end digits preferences |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
a list with:
ResultData: data.frame underlying the plot
SummaryPlot: ggplot2::ggplot2 probability distribution plot
SummaryTable: data.frame with the columns Variables and FLG_acc_ud_shape
This implementation is restricted to data of type float or integer.
Missing codes are removed from resp_vars (if defined in the metadata)
The user must specify the column of the metadata containing the probability distribution (currently only: normal, uniform, gamma)
Parameters of each distribution can be estimated from the data or are specified by the user
A histogram-like plot contrasts the empirical vs. the technical distribution
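The comparison of empirical and assumed distribution can be illustrated numerically for the normal case. The following base R sketch uses simulated data, estimates both parameters from the data and compares observed with expected bin counts on the square-root scale, as in a rootogram; the column names are made up and the plot itself is omitted.

```r
set.seed(3)
x <- rnorm(500, mean = 100, sd = 15)   # illustrative measurements

# Parameters estimated from the data instead of being user-specified
mu    <- mean(x)
sigma <- sd(x)

# Observed counts per histogram bin
h        <- hist(x, breaks = 20, plot = FALSE)
observed <- h$counts

# Expected counts per bin under the assumed normal distribution
p_bin    <- diff(pnorm(h$breaks, mean = mu, sd = sigma))
expected <- length(x) * p_bin

# Rootogram-style comparison on the square-root scale
comparison <- data.frame(
  mid       = h$mids,
  observed  = observed,
  expected  = round(expected, 1),
  sqrt_diff = round(sqrt(expected) - sqrt(observed), 2)
)
head(comparison)
```

Large absolute values of `sqrt_diff` in neighboring bins would hint at a shape that the assumed distribution does not capture.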
A classical but still popular approach to detect univariate outliers is the
boxplot method introduced by Tukey 1977. The boxplot is a simple graphical
tool to display information about continuous univariate data (e.g., median,
lower and upper quartile). Outliers are defined as values deviating more
than 1.5 × IQR from the 1st (Q25) or 3rd (Q75) quartile. The
strength of Tukey's method is that it makes no distributional assumptions
and thus is also applicable to skewed or non mound-shaped data
(Marsh and Seo, 2006). Nevertheless, this method tends to identify frequent
measurements which are falsely interpreted as true outliers.
A somewhat more conservative approach in terms of symmetric and/or normal
distributions is the 3SD approach, i.e. any measurement not in
the interval of mean ± 3 × SD is considered an outlier.
Both methods mentioned above are not ideally suited to skewed distributions.
As many biomarkers such as laboratory measurements follow skewed
distributions, the methods above may be insufficient. The approach of Hubert
and Vandervieren 2008 adjusts the boxplot for the skewness of the
distribution. This approach is implemented in several R packages such as
robustbase::mc which is used in this implementation of dataquieR.
Another completely heuristic approach is also included to identify outliers. The approach is based on the assumption that the distances between measurements of the same underlying distribution should be homogeneous. For comprehension of this approach:
consider an ordered sequence of all measurements.
between these measurements all distances are calculated.
the occurrence of larger distances between two neighboring measurements
may then indicate a distortion of the data. For the heuristic definition of a
large distance, 1 × σ (one standard deviation) has been chosen.
Note that the plots are not deterministic, because they use ggplot2::geom_jitter.
acc_univariate_outlier( resp_vars = NULL, study_data, label_col, item_level = "item_level", exclude_roles, n_rules = length(unique(criteria)), max_non_outliers_plot = 10000, criteria = c("tukey", "3sd", "hubert", "sigmagap"), meta_data = item_level, meta_data_v2 )
resp_vars |
variable list the name of the continuous measurement variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
exclude_roles |
variable roles a character (vector) of variable roles not included |
n_rules |
integer from=1 to=4. the no. of rules that must be violated to flag a variable as containing outliers. The default is 4, i.e. all. |
max_non_outliers_plot |
integer from=0. Maximum number of non-outlier points to be plotted. If more points exist, only a subsample will be plotted. Note that sampling is not deterministic. |
criteria |
set tukey | 3SD | hubert | sigmagap. a vector with methods to be used for detecting outliers. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Hint: The function is designed for unimodal data only.
a list with:
SummaryTable: data.frame with the columns
Variables, Mean, SD, Median, Skewness, Tukey (N),
3SD (N), Hubert (N), Sigma-gap (N), NUM_acc_ud_outlu,
Outliers, low (N), Outliers, high (N), Grading
SummaryData: data.frame with the columns
Variables, Mean, SD, Median, Skewness, Tukey (N),
3SD (N), Hubert (N), Sigma-gap (N), Outliers (N),
Outliers, low (N), Outliers, high (N)
SummaryPlotList: ggplot2::ggplot univariate outlier plots
Select all variables of type float in the study data
Remove missing codes from the study data (if defined in the metadata)
Remove measurements deviating from limits defined in the metadata
Identify outliers according to the approaches of Tukey (Tukey 1977), 3SD (Saleem et al. 2021), Hubert (Hubert and Vandervieren 2008), and SigmaGap (heuristic)
An output data frame is generated which indicates the no. of possible outliers, the direction of deviations (Outliers, low; Outliers, high) for all methods and a summary score which sums up the deviations of the different rules
A scatter plot is generated for all examined variables, flagging observations according to the no. of violated rules (step 5).
This function is still under construction. It is designed to run for any statistical data type as follows:
Variables with only two distinct values will be modeled by mixed effects logistic regression.
Nominal variables will be transformed to binary variables. This can be
user-specified using the metadata columns RECODE_CASES and/or
RECODE_CONTROL. Otherwise, the most frequent category will be assigned
to cases and the remaining categories to control. As for other binary
variables, the ICC will be computed using a mixed effects logistic
regression.
Ordinal variables will be analyzed by linear mixed effects models, if
every level of the variable has at least as many observations as
specified in the argument cut_off_linear_model_for_ord. Otherwise, the
data will be modeled by a mixed effects ordered regression, if the
package ordinal is available.
Metric variables with integer values are analyzed by linear mixed effects models.
For variables with data type float, the existing implementation
acc_varcomp is called, which also uses linear mixed effects models.
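To make the idea of an ICC concrete without extra packages, the sketch below uses the classical one-way ANOVA estimator for a balanced design. This is a deliberately simpler technique than the mixed effects models named above and is not what the function computes; the data, the number of examiners and the effect sizes are invented.

```r
set.seed(11)
k <- 8     # number of examiners (groups)
m <- 25    # observations per examiner (balanced design)
examiner        <- factor(rep(seq_len(k), each = m))
examiner_effect <- rnorm(k, sd = 2)   # between-examiner variation
y <- 120 + examiner_effect[examiner] + rnorm(k * m, sd = 10)

# Mean squares between and within examiners from a one-way ANOVA
tab <- anova(lm(y ~ examiner))
msb <- tab["examiner", "Mean Sq"]
msw <- tab["Residuals", "Mean Sq"]

# ANOVA estimator of the intraclass correlation for balanced groups
icc <- (msb - msw) / (msb + (m - 1) * msw)
icc
```

The estimate can be slightly negative when the between-group mean square is smaller than the within-group one; mixed effects models bound the variance component at zero instead.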
acc_varcomp( resp_vars = NULL, group_vars = NULL, co_vars = NULL, study_data, label_col, item_level = "item_level", min_obs_in_subgroup = 10, min_subgroups = 5, cut_off_linear_model_for_ord = 10, threshold_value = lifecycle::deprecated(), meta_data = item_level, meta_data_v2 )
resp_vars |
variable the name of the measurement variable |
group_vars |
variable the name of the examiner, device or reader variable |
co_vars |
variable list a vector of covariables, e.g. age and sex, for adjustment |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
min_obs_in_subgroup |
integer from=0. This optional argument specifies
the minimum number of observations that is
required to include a subgroup (level) of the
|
min_subgroups |
integer from=0. This optional argument specifies
the minimum number of subgroups (level) of the
|
cut_off_linear_model_for_ord |
integer from=0. This optional argument
specifies the minimum number of observations for
individual levels of an ordinal outcome
( |
threshold_value |
Deprecated. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Not yet described
The function returns two data frames, 'SummaryTable' and 'SummaryData', that differ only in the names of the columns.
as.character implementation for the class dataquieR_translated
dataquieR's translated texts featuring access to the language keys, still.
## S3 method for class 'dataquieR_translated' as.character(x, ...)
x |
|
... |
passed to base::as.character |
character with only the translated entries
base::as.character
as.character implementation for the class interval
Such objects, for now, only occur in REDCap rules, so this function
is mostly meant for internal use, for now.
## S3 method for class 'interval' as.character(x, ...)
x |
|
... |
not used yet |
interval as character
base::as.character
dataquieR report to a data.frame
Deprecated
## S3 method for class 'dataquieR_resultset' as.data.frame(x, ...)
x |
Deprecated |
... |
Deprecated |
Deprecated
dataquieR report to a list
Deprecated
## S3 method for class 'dataquieR_resultset' as.list(x, ...)
x |
Deprecated |
... |
Deprecated |
Deprecated
prep_set_backend()
Inefficient way to convert a report to a list. Try prep_set_backend() instead.
## S3 method for class 'dataquieR_resultset2' as.list(x, ...)
x |
|
... |
not used |
The allowable direction of an association. The input is a string that can be either "positive" or "negative".
ASSOCIATION_DIRECTION
Other meta_data_cross:
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
The allowable form of association. The string specifies the form based on a selected list.
ASSOCIATION_FORM
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
The metric underlying the association in ASSOCIATION_RANGE. The input is a string that specifies the analysis algorithm to be used.
ASSOCIATION_METRIC
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Specifies the allowable range of an association. The inclusion of the endpoints follows standard mathematical notation using round brackets for open intervals and square brackets for closed intervals. Values must be separated by a semicolon.
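How such a range string could be interpreted is sketched below in base R. The functions `parse_range()` and `in_range()` and the example value "[0.3;0.9)" are hypothetical and only illustrate the bracket and semicolon convention; they are not part of dataquieR.

```r
# Hypothetical parser for a range such as "[0.3;0.9)"; not part of dataquieR.
parse_range <- function(s) {
  s     <- gsub("[[:space:]]", "", s)
  open  <- substr(s, 1, 1)
  close <- substr(s, nchar(s), nchar(s))
  parts <- strsplit(substr(s, 2, nchar(s) - 1), ";", fixed = TRUE)[[1]]
  if (!open %in% c("[", "(") || !close %in% c("]", ")") || length(parts) != 2) {
    stop("not a valid range: ", s)
  }
  list(
    lower        = as.numeric(parts[1]),
    upper        = as.numeric(parts[2]),
    lower_closed = open == "[",    # square bracket: endpoint included
    upper_closed = close == "]"    # round bracket: endpoint excluded
  )
}

# Check which values fall into a parsed range
in_range <- function(x, r) {
  above <- if (r$lower_closed) x >= r$lower else x > r$lower
  below <- if (r$upper_closed) x <= r$upper else x < r$upper
  above & below
}

r <- parse_range("[0.3;0.9)")
in_range(c(0.3, 0.5, 0.9), r)
```

With a closed lower and an open upper endpoint, the lower bound itself counts as inside the range while the upper bound does not.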
ASSOCIATION_RANGE
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Specifies the unique IDs for cross-item level metadata records
CHECK_ID
if missing, dataquieR will create such IDs
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Specifies the unique labels for cross-item level metadata records
CHECK_LABEL
if missing, dataquieR will create such labels
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Two versions exist: the newer one is used by con_contradictions_redcap, the older one is used by con_contradictions.
Default Name of the Table featuring Code Lists
Metadata sheet name containing VALUE_LABEL_TABLES. This metadata sheet can contain both the value labels of several VALUE_LABEL_TABLE and Missing and JUMP tables.
CODE_LIST_TABLE CODE_LIST_TABLE
Only existence is checked, order not yet used
CODE_ORDER
Item-Missingness (also referred to as item nonresponse (De Leeuw et al. 2003)) describes the missingness of single values, e.g. blanks or empty data cells in a data set. Item-Missingness occurs, for example, if a respondent does not provide information for a certain question, a question is overlooked by accident, a programming failure occurs, or a provided answer was missed while entering the data.
com_item_missingness( resp_vars = NULL, study_data, label_col, item_level = "item_level", show_causes = TRUE, cause_label_df, include_sysmiss = TRUE, threshold_value, suppressWarnings = FALSE, assume_consistent_codes = TRUE, expand_codes = assume_consistent_codes, drop_levels = FALSE, expected_observations = c("HIERARCHY", "ALL", "SEGMENT"), pretty_print = lifecycle::deprecated(), meta_data = item_level, meta_data_v2 )
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
show_causes |
logical if TRUE, then the distribution of missing codes is shown |
cause_label_df |
data.frame missing code table. If missing codes have labels the respective data frame can be specified here or in the metadata as assignments, see cause_label_df |
include_sysmiss |
logical Optional, if TRUE system missingness (NAs) is evaluated in the summary plot |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100 |
suppressWarnings |
logical warn about consistency issues with missing and jump lists |
assume_consistent_codes |
logical if TRUE and no labels are given and the same missing/jump code is used for more than one variable, the labels assigned for this code are treated as being the same for all variables. |
expand_codes |
logical if TRUE, code labels are copied from other variables, if the code is the same and the label is set somewhere |
drop_levels |
logical if TRUE, do not display unused missing codes in the figure legend. |
expected_observations |
enum HIERARCHY | ALL | SEGMENT. If ALL, all
observations are expected to comprise
all study segments. If SEGMENT, the
|
pretty_print |
logical deprecated. If you want to have a human
readable output, use |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
a list with:
SummaryTable: data frame about item missingness per response variable
SummaryData: data frame about item missingness per response variable
formatted for user
SummaryPlot: ggplot2 heatmap plot, if show_causes was TRUE
ReportSummaryTable: data frame underlying SummaryPlot
Lists of missing codes and, if applicable, jump codes are selected from the metadata
The no. of system missings (NA) in each variable is calculated
The no. of used missing codes is calculated for each variable
The no. of used jump codes is calculated for each variable
Two result dataframes (1: on the level of observations, 2: a summary for each variable) are generated
OPTIONAL: if show_causes is selected, one summary plot for all
resp_vars is provided
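The counting steps above can be sketched in base R. This is only an illustration of the idea, not the code used by com_item_missingness; the data frame `sd`, the vectors `missing_codes` and `jump_codes`, and the output column names are hypothetical.

```r
# Hypothetical study data; large values stand for missing/jump codes
sd <- data.frame(
  SBP = c(120, NA, 99980, 135, 99990),
  DBP = c(80, 99981, NA, NA, 85)
)
missing_codes <- c(99980, 99981)  # assumed missing codes
jump_codes <- c(99990)            # assumed jump codes

# Per variable: system missingness (NA), used missing codes, used jump codes
summary_tab <- data.frame(
  Variables = names(sd),
  SYSMISS_N = vapply(sd, function(x) sum(is.na(x)), integer(1)),
  MISSING_CODES_N = vapply(sd, function(x) sum(x %in% missing_codes), integer(1)),
  JUMP_CODES_N = vapply(sd, function(x) sum(x %in% jump_codes), integer(1)),
  row.names = NULL
)

# Percentage of observations that are NA or carry a missing code
summary_tab$MISSING_PCT <- 100 *
  (summary_tab$SYSMISS_N + summary_tab$MISSING_CODES_N) / nrow(sd)
print(summary_tab)
```

In this sketch, jump codes are counted separately and not added to the missingness percentage, because a skipped question was never expected to be answered. Whether that matches a given study design is an assumption to verify.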
com_qualified_item_missingness(
  resp_vars,
  study_data,
  label_col = NULL,
  item_level = "item_level",
  expected_observations = c("HIERARCHY", "ALL", "SEGMENT"),
  meta_data = item_level,
  meta_data_v2,
  meta_data_segment,
  segment_level
)
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
expected_observations |
enum HIERARCHY | ALL | SEGMENT. Report the
number of observations expected using
the old |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data_segment |
data.frame – optional: Segment level metadata |
segment_level |
data.frame alias for |
A list with:
SummaryTable: data.frame containing data quality checks for
"Non-response rate" (PCT_com_qum_nonresp) and
"Refusal rate" (PCT_com_qum_refusal) for each response
variable in resp_vars.
SummaryData: a data.frame containing data quality checks for
“Non-response rate” and "Refusal rate"
for a report
com_qualified_segment_missingness(
  label_col = NULL,
  study_data,
  item_level = "item_level",
  expected_observations = c("HIERARCHY", "ALL", "SEGMENT"),
  meta_data = item_level,
  meta_data_v2,
  meta_data_segment,
  segment_level
)
label_col |
variable attribute the name of the column in the metadata with labels of variables |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
expected_observations |
enum HIERARCHY | ALL | SEGMENT. Report the
number of observations expected using
the old |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data_segment |
data.frame Segment level metadata |
segment_level |
data.frame alias for |
A list with:
SegmentTable: data.frame containing data quality checks for
"Non-response rate" (PCT_com_qum_nonresp) and
"Refusal rate" (PCT_com_qum_refusal) for each segment.
SegmentData: a data.frame containing data quality checks for
"Unexpected location" and "Unexpected proportion" per
segment for a report
(1) Participation in study segments is not recorded by respective variables, e.g. a participant's refusal to attend a specific examination is not recorded.
(2) Participation in study segments is recorded by respective variables.
Use case (1) will be common in smaller studies. For the calculation of segment missingness it is assumed that study variables are nested in respective segments. This structure must be specified in the static metadata. The R-function identifies all variables within each segment and returns TRUE if all variables within a segment are missing, otherwise FALSE.
Use case (2) assumes a more complex structure of study data and metadata.
The study data comprise so-called intro-variables (either TRUE/FALSE or codes
for non-participation). The column PART_VAR in the metadata is
filled by variable-IDs indicating for each variable the respective
intro-variable. This structure has the benefit that subsequent calculation of
item missingness obtains correct denominators for the calculation of
missingness rates.
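The rule for use case (1), a segment counts as missing when all of its variables are missing, can be sketched in base R. This is an illustration under assumptions, not the package's implementation; the data frame `sd` and the `segments` mapping are hypothetical stand-ins for study data and static metadata.

```r
# Hypothetical study data covering two study segments
sd <- data.frame(
  LAB_GLUC = c(5.1, NA, 4.8),
  LAB_CHOL = c(4.9, NA, NA),
  ECG_HR   = c(NA, NA, 72)
)

# Assumed mapping of variables to segments (would come from the metadata)
segments <- c(LAB_GLUC = "LAB", LAB_CHOL = "LAB", ECG_HR = "ECG")

# TRUE if all variables of a segment are NA for an observation, else FALSE
seg_missing <- sapply(
  split(names(segments), segments),
  function(vars) rowSums(!is.na(sd[, vars, drop = FALSE])) == 0
)
print(seg_missing)

# Percentage of observations with a missing segment
print(100 * colMeans(seg_missing))
```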
com_segment_missingness(
  study_data,
  item_level = "item_level",
  strata_vars = NULL,
  group_vars = NULL,
  label_col,
  threshold_value,
  direction,
  color_gradient_direction,
  expected_observations = c("HIERARCHY", "ALL", "SEGMENT"),
  exclude_roles = c(VARIABLE_ROLES$PROCESS),
  meta_data = item_level,
  meta_data_v2,
  segment_level,
  meta_data_segment
)
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
strata_vars |
variable the name of a variable used for stratification, defaults to NULL for not grouping output |
group_vars |
variable the name of a variable used for grouping, defaults to NULL for not grouping output |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100 |
direction |
enum low | high. "high" or "low", i.e. are deviations above/below the threshold critical. This argument is deprecated and replaced by color_gradient_direction. |
color_gradient_direction |
enum above | below. "above" or "below", i.e. are deviations above or below the threshold critical? (default: above) |
expected_observations |
enum HIERARCHY | ALL | SEGMENT. If ALL, all
observations are expected to comprise
all study segments. If SEGMENT, the
|
exclude_roles |
variable roles a character (vector) of variable roles not included |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
meta_data_segment |
data.frame Segment level metadata. Optional. |
This implementation uses one threshold to discriminate critical from non-critical values. If direction is above, then all values below the threshold_value are normal (displayed in dark blue in the plot and flagged with GRADING = 0 in the data frame). All values above the threshold_value are considered critical. The more they deviate from the threshold, the more the displayed color shifts toward dark red. All critical values are highlighted with GRADING = 1 in the summary data frame. By default, the highest values are always shown in dark red irrespective of the absolute deviation.
If direction is below, then all values above the threshold_value are normal (displayed in dark blue, GRADING = 0).
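The grading rule can be written down as a small base R sketch. It only mirrors the description above and is not taken from the package; the helper `grade()` and the example percentages are hypothetical.

```r
# Hypothetical percentages of missing segments
pct_missing <- c(LAB = 3.2, ECG = 12.5, MRI = 27.0)
threshold_value <- 10

# GRADING = 1 marks critical values, GRADING = 0 marks normal values
grade <- function(x, threshold, direction = c("above", "below")) {
  direction <- match.arg(direction)
  if (direction == "above") {
    as.integer(x > threshold)   # values above the threshold are critical
  } else {
    as.integer(x < threshold)   # values below the threshold are critical
  }
}

GRADING_above <- grade(pct_missing, threshold_value, "above")
GRADING_below <- grade(pct_missing, threshold_value, "below")
print(data.frame(pct_missing, GRADING_above, GRADING_below))
```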
This function does not support a resp_vars argument but exclude_roles to
specify variables not relevant for detecting a missing segment.
List function.
a list with:
ResultData: data frame about segment missingness
SummaryPlot: ggplot2 heatmap plot: a heatmap-like graphic that
highlights critical values depending on the respective
threshold_value and direction.
ReportSummaryTable: data frame underlying SummaryPlot
This implementation examines a crude version of unit missingness or unit-nonresponse (Kalton and Kasprzyk 1986), i.e. if all measurement variables in the study data are missing for an observation it has unit missingness.
The function can be applied on stratified data. In this case strata_vars must be specified.
com_unit_missingness(
  id_vars = NULL,
  strata_vars = NULL,
  label_col,
  study_data,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2
)
id_vars |
variable list optional, a (vectorized) call of ID-variables that should not be considered in the calculation of unit missingness |
strata_vars |
variable optional, a string or integer variable used for stratification |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
This implementation calculates a crude rate of unit-missingness. This type of missingness may have several causes and is an important research outcome. For example, unit-nonresponse may be selective regarding the targeted study population, or technical reasons such as record-linkage may cause unit-missingness.
It has to be distinguished from segment and item missingness, since different causes and mechanisms may be the reason for unit-missingness.
This function does not support a resp_vars argument but id_vars, which
follow a roughly inverse logic: id_vars with values do not prevent a row
from being considered missing, because an ID is the only hint for a unit that
otherwise would not occur in the data at all.
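The role of id_vars can be illustrated with a base R sketch: a row counts as unit-missing when every variable except the ID variables is missing. This is an assumption-laden illustration, not the implementation of com_unit_missingness; the data frame and variable names are hypothetical.

```r
# Hypothetical study data: the ID is always present
sd <- data.frame(
  ID  = c("a1", "a2", "a3"),
  SBP = c(120, NA, NA),
  DBP = c(80, NA, 85),
  stringsAsFactors = FALSE
)
id_vars <- "ID"

# ID variables are excluded before checking whether a row is empty
measure_vars <- setdiff(names(sd), id_vars)
sd$Unit_missing <- rowSums(!is.na(sd[, measure_vars, drop = FALSE])) == 0

# Crude rate of unit missingness in percent
unit_missing_pct <- 100 * mean(sd$Unit_missing)
print(sd)
print(unit_missing_pct)
```

If the ID column were included in the check, no row in this example would be flagged, which is why ID variables have to be named explicitly.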
List function.
A list with:
FlaggedStudyData: data.frame with id-only-rows flagged in a column
Unit_missing
SummaryData: data.frame with numbers and percentages of unit
missingness
Cross-item level metadata attribute name
COMPUTATION_RULE
SSI related Cross-item level metadata attribute names
Computed Variable roles can be one of the following:
MAXIMUM_LONG_STRING Social Science: Computed Indicator Variable,
maximum long string
IRV Social Science: Computed Indicator Variable, IRV
TOTRESPT Social Science: Computed Indicator Variable, TOTRESPT
RESPT_PER_ITEM Social Science: Computed Indicator Variable, RESPT_PER_ITEM
RELCOMPL_SPEED Social Science: Computed Indicator Variable, RELCOMPL_SPEED
MISS_RESP Social Science: Computed Indicator Variable, MISS_RESP
NA Social Science: Computed Indicator Variable – N/A
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
TOTRESPT
This approach considers a contradiction if impossible combinations of data are observed in one participant. For example, if age of a participant is recorded repeatedly the value of age is (unfortunately) not able to decline. Most cases of contradictions rest on comparison of two variables.
Important to note, each value that is used for comparison may represent a possible characteristic but the combination of these two values is considered to be impossible. The approach does not consider implausible or inadmissible values.
con_contradictions(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  threshold_value,
  check_table,
  summarize_categories = FALSE,
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100 |
check_table |
data.frame contradiction rules table. Table defining contradictions. See details for its required structure. |
summarize_categories |
logical Needs a column 'tag' in the
|
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Select all variables in the data with defined contradiction rules (static metadata column CONTRADICTIONS)
Remove missing codes from the study data (if defined in the metadata)
Remove measurements deviating from limits defined in the metadata
Assign label to levels of categorical variables (if applicable)
Apply contradiction checks on predefined sets of variables
Identification of measurements fulfilling contradiction rules. Therefore two output data frames are generated:
on the level of observation to flag each contradictory value combination, and
a summary table for each contradiction check.
A summary plot illustrating the number of contradictions is generated.
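The two outputs described above, flags per observation and a summary per check, can be sketched in base R using the age example. The rule is written as a plain R function here; this is not the rule syntax expected by con_contradictions, and all names are hypothetical.

```r
# Hypothetical study data: age at baseline and at follow-up
sd <- data.frame(
  AGE_0 = c(45, 60, 33),
  AGE_1 = c(47, 58, 33)
)

# Hypothetical rule: age at follow-up must not be lower than at baseline
rules <- list(
  age_declines = function(d) d$AGE_1 < d$AGE_0
)

# Output 1: one flag column per check, on the level of observations
flags <- as.data.frame(lapply(rules, function(rule) rule(sd)))
flagged_study_data <- cbind(sd, flags)

# Output 2: number and percentage of contradictions per check
summary_tab <- data.frame(
  check = names(rules),
  n = colSums(flags, na.rm = TRUE),
  pct = 100 * colMeans(flags, na.rm = TRUE),
  row.names = NULL
)
print(flagged_study_data)
print(summary_tab)
```

Each age value on its own is plausible; only the combination in the second row is impossible, which is the distinction the description draws between contradictions and inadmissible values.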
List function.
If summarize_categories is FALSE:
A list with:
FlaggedStudyData: The first output of the contradiction function is a
data frame of similar dimension regarding the number
of observations in the study data. In addition, for
each applied check on the variables an additional
column is added which flags observations with a
contradiction given the applied check.
SummaryTable: The second output summarizes this information into one
data frame. This output can be used to provide an
executive overview on the amount of contradictions. This
output is meant for automatic digestion within pipelines.
SummaryData: The third output is the same as SummaryTable but for
human readers.
SummaryPlot: The fourth output visualizes summarized information
of SummaryData.
If summarize_categories is TRUE, other objects are returned:
one per category named by that category (e.g. "Empirical") containing a
result for contradictions within that category only. Additionally, the
slot all_checks contains a result as it would have been returned with
summarize_categories set to FALSE. Finally, a slot SummaryData is
returned containing sums per Category and a corresponding ggplot2::ggplot in
SummaryPlot.
This approach considers a contradiction if impossible combinations of data are observed in one participant. For example, if age of a participant is recorded repeatedly the value of age is (unfortunately) not able to decline. Most cases of contradictions rest on comparison of two variables.
Important to note, each value that is used for comparison may represent a possible characteristic but the combination of these two values is considered to be impossible. The approach does not consider implausible or inadmissible values.
con_contradictions_redcap(
  study_data,
  item_level = "item_level",
  label_col,
  threshold_value,
  meta_data_cross_item = "cross-item_level",
  use_value_labels,
  summarize_categories = FALSE,
  meta_data = item_level,
  cross_item_level,
  `cross-item_level`,
  meta_data_v2
)
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100 |
meta_data_cross_item |
data.frame contradiction rules table. Table defining contradictions. See online documentation for its required structure. |
use_value_labels |
logical Deprecated in favor of DATA_PREPARATION.
If set to |
summarize_categories |
logical Needs a column |
meta_data |
data.frame old name for |
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Remove missing codes from the study data (if defined in the metadata)
Remove measurements deviating from limits defined in the metadata
Assign label to levels of categorical variables (if applicable)
Apply contradiction checks (given as REDCap-like rules in a separate
metadata table)
Identification of measurements fulfilling contradiction rules. Therefore two output data frames are generated:
on the level of observation to flag each contradictory value combination, and
a summary table for each contradiction check.
A summary plot illustrating the number of contradictions is generated.
List function.
If summarize_categories is FALSE:
A list with:
FlaggedStudyData: The first output of the contradiction function is a
data frame of similar dimension regarding the number
of observations in the study data. In addition, for
each applied check on the variables an additional
column is added which flags observations with a
contradiction given the applied check.
VariableGroupData: The second output summarizes this information
into one
data frame. This output can be used to provide an
executive overview on the amount of contradictions.
VariableGroupTable: A subset of VariableGroupData used within the
pipeline.
SummaryPlot: The third output visualizes summarized information
of SummaryData.
If summarize_categories is TRUE, other objects are returned:
A list with one element Other, a list with the following entries:
One per category named by that category (e.g. "Empirical") containing a
result for contradiction checks within that category only. Additionally, in the
slot all_checks, a result as it would have been returned with
summarize_categories set to FALSE. Finally, in
the top-level list, a slot SummaryData is
returned containing sums per Category and a corresponding ggplot2::ggplot in
SummaryPlot.
Online Documentation for the function
meta_data_cross
Online Documentation for the required cross-item-level metadata
For each categorical variable, value lists should be defined in the metadata. This implementation examines whether all observed levels in the study data are valid.
con_inadmissible_categorical(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  threshold_value = 0,
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Remove missing codes from the study data (if defined in the metadata)
Interpretation of variable specific VALUE_LABELS as supplied in the metadata.
Identification of measurements not corresponding to the expected categories. Therefore two output data frames are generated:
on the level of observation to flag each undefined category, and
a summary table for each variable.
Values not corresponding to defined categories are removed in a data frame of modified study data
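The comparison of observed against defined categories can be sketched in base R. This is an illustration only, not the package's code; the variable `sex` and its value list are hypothetical, and missing codes are assumed to have been replaced by NA already.

```r
# Hypothetical categorical variable, missing codes already set to NA
sex <- c(1, 2, 2, 3, NA, 1)
defined <- c(1, 2)   # assumed value list from the metadata

# Flag observed values that are not part of the defined categories
inadmissible <- !is.na(sex) & !(sex %in% defined)

non_matching <- sort(unique(sex[inadmissible]))   # undefined categories
non_matching_n <- sum(inadmissible)               # affected observations
non_matching_n_per_category <- table(sex[inadmissible])

# Modified study data: inadmissible values are removed (set to NA)
modified <- replace(sex, inadmissible, NA)
print(non_matching)
print(modified)
```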
a list with:
SummaryData: data frame summarizing inadmissible categories with the
columns:
Variables: variable name/label
OBSERVED_CATEGORIES: the categories observed in the study data
DEFINED_CATEGORIES: the categories defined in the metadata
NON_MATCHING: the categories observed but not defined
NON_MATCHING_N: the number of observations with categories not defined
NON_MATCHING_N_PER_CATEGORY: the number of observations for each of the
unexpected categories
SummaryTable: data frame for the dataquieR pipeline reporting the number
and percentage of inadmissible categorical values
ModifiedStudyData: study data having inadmissible categories removed
FlaggedStudyData: study data having cases with inadmissible categories
flagged
For each categorical variable, value lists should be defined in the metadata. This implementation examines whether all observed levels in the study data are valid.
con_inadmissible_vocabulary(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  threshold_value = 0,
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
threshold_value |
numeric from=0 to=100. a numerical value ranging from 0-100. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Remove missing codes from the study data (if defined in the metadata)
Interpretation of variable specific VALUE_LABELS as supplied in the metadata.
Identification of measurements not corresponding to the expected categories. Therefore two output data frames are generated:
on the level of observation to flag each undefined category, and
a summary table for each variable.
Values not corresponding to defined categories are removed in a data frame of modified study data
a list with:
SummaryData: data frame summarizing inadmissible categories with the
columns:
Variables: variable name/label
OBSERVED_CATEGORIES: the categories observed in the study data
DEFINED_CATEGORIES: the categories defined in the metadata
NON_MATCHING: the categories observed but not defined
NON_MATCHING_N: the number of observations with categories not defined
NON_MATCHING_N_PER_CATEGORY: the number of observations for each of the
unexpected categories
GRADING: indicator TRUE/FALSE if inadmissible categorical values were
observed (more than indicated by the threshold_value)
SummaryTable: data frame for the dataquieR pipeline reporting the number
and percentage of inadmissible categorical values
ModifiedStudyData: study data having inadmissible categories removed
FlaggedStudyData: study data having cases with inadmissible categories
flagged
## Not run:
sdt <- data.frame(DIAG = c("B050", "B051", "B052", "B999"),
                  MED0 = c("S01XA28", "N07XX18", "ABC", NA),
                  stringsAsFactors = FALSE)
mdt <- tibble::tribble(
  ~ VAR_NAMES, ~ DATA_TYPE, ~ STANDARDIZED_VOCABULARY_TABLE, ~ SCALE_LEVEL, ~ LABEL,
  "DIAG", "string", "<ICD10>", "nominal", "Diagnosis",
  "MED0", "string", "<ATC>", "nominal", "Medication"
)
con_inadmissible_vocabulary(NULL, sdt, mdt, label_col = LABEL)
prep_load_workbook_like_file("meta_data_v2")
il <- prep_get_data_frame("item_level")
il$STANDARDIZED_VOCABULARY_TABLE[[11]] <- "<ICD10GM>"
il$DATA_TYPE[[11]] <- DATA_TYPES$INTEGER
il$SCALE_LEVEL[[11]] <- SCALE_LEVELS$NOMINAL
prep_add_data_frames(item_level = il)
r <- dq_report2("study_data", dimensions = "con")
r <- dq_report2("study_data", dimensions = "con",
                advanced_options = list(dataquieR.non_disclosure = TRUE))
r
## End(Not run)
Inadmissible numerical values can be of type integer or float. This implementation requires the definition of intervals in the metadata to examine the admissibility of numerical study data.
This helps identify inadmissible measurements according to hard limits (for multiple variables).
con_limit_deviations(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data_cross_item = "cross-item_level",
  limits = NULL,
  flip_mode = "noflip",
  return_flagged_study_data = FALSE,
  return_limit_categorical = TRUE,
  meta_data = item_level,
  cross_item_level,
  `cross-item_level`,
  meta_data_v2,
  show_obs = TRUE
)
resp_vars |
variable list the name of the measurement variables |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data_cross_item |
|
limits |
enum HARD_LIMITS | SOFT_LIMITS | DETECTION_LIMITS. what limits from metadata to check for |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
return_flagged_study_data |
logical return |
return_limit_categorical |
logical if TRUE return limit deviations also for categorical variables |
meta_data |
data.frame old name for |
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
meta_data_v2 |
character path to workbook like metadata file, see
|
show_obs |
logical Should (selected) individual observations be marked in the figure for continuous variables? |
Remove missing codes from the study data (if defined in the metadata)
Interpretation of variable specific intervals as supplied in the metadata.
Identification of measurements outside defined limits. Therefore two output data frames are generated:
on the level of observation to flag each deviation, and
a summary table for each variable.
A list of plots is generated for each variable examined for limit deviations. The histogram-like plots indicate respective limits as well as deviations.
Values exceeding limits are removed in a data frame of modified study data
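The flagging and removal steps can be sketched in base R for a single variable. This is only an illustration, not the logic of con_limit_deviations; the variable `sbp` and the limit values are hypothetical, and the interval is assumed to be closed on both sides.

```r
# Hypothetical measurements, missing codes already replaced by NA
sbp <- c(118, 250, 95, 40, NA, 130)
hard_limits <- c(lower = 60, upper = 240)   # assumed interval [60; 240]

# Flag each observation that lies below or above the limits
below <- !is.na(sbp) & sbp < hard_limits[["lower"]]
above <- !is.na(sbp) & sbp > hard_limits[["upper"]]
flagged <- data.frame(SBP = sbp, below = below, above = above)

# Summary for the variable: number and percentage of deviations
n_dev <- sum(below | above)
pct_dev <- 100 * n_dev / sum(!is.na(sbp))

# Modified study data: values outside the limits are removed (set to NA)
modified <- replace(sbp, below | above, NA)
print(flagged)
print(modified)
```

The percentage here uses the non-missing observations as denominator; that choice is an assumption of the sketch.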
a list with:
FlaggedStudyData data.frame related to the study data by a 1:1
relationship, i.e. for each observation it is
checked whether the value is below or above
the limits. Optional, see
return_flagged_study_data.
SummaryTable data.frame summarizing limit deviations for each
variable.
SummaryData data.frame summarizing limit deviations for each
variable for a report.
SummaryPlotList list of ggplot2::ggplots. The plots for each variable are
either a histogram (continuous) or a
barplot (discrete).
ReportSummaryTable: heatmap-like data frame about limit violations
description of the contradiction functions
contradiction_functions_descriptions
Note: in some prep_-functions, this field is named RULE
CONTRADICTION_TERM
Specifies a contradiction rule. Use REDCap like syntax, see
online vignette
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Specifies the type of a contradiction. According to the data quality concept, there are logical and empirical contradictions, see online vignette
CONTRADICTION_TYPE
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Specifies the pre-processing steps required for contradiction rules.
Note: MISSING_LABEL, MISSING_INTERPRET may not work for non-factor
variables
DATA_PREPARATION
LABEL LIMITS MISSING_NA MISSING_LABEL MISSING_INTERPRET
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
In the metadata, the following entries are allowed for the variable attribute DATA_TYPE:
DATA_TYPES
integer for integer numbers
string for text/string/character data
float for decimal/floating point numbers
datetime for timepoints
time for time of day
As function arguments, dataquieR uses additional type specifications:
numeric is a numerical value (float or integer), but it is not an
allowed DATA_TYPE in the metadata. However, some functions may accept
float or integer for specific function arguments. This is where we
use the term numeric.
enum allows one element out of a set of allowed options similar to
match.arg
set allows a subset out of a set of allowed options similar to
match.arg with several.ok = TRUE.
variable Function arguments of this type expect a character scalar that
specifies one variable using the variable identifier given in
the metadata attribute VAR_NAMES or, if label_col is set,
given in the metadata attribute given in that argument.
Labels can easily be translated using prep_map_labels
variable list Function arguments of this type expect a character vector
that specifies variables using the variable identifiers
given in the metadata attribute VAR_NAMES or,
if label_col is set, given in the metadata attribute
given in that argument. Labels can easily be translated
using prep_map_labels
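The enum and set argument types can be illustrated with base R's match.arg. This is a minimal sketch: the functions `pick_one` and `pick_some` and their option names are hypothetical and not part of dataquieR.

```r
# Hypothetical helpers illustrating "enum" and "set" style arguments.
pick_one <- function(mode = c("mean", "median", "mode")) {
  # enum: exactly one element out of the allowed options
  match.arg(mode)
}

pick_some <- function(checks = c("limits", "missing", "outliers")) {
  # set: a subset of the allowed options
  match.arg(checks, several.ok = TRUE)
}

pick_one("median")                  # a single allowed option
pick_some(c("limits", "outliers"))  # a subset of the allowed options
```

A value outside the allowed options, such as `pick_one("foo")`, raises an error.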
All available data types, mapped from their respective R types
DATA_TYPES_OF_R_TYPE
Creates an object of the class dataquieR_resultset.
dataquieR_resultset(...)
... |
properties stored in the object |
The class features the following methods:
as.data.frame.dataquieR_resultset, as.list.dataquieR_resultset, print.dataquieR_resultset, summary.dataquieR_resultset
an object of the class dataquieR_resultset.
Deprecated
dataquieR_resultset_verify(...)
... |
Deprecated |
Deprecated
If this option is set to TRUE, time course plots will only show subgroups
with more than one distinct value. This might improve the readability of
the figure.
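The idea of keeping only subgroups with more than one distinct value can be sketched in base R. The data frame `d` and its columns are hypothetical, and this is not the package's internal implementation.

```r
# Hypothetical example data: a measurement and a grouping variable.
d <- data.frame(
  value = c(1.2, 1.2, 1.2, 0.8, 1.5, 1.1),
  group = c("A", "A", "A", "B", "B", "B"),
  stringsAsFactors = FALSE
)

# Number of distinct values per subgroup.
n_distinct <- tapply(d$value, d$group, function(v) length(unique(v)))

# Keep only subgroups with more than one distinct value.
keep <- names(n_distinct)[n_distinct > 1]
d_plot <- d[d$group %in% keep, ]
```

Subgroup A is constant and would be dropped, so only the rows of subgroup B remain in `d_plot`.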
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
The value should be greater than 0 and less than or equal to 1. In general, increasing the bandwidth leads to a smoother trend line.
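The effect of the bandwidth can be sketched with stats::loess. This assumes that the bandwidth plays a role comparable to the span argument of loess, which is not stated here. The helper `is_valid_bw`, the function `roughness`, and the simulated data are hypothetical.

```r
# Hypothetical helper: check that a bandwidth lies in (0, 1].
is_valid_bw <- function(bw) {
  is.numeric(bw) && length(bw) == 1L && !is.na(bw) && bw > 0 && bw <= 1
}

# Simulated data: linear trend plus noise.
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- x + rnorm(100)

# Assumption: the bandwidth is comparable to the span of stats::loess.
fit_narrow <- loess(y ~ x, span = 0.2)  # follows local fluctuations
fit_wide   <- loess(y ~ x, span = 0.9)  # smoother trend line

# Roughness as the sum of squared second differences of the fitted values.
roughness <- function(fit) sum(diff(fitted(fit), differences = 2)^2)
c(narrow = roughness(fit_narrow), wide = roughness(fit_wide))
```

The wider span yields the smaller roughness value, which corresponds to the smoother trend line described above.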
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
acc_loess()
Specifies the minimum number of observations required in each subgroup. Subgroups with fewer observations are excluded. The default is 30.
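The exclusion rule can be sketched in base R. The grouping variable and its subgroup sizes are hypothetical; only the default of 30 is taken from the description above.

```r
# Hypothetical grouping variable with subgroups of different sizes.
group <- rep(c("dev_1", "dev_2", "dev_3"), times = c(45, 12, 30))

min_obs <- 30  # default stated in the option description

# Count observations per subgroup and keep those reaching the minimum.
n_per_group <- table(group)
kept <- names(n_per_group)[n_per_group >= min_obs]
kept
```

Subgroup `dev_2` has only 12 observations and is therefore excluded.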
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
The value should be greater than 0 and less than 0.4. If the proportion of cases or controls is lower than the specified value, the LOESS figure will not be created for the specified binary outcome.
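The proportion check for a binary outcome can be sketched as follows. The outcome vector and the threshold of 0.1 are illustrative assumptions, not defaults documented here.

```r
# Hypothetical binary outcome: 1 = case, 0 = control.
outcome <- c(rep(1, 8), rep(0, 92))

min_proportion <- 0.1  # illustrative threshold; must lie between 0 and 0.4

# Proportion of the less frequent category (cases or controls).
p_minor <- min(mean(outcome == 1), mean(outcome == 0))

# The figure would only be created if the rarer category is frequent enough.
create_plot <- p_minor >= min_proportion
```

With 8 cases among 100 observations, the proportion of cases is below the threshold, so no figure would be created under this sketch.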
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
acc_loess()
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If this option is set to FALSE, the figures created by acc_margins will
not include the number of observations for each level of the grouping
variable. This can be used to obtain clean static plots.
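Options of this kind are set with base R's options(). In the sketch below, the option name dataquieR.acc_margins_num is inferred from the option list in this manual, and the call has an effect on figures only when dataquieR is used.

```r
# Assumption: the option name is inferred from the option list of this manual.
old <- options(dataquieR.acc_margins_num = FALSE)  # set the option, keep the old value

current <- getOption("dataquieR.acc_margins_num")  # query the current setting

options(old)  # restore the previous setting
```

Storing the return value of options() allows the previous setting to be restored after the plots have been created.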
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If this option is set to TRUE, the levels of the grouping variable in the
figure are sorted in descending order according to the number of
observations so that levels with more observations are easier to identify.
Otherwise, the original order of the levels is retained.
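Sorting the levels of a grouping variable by the number of observations can be sketched in base R. The factor `group` is hypothetical and the code is not the package's internal implementation.

```r
# Hypothetical grouping variable.
group <- factor(c("a", "b", "b", "c", "c", "c", "b", "c"))

# Order the levels by descending number of observations.
counts <- sort(table(group), decreasing = TRUE)
group_sorted <- factor(group, levels = names(counts))

levels(group)         # original order of the levels
levels(group_sorted)  # levels sorted by frequency
```

Only the order of the levels changes; the observations themselves stay the same.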
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
boolean, TRUE or FALSE
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
acc_shape_or_scale() and acc_end_digits()
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Amending the metadata could enable the function to run, e.g., for a test for missingness when no missing codes have been declared.
dataquieR.applicability_problem
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
to be deprecated
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
storr back-end, do not convert to base-list
If TRUE and a report uses a storr back-end, convert it to a base list,
i.e., copy it to RAM, even if this would likely not really be needed for
apply calls.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
browser() on errors
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Can be
TRUE: values outside hard limits will be removed from the data before calculating descriptive statistics
FALSE: values outside hard limits will not be removed from the original data
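The effect of the two values can be pictured with a small base-R sketch. It assumes that the option described here is named dataquieR.des_summary_hard_lim_remove (the one name absent from the "Other options" list of this topic). The toy values and the hard limits are invented for illustration, and the filtering step only mimics the documented behaviour; it is not the package's own code.

```r
# Assumption: option name inferred from the surrounding topic, value TRUE or FALSE.
old <- options(dataquieR.des_summary_hard_lim_remove = TRUE)  # keep previous value

x <- c(-5, 12, 18, 25, 140)   # toy measurements (hypothetical)
hard_limits <- c(0, 100)      # hypothetical admissible range

remove_outside <- isTRUE(getOption("dataquieR.des_summary_hard_lim_remove"))

# TRUE: drop values outside the hard limits before summarising.
# FALSE: summarise the data as they are.
kept <- if (remove_outside) {
  x[x >= hard_limits[1] & x <= hard_limits[2]]
} else {
  x
}
summary_mean <- mean(kept)

options(old)  # restore the previous setting
```

Restoring the saved value at the end keeps the change local to one computation, which may be preferable to setting the option for a whole session.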
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
dataquieR function results
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If TRUE, levels that do not occur will not be displayed when printing or
plotting heatmap tables.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Also reports inadmissible data types. Can be turned off for performance reasons if the data source is already type-safe (e.g., a database). Use with care: turning it off may break pipelines (possibly only in the final rendering step) if the data type is set incorrectly for some columns.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
none: no check is performed on whether the variables and records
available in the study data match those described in the metadata.
exact: there must be a 1:1 match between the study data and metadata
regarding data frames, segments, variables, and records.
subset_u: study data are a subset of metadata. All variables from the study
data are expected to be present in the metadata, but one or
more variables in the metadata are not expected to be
present in the study data.
In this case, a variable present in
the study data but not in the metadata would produce an issue.
subset_m: metadata are a subset of study data. All variables in the metadata
are expected to be present in the study data, but one or more
variables in the study data are not expected to be
present in the metadata.
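The four check types can be illustrated with a minimal base-R sketch on variable names only. The variable names and the helper check() are hypothetical and not part of dataquieR; the sketch ignores data frames, segments, and records, and merely restates the set logic described above.

```r
# Toy variable names; not taken from any real study.
study_vars <- c("id", "age", "sex")
meta_vars  <- c("id", "age", "sex", "bmi")

only_in_study <- setdiff(study_vars, meta_vars)  # an issue under "subset_u"
only_in_meta  <- setdiff(meta_vars, study_vars)  # an issue under "subset_m"

# Hypothetical helper: TRUE if the names pass the given check type.
check <- function(type) {
  switch(type,
    none     = TRUE,
    exact    = length(only_in_study) == 0 && length(only_in_meta) == 0,
    subset_u = length(only_in_study) == 0,
    subset_m = length(only_in_meta) == 0,
    stop("unknown check type: ", type)
  )
}

result <- sapply(c("none", "exact", "subset_u", "subset_m"), check)
```

Here the metadata describe one variable (bmi) that is missing from the study data, so only "none" and "subset_u" pass.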
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
to be deprecated
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If a file does not declare column data types, or declares data types per cell, choose as the column's data type the type that matches the majority of the sampled cells in that column.
This may hide data type problems, but it can also fix them, so that
prep_get_data_frame() works better.
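The majority rule can be sketched in base R as follows. Both helper functions are hypothetical and only distinguish numeric from character cells; how dataquieR samples cells and which types it recognises are not specified here, so this is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical helper: classify a single cell that was read as text.
guess_cell_type <- function(cell) {
  if (is.na(cell) || cell == "") return(NA_character_)  # empty cells do not vote
  if (!is.na(suppressWarnings(as.numeric(cell)))) "numeric" else "character"
}

# Hypothetical helper: pick the type that most sampled cells agree on.
majority_type <- function(cells) {
  types <- vapply(cells, guess_cell_type, character(1), USE.NAMES = FALSE)
  counts <- table(types)  # table() ignores NA entries by default
  names(counts)[which.max(counts)]
}

column <- c("1.5", "2", "n/a", "4", "")  # toy sample of cells from one column
guessed <- majority_type(column)
```

In this toy column, three cells parse as numbers and one ("n/a") does not, so the column would be treated as numeric. The stray text cell is exactly the kind of data type problem that the majority rule can hide.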
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
label_col argument is used.
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If this option is set to TRUE, time course plots use generalized additive
models (GAM) instead of LOESS when the number of observations exceeds a
specified threshold. LOESS computations on large datasets consume a lot of
memory.
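As a minimal sketch, the option can be switched on for one analysis and restored afterwards with base R's options() mechanism. Only the option name is taken from this page; where the time course functions are called is left as a placeholder comment.

```r
# Enable the GAM fallback and keep the previous setting for later.
old <- options(dataquieR.GAM_for_LOESS = TRUE)

# Check what the package would see when it queries the option.
gam_enabled <- getOption("dataquieR.GAM_for_LOESS")
print(gam_enabled)

# ... run the time course analyses here ...

# Restore whatever was configured before.
options(old)
```

Capturing the return value of options() avoids leaving the session in a changed state when the analysis is finished.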
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
By default, the DATA_TYPE is derived from the R data type of the study
data. However, when data are imported from plain text files, it can be more
appropriate to examine the actual values and infer the data type based on
their content. This option enables that behavior: set
dataquieR.guess_character to TRUE to infer data types from the observed
values rather than relying solely on the column’s class in the data frame.
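The difference between class-based and value-based typing can be illustrated with base R alone. The sketch below sets the option and then uses utils::type.convert() as a stand-in for content-based inference; it is an analogy under that assumption, not the heuristic dataquieR itself applies, and the data frame is made up for illustration.

```r
# Ask dataquieR to infer data types from the observed values.
options(dataquieR.guess_character = TRUE)

# Data as they typically arrive from a plain text file: all character.
raw <- data.frame(
  age = c("34", "51", NA),
  sex = c("f", "m", "f"),
  stringsAsFactors = FALSE
)

# Typing by column class: every column is reported as character.
classes_before <- vapply(raw, class, character(1))

# Typing by content (base R analogy): numbers are recognized as such.
guessed <- utils::type.convert(raw, as.is = TRUE)
classes_after <- vapply(guessed, class, character(1))

print(rbind(classes_before, classes_after))
```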
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
dataquieR tries to guess missing-codes from the study data in the absence of metadata.
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Remove variables with only empty values (NA, ". ",
"" or similar) from reports. With auto, such variables are removed if
more than 20% of the variables are empty.
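The auto rule described above can be sketched in base R. The helper is_empty_var() and the example data are hypothetical, and the exact definition of an empty value (NA, or "" or "." after trimming whitespace) is an assumption made for this illustration; the package may use a broader definition.

```r
# Select the automatic behaviour named on this page.
options(dataquieR.ignore_empty_vars = "auto")

# Hypothetical helper: TRUE if a variable holds nothing but empty values.
is_empty_var <- function(x) {
  v <- trimws(as.character(x))
  all(is.na(v) | v %in% c("", "."))
}

# Made-up study data with two empty variables out of four.
study_data <- data.frame(
  id = 1:3,
  weight = c(NA, NA, NA),
  note = c("", ". ", NA),
  height = c(170, 165, 180),
  stringsAsFactors = FALSE
)

empty <- vapply(study_data, is_empty_var, logical(1))
share_empty <- mean(empty)

# Under "auto", empty variables are dropped if their share exceeds 20%.
remove_empty <- share_empty > 0.2
print(names(study_data)[empty])
```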
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
Even amending the metadata could not make the function run, e.g., a test for numbers applied to a character variable.
dataquieR.intrinsic_applicability_problem
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If TRUE, plots are not realized until they are needed inside reports, which
saves memory.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If TRUE, realized plots are cached, which may need more memory.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
ggplot2 objects as possible
If TRUE, plot promises are wrapped in an S7 class so that they behave almost
like "real" ggplot2 objects, and you normally do not need to call
prep_realize_ggplot() on them. This comes with a small memory overhead,
so you can disable it.
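Because the lazy plot options trade memory against convenience, it can be useful to change them only for a single call. The following is a minimal sketch using a hypothetical helper, with_dq_options(), built on base R's options() and on.exit(); the option names come from this manual, everything else is an assumption.

```r
# Hypothetical helper: evaluate `code` with temporary option settings.
with_dq_options <- function(opts, code) {
  old <- options(opts)
  on.exit(options(old), add = TRUE)  # restore even if `code` fails
  force(code)                        # evaluated after the options are set
}

# Keep plots lazy, but skip the ggplot2 compatibility wrapper to save memory.
seen <- with_dq_options(
  list(
    dataquieR.lazy_plots = TRUE,
    dataquieR.lazy_plots_gg_compatibility = FALSE
  ),
  getOption("dataquieR.lazy_plots_gg_compatibility")
)
print(seen)
```

With the compatibility wrapper disabled, plot promises may have to be realized explicitly with prep_realize_ggplot(), as noted above.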
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
The language to use for type conversions (en, de, fr, cn, ca, ...).
Currently used only by util_adjust_data_type2().
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
a number, see corresponding argument in acc_mahalanobis()
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If there are more levels of a categorical response variable than can be shown individually, they will be collapsed into "other".
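The collapsing described above can be illustrated in base R. This is only a sketch: `collapse_levels()` is a hypothetical helper that is not part of dataquieR, and keeping the most frequent levels is an assumption about how the levels to show are chosen.

```r
# Hypothetical helper, not part of dataquieR: keep the `max_levels` most
# frequent levels and merge all remaining levels into "other".
collapse_levels <- function(x, max_levels) {
  x <- as.character(x)
  counts <- sort(table(x), decreasing = TRUE)
  if (length(counts) <= max_levels) {
    return(factor(x))  # nothing to collapse
  }
  keep <- names(counts)[seq_len(max_levels)]
  x[!is.na(x) & !(x %in% keep)] <- "other"
  factor(x, levels = c(keep, "other"))
}

resp <- c("a", "a", "a", "b", "b", "c", "d")
collapsed <- collapse_levels(resp, max_levels = 2)
print(table(collapsed))
```

Here the two rare levels `"c"` and `"d"` end up in a single `"other"` category.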
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If there are more examiners or devices than can be shown individually, they will be collapsed into "other".
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If there are more examiners or devices than specified here, the figure will be reduced to box plots to save space.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
All variable labels will be shortened to fit this maximum length. Cannot be larger than 200 for technical reasons.
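Because of the upper limit of 200, it can be convenient to clamp a requested length before setting the option. The following is a sketch with a hypothetical helper `set_max_label_len()`, which is not part of dataquieR; only `options()` and `getOption()` from base R are used.

```r
# Hypothetical helper: clamp the requested length to the documented cap of 200
set_max_label_len <- function(len, cap = 200L) {
  len <- as.integer(len)
  stopifnot(length(len) == 1L, !is.na(len), len > 0L)
  options(dataquieR.MAX_LABEL_LEN = min(len, cap))
  invisible(getOption("dataquieR.MAX_LABEL_LEN"))
}

# Remember the previous setting, then request a value above the cap
old_opts <- options(dataquieR.MAX_LABEL_LEN = NULL)
set_max_label_len(250)
effective_len <- getOption("dataquieR.MAX_LABEL_LEN")
print(effective_len)

# Restore the previous setting
options(old_opts)
```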
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
All long variable labels will be shortened to fit this maximum length. Cannot be larger than 200 for technical reasons.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Value labels are restricted to this maximum length.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
to be deprecated
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Levels of the grouping variable with fewer observations than specified here will be excluded from the figure.
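The effect of such a threshold can be sketched in base R. `drop_small_groups()` is a hypothetical helper for illustration and not the package's implementation; the column names and data are made up.

```r
# Hypothetical helper: drop rows whose group has fewer than `min_obs` rows
drop_small_groups <- function(data, group_var, min_obs) {
  n_per_group <- table(data[[group_var]])
  keep <- names(n_per_group)[as.vector(n_per_group) >= min_obs]
  data[data[[group_var]] %in% keep, , drop = FALSE]
}

dat <- data.frame(
  examiner = c("A", "A", "A", "B", "C", "C"),
  value    = c(1.2, 1.4, 1.1, 2.0, 1.3, 1.5)
)

# Examiner "B" has only one observation and is therefore excluded
filtered <- drop_small_groups(dat, "examiner", min_obs = 2)
print(filtered)
```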
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If there are fewer observations than specified here for an individual level of a categorical variable, that level will not be shown in the time course plot.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Can be one of:
TRUE: for cross-item_level-groups with MULTIVARIATE_OUTLIER_CHECK empty, do a multivariate outlier check.
FALSE: for cross-item_level-groups with MULTIVARIATE_OUTLIER_CHECK empty, do not do a multivariate outlier check.
"auto": for cross-item_level-groups with MULTIVARIATE_OUTLIER_CHECK empty, do multivariate outlier checks if there is no entry in the column CONTRADICTION_TERM.
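The three settings can be summarized as a small decision function. This is a sketch of the rule as described above, not the package's code: `should_check_outliers()` and its arguments are hypothetical, and returning `NA` for a non-empty metadata cell simply marks the case that this option does not cover.

```r
# Hypothetical sketch of the decision rule for one cross-item group.
# cell:               the group's MULTIVARIATE_OUTLIER_CHECK metadata entry
# option_value:       TRUE, FALSE, or "auto"
# contradiction_term: the group's CONTRADICTION_TERM metadata entry
should_check_outliers <- function(cell, option_value, contradiction_term) {
  is_empty <- function(v) is.na(v) || !nzchar(trimws(as.character(v)))
  if (!is_empty(cell)) {
    return(NA)  # the metadata entry applies; the option is not consulted
  }
  if (identical(option_value, "auto")) {
    return(is_empty(contradiction_term))
  }
  isTRUE(option_value)
}

print(should_check_outliers(NA, "auto", NA))           # no contradiction term
print(should_check_outliers("", "auto", "[A] > [B]"))  # contradiction term set
print(should_check_outliers(NA, TRUE, "[A] > [B]"))
print(should_check_outliers(NA, FALSE, NA))
```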
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
so they are not always including 0 and 1.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Suppress overall distribution in 'margins' figures for binary outcomes
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If study_data comes as a data frame, it may already contain factors. If
a column has the DATA_TYPE integer in the metadata, the factor used to be
converted to integer using as.integer(), which returns the internal level
codes and therefore caused unexpected behavior.
If this option is set to FALSE (the new default), the conversion will
first try to apply as.character(column_data).
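The difference between the two conversions can be reproduced in base R. The following is a minimal sketch with made-up data; it does not call dataquieR itself and only illustrates why converting a factor directly with as.integer() is surprising.

```r
# A factor whose labels look like integers (illustrative data)
f <- factor(c("10", "20", "10"))

# Direct conversion: as.integer() on a factor yields the internal level codes
codes <- as.integer(f)

# Conversion via as.character() first: the labels themselves are converted
values <- as.integer(as.character(f))

print(codes)   # 1 2 1
print(values)  # 10 20 10
```

With the labels "10" and "20", the level codes are 1 and 2, so the direct conversion silently replaces the recorded values.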
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
character use the old type conversion code (slower)
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
As described in dataquieR.study_data_cache_max, different flavors of the study data are cached. This option controls whether a frequently needed set of such flavors is pre-computed and distributed to the compute nodes before a report is computed. This may consume considerable time and RAM, so you can turn the pre-computation off; the individual compute nodes will then still maintain such a cache, but it will only grow on demand, separately on each node. If dataquieR.study_data_cache_max cannot hold all flavors, they may still be pre-computed but immediately discarded.
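A minimal sketch of turning the pre-computation off for one session, using base R's options(). It assumes the option takes a logical value, which the text suggests ("turn the pre-computation off") but does not state explicitly.

```r
# Assumption: FALSE disables the pre-computation of study data flavors
old <- options(dataquieR.precomputeStudyData = FALSE)

# Read the value back to confirm the setting
current <- getOption("dataquieR.precomputeStudyData")
print(current)

# ... compute the report here ...

# Restore the previous setting afterwards
options(old)
```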
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
Multiply the size of parallel compute blocks by this factor. The higher it is set, the less smoothly the progress bar grows, but setting it to a huge number can speed up the rendering process by approx. 10%. Set it either to 1 for full progress control or to a large value (e.g., 1000000) for maximum speed.
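One way to apply a large factor only while a report is rendered is to set the option temporarily. The helper with_options() below is hypothetical (it is not part of dataquieR) and uses only base R; the function passed to it stands in for the actual rendering call.

```r
# Hypothetical helper: set options for the duration of one function call
with_options <- function(opts, code_fun) {
  old <- options(opts)               # set new values, remember the old ones
  on.exit(options(old), add = TRUE)  # restore them even if code_fun() fails
  code_fun()
}

# Placeholder for the rendering step: it only reads the option back
seen <- with_options(
  list(dataquieR.print_block_load_factor = 1000000),
  function() getOption("dataquieR.print_block_load_factor")
)
print(seen)
```

Restoring in on.exit() keeps the smoother progress bar as the default for the rest of the session.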
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
storr back-end, re-use it
If TRUE, the computation won't be repeated if a result already exists in the
output storr.
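A minimal sketch of enabling this behavior and of reading the flag defensively, using base R only. The fallback value FALSE passed to getOption() is an assumption made for this example, not the documented default.

```r
# Enable re-use of results that already exist in the output storr
old <- options(dataquieR.resume_checkpoint = TRUE)

# isTRUE() guards against NULL or non-logical values;
# the fallback FALSE is a placeholder for this sketch
resume <- isTRUE(getOption("dataquieR.resume_checkpoint", FALSE))
print(resume)

# Restore the previous setting
options(old)
```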
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
print()
If TRUE and a report was already partially printed with this option also set to
TRUE, a second call to print() will resume the printing.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If SCALE_LEVEL is not specified in the meta_data, it will be inferred
using a heuristic. This option defines, for numeric variables, the maximum
number of distinct data values for a variable to be considered ordinal.
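The kind of check this threshold feeds into can be sketched as a count of distinct non-missing values. The helper few_distinct_values() is hypothetical and much simpler than the heuristic dataquieR actually applies; the fallback value 8 passed to getOption() is a placeholder for this example, not a documented default.

```r
# Hypothetical helper: does x have at most `limit` distinct non-missing values?
few_distinct_values <- function(x, limit) {
  n_distinct <- length(unique(x[!is.na(x)]))
  n_distinct <= limit
}

# Read the threshold from the option; 8 is only a placeholder fallback
limit <- getOption(
  "dataquieR.scale_level_heuristics_control_binaryrecodelimit", 8
)

x <- c(1, 2, 3, 2, 1, NA, 3)          # three distinct values plus one NA
at_most_3 <- few_distinct_values(x, limit = 3)
at_most_2 <- few_distinct_values(x, limit = 2)
print(at_most_3)  # TRUE
print(at_most_2)  # FALSE
```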
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
If SCALE_LEVEL is not specified in the meta_data, it will be inferred
using a heuristic. This option defines, for numeric variables, the maximum
number of distinct data values for a variable to be considered categorical,
not ordinal.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
dataquieR caches all used flavors of curated study data, e.g., having
missing codes replaced by NAs, having hard limits replaced by NA, ...
For larger sets of study data, this can consume a lot of RAM, so you can
control the maximum size of this cache here. In case of parallel computation,
this cache is also distributed to all compute nodes, which may be very
time-consuming and, with single-node parallelization, even more
RAM-consuming.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
if TRUE, collect metrics on the usage of the study data cache, which is
described in dataquieR.study_data_cache_max. This will not work fully
when running in parallel.
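A minimal sketch of switching this option on for one session, using only base R's options() and getOption(). Restoring the previous value afterwards is a suggestion of this sketch, not a requirement stated by the package.

```r
# Enable collection of cache metrics; options() returns the previous values.
old <- options(dataquieR.study_data_cache_metrics = TRUE)

# Confirm the setting before computing a report.
metrics_enabled <- isTRUE(getOption("dataquieR.study_data_cache_metrics"))
print(metrics_enabled)

# Restore the previous setting when done.
options(old)
```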
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
the environment in which metrics are stored if the option
dataquieR.study_data_cache_metrics has been set to TRUE.
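As an illustration, the following sketch passes a freshly created environment to this option and inspects it afterwards. It assumes that any ordinary R environment is accepted; the names of the stored entries are not documented here, so the sketch only lists whatever is present.

```r
# A fresh environment to receive the metrics (assumption: any environment works).
metrics_env <- new.env(parent = emptyenv())

old <- options(
  dataquieR.study_data_cache_metrics = TRUE,
  dataquieR.study_data_cache_metrics_env = metrics_env
)

# ... compute a report here ...

# List whatever was stored; no particular entry names are assumed.
stored <- ls(metrics_env, all.names = TRUE)
print(stored)

# Restore the previous settings.
options(old)
```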
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env_default,
dataquieR.study_data_cache_quick_fill
dataquieR.study_data_cache_metrics_env_default
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill
as described in dataquieR.precomputeStudyData, different flavors of
the study data are cached. This option controls whether, before a report
is computed, only a set of frequently needed flavors is pre-computed
or simply all possible flavors. It has no effect if pre-computation
has been turned off.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Other study_data_cache:
dataquieR.ignore_empty_vars,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_metrics_env_default
if TRUE, colnames(study_data) are first replaced by the capitalization used in
the metadata, using case-insensitive matching.
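The idea of case-insensitive matching can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation; the data and the vector meta_names are made up for the example.

```r
# Toy data: column names differ from the metadata only in capitalization.
study_data <- data.frame(age = c(34, 51), SEX = c("f", "m"), extra = c(1, 2))
meta_names <- c("Age", "Sex")  # hypothetical variable names from the metadata

# Match lower-cased column names against lower-cased metadata names.
idx <- match(tolower(colnames(study_data)), tolower(meta_names))

# Columns with a match adopt the metadata capitalization; others stay unchanged.
matched <- !is.na(idx)
colnames(study_data)[matched] <- meta_names[idx[matched]]

print(colnames(study_data))
```

Columns without a counterpart in the metadata (here: extra) keep their original names in this sketch.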
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
Caveat: Needs a lot of memory.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.type_adjust_parallel,
progress_init_fkt
character try to do type adjustments in parallel, but only if
dq_report2() was called with cores = 2 or higher.
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
progress_init_fkt
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
to be deprecated
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel,
progress_init_fkt
works on variable groups (cross-item_level), which are expected to show
a Pearson correlation
des_scatterplot_matrix(
  label_col,
  study_data,
  item_level = "item_level",
  meta_data_cross_item = "cross-item_level",
  meta_data = item_level,
  meta_data_v2,
  cross_item_level,
  `cross-item_level`
)
| Argument | Description |
|---|---|
| label_col | variable attribute the name of the column in the metadata with labels of variables |
| study_data | data.frame the data frame that contains the measurements |
| item_level | data.frame the data frame that contains metadata attributes of study data |
| meta_data_cross_item | |
| meta_data | data.frame old name for |
| meta_data_v2 | character path to workbook like metadata file, see |
| cross_item_level | data.frame alias for |
| `cross-item_level` | data.frame alias for |
Descriptor # TODO: This can be an indicator
a list with the slots:
SummaryPlotList: for each variable group a ggplot2::ggplot object with
pairwise correlation plots
SummaryData: table with columns VARIABLE_LIST, cors,
max_cor, min_cor
SummaryTable: like SummaryData, but machine readable and with
stable column names.
## Not run:
devtools::load_all()
prep_load_workbook_like_file("meta_data_v2")
des_scatterplot_matrix("study_data")
## End(Not run)
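To make the returned correlation columns more concrete, the following base R sketch computes pairwise Pearson correlations for one toy variable group and derives their maximum and minimum. The variable names, the separator used in VARIABLE_LIST, and the layout of the result row are assumptions for illustration; the actual function may structure its output differently.

```r
# Toy variable group; the column names are hypothetical.
set.seed(1)
x <- rnorm(50)
group <- data.frame(SBP = x, DBP = x + rnorm(50, sd = 0.5), PULSE = rnorm(50))

# Pairwise Pearson correlations, using all complete pairs of observations.
cors <- cor(group, method = "pearson", use = "pairwise.complete.obs")

# Each pair once: take the upper triangle of the symmetric matrix.
pair_cors <- cors[upper.tri(cors)]

# One summary row per variable group (separator chosen for this sketch only).
summary_row <- data.frame(
  VARIABLE_LIST = paste(colnames(group), collapse = " | "),
  max_cor = max(pair_cors),
  min_cor = min(pair_cors)
)
print(summary_row)
```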
generates a descriptive overview of the variables in resp_vars.
des_summary(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  hard_limits_removal = getOption("dataquieR.des_summary_hard_lim_remove",
    dataquieR.des_summary_hard_lim_remove_default),
  ...
)
| Argument | Description |
|---|---|
| resp_vars | variable the name of the measurement variables |
| study_data | data.frame the data frame that contains the measurements |
| label_col | variable attribute the name of the column in the metadata with labels of variables |
| item_level | data.frame the data frame that contains metadata attributes of study data |
| meta_data | data.frame old name for |
| meta_data_v2 | character path to workbook like metadata file, see |
| hard_limits_removal | logical if TRUE values outside hard limits are removed from the data before calculating descriptive statistics. The default is FALSE |
| ... | arguments to be passed to all called indicator functions if applicable. |
TODO
a list with:
SummaryTable: data.frame
SummaryData: data.frame
## Not run:
xx <- des_summary(study_data = "study_data", meta_data_v2 = "meta_data_v2")
xx$SummaryData
## End(Not run)
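The effect of hard_limits_removal on a descriptive statistic can be sketched with base R alone. The helper mean_within_limits() and the limits below are hypothetical and only illustrate the principle; the fallback FALSE mirrors the documented default.

```r
# Hypothetical helper: mean of a variable, optionally after removing values
# outside the hard limits. Not part of dataquieR.
mean_within_limits <- function(values, lower, upper,
                               remove_outside = getOption(
                                 "dataquieR.des_summary_hard_lim_remove", FALSE)) {
  if (isTRUE(remove_outside)) {
    # Values outside [lower, upper] become NA before the statistic is computed.
    values[!is.na(values) & (values < lower | values > upper)] <- NA
  }
  mean(values, na.rm = TRUE)
}

values <- c(12, 47, 53, 250, -3)  # toy measurements

mean_kept <- mean_within_limits(values, lower = 0, upper = 120,
                                remove_outside = FALSE)
mean_removed <- mean_within_limits(values, lower = 0, upper = 120,
                                   remove_outside = TRUE)
print(c(kept = mean_kept, removed = mean_removed))
```

With removal switched on, the two implausible values no longer influence the mean.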
generates a descriptive overview of the categorical variables (nominal and
ordinal) in resp_vars.
des_summary_categorical(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  hard_limits_removal = getOption("dataquieR.des_summary_hard_lim_remove",
    dataquieR.des_summary_hard_lim_remove_default),
  ...
)
| Argument | Description |
|---|---|
| resp_vars | variable the name of the categorical measurement variable |
| study_data | data.frame the data frame that contains the measurements |
| label_col | variable attribute the name of the column in the metadata with labels of variables |
| item_level | data.frame the data frame that contains metadata attributes of study data |
| meta_data | data.frame old name for |
| meta_data_v2 | character path to workbook like metadata file, see |
| hard_limits_removal | logical if TRUE values outside hard limits are removed from the data before calculating descriptive statistics. The default is FALSE |
| ... | arguments to be passed to all called indicator functions if applicable. |
TODO
a list with:
SummaryTable: data.frame
SummaryData: data.frame
## Not run:
prep_load_workbook_like_file("meta_data_v2")
xx <- des_summary_categorical(study_data = "study_data",
  meta_data = prep_get_data_frame("item_level"))
util_html_table(xx$SummaryData)
util_html_table(des_summary_categorical(
  study_data = prep_get_data_frame("study_data"),
  meta_data = prep_get_data_frame("item_level"))$SummaryData)
## End(Not run)
generates a descriptive overview of continuous variables (ratio and interval) in resp_vars.
des_summary_continuous(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  hard_limits_removal = getOption("dataquieR.des_summary_hard_lim_remove",
    dataquieR.des_summary_hard_lim_remove_default),
  ...
)
| Argument | Description |
|---|---|
| resp_vars | variable the name of the continuous measurement variable |
| study_data | data.frame the data frame that contains the measurements |
| label_col | variable attribute the name of the column in the metadata with labels of variables |
| item_level | data.frame the data frame that contains metadata attributes of study data |
| meta_data | data.frame old name for |
| meta_data_v2 | character path to workbook like metadata file, see |
| hard_limits_removal | logical if TRUE values outside hard limits are removed from the data before calculating descriptive statistics. The default is FALSE |
| ... | arguments to be passed to all called indicator functions if applicable. |
TODO
a list with:
SummaryTable: data.frame
SummaryData: data.frame
## Not run:
prep_load_workbook_like_file("meta_data_v2")
xx <- des_summary_continuous(study_data = "study_data",
  meta_data = prep_get_data_frame("item_level"))
xx$SummaryData
## End(Not run)
Name of the data frame
DF_CODE
Number of expected data elements in a data frame. numeric. Check only conducted if number entered
DF_ELEMENT_COUNT
The name of the data frame containing the reference IDs to be compared with the IDs in the study data set.
DF_ID_REF_TABLE
All variables that are to be used as one single ID variable (combined key) in a data frame.
DF_ID_VARS
Name of the data frame
DF_NAME
The type of check to be conducted when comparing the reference ID table with the IDs delivered in the study data files.
DF_RECORD_CHECK
Number of expected data records in a data frame. numeric. Check only conducted if number entered
DF_RECORD_COUNT
Defines expectancies on the uniqueness of the IDs across the rows of a data frame, or the number of times some ID can be repeated.
DF_UNIQUE_ID
Specifies whether identical data is permitted across rows in a data frame (excluding ID variables)
DF_UNIQUE_ROWS
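To make the data frame level attributes more concrete, here is a small base-R sketch of such a metadata table together with a record count check that is skipped when no number was entered. The column names are taken from the attribute list above; the cell values, the " | " separator for combined keys, and the helper `check_record_count()` are assumptions made for this illustration only.

```r
# Hypothetical data frame level metadata; all values are invented.
dataframe_level <- data.frame(
  DF_NAME          = c("cars", "lab_data"),
  DF_CODE          = c("CARS", "LAB"),
  DF_ELEMENT_COUNT = c(2, NA),
  DF_RECORD_COUNT  = c(50, NA),
  DF_ID_VARS       = c("ID", "ID | VISIT"),  # separator is an assumption
  stringsAsFactors = FALSE
)

# Hypothetical helper: compare the observed with the expected record count.
check_record_count <- function(df, expected) {
  if (is.na(expected)) {
    return(NA)  # no number entered, so the check is not conducted
  }
  nrow(df) == expected
}

ok_cars <- check_record_count(cars, dataframe_level$DF_RECORD_COUNT[1])
ok_lab  <- check_record_count(cars, dataframe_level$DF_RECORD_COUNT[2])
```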
Get the dimensions of a dq_report2 result
## S3 method for class 'dataquieR_resultset2'
dim(x)
x |
a |
dimensions
a vector of data quality dimensions. The supported dimensions are Completeness, Consistency and Accuracy.
dimensions
Only a definition, not a function, so no return value
Names of a dataquieR report object (v2.0)
## S3 method for class 'dataquieR_resultset2'
dimnames(x)
x |
the result object |
the names
The order matters, because it defines the order in the dq_report2.
dims
util_html_for_var()
util_html_for_dims()
uniform For uniform distribution
normal For Gaussian distribution
gamma For a gamma distribution
DISTRIBUTIONS
Deprecated
dq_report(...)
... |
Deprecated |
Deprecated
Generate a stratified full DQ report
dq_report_by(
  study_data,
  item_level = "item_level",
  meta_data_segment = "segment_level",
  meta_data_dataframe = "dataframe_level",
  meta_data_cross_item = "cross-item_level",
  meta_data_item_computation = "item_computation_level",
  missing_tables = NULL,
  label_col,
  meta_data_v2,
  segment_column = NULL,
  strata_column = NULL,
  strata_select = NULL,
  selection_type = NULL,
  segment_select = NULL,
  segment_exclude = NULL,
  strata_exclude = NULL,
  subgroup = NULL,
  resp_vars = character(0),
  id_vars = NULL,
  advanced_options = list(),
  storr_factory = NULL,
  amend = FALSE,
  checkpoint_resumed = getOption("dataquieR.resume_checkpoint", dataquieR.resume_checkpoint_default),
  ...,
  output_dir = NULL,
  input_dir = NULL,
  also_print = FALSE,
  force_overwrite = FALSE,
  disable_plotly = FALSE,
  view = TRUE,
  meta_data = item_level,
  cross_item_level,
  `cross-item_level`,
  segment_level,
  dataframe_level,
  item_computation_level,
  author = prep_get_user_name(),
  title = ifelse(is.null(output_dir), "Data quality report Bundle", paste0(basename(output_dir))),
  subtitle = as.character(Sys.Date()),
  user_info = NULL
)
study_data |
data.frame the data frame that contains the measurements:
it can be an R object (e.g., |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_dataframe |
data.frame – optional if |
meta_data_cross_item |
data.frame – optional: Cross-item level metadata |
meta_data_item_computation |
data.frame – optional: Computed items metadata |
missing_tables |
character the name of the data frame containing the
missing codes, it can be a vector if more
than one table is provided. Example:
|
label_col |
variable attribute the name of the column in the metadata containing the labels of the variables |
meta_data_v2 |
character path or file name of the workbook like
metadata file, see
|
segment_column |
variable attribute name of a metadata attribute usable to split the report in sections of variables, e.g. all blood-pressure related variables. By default, reports are split by STUDY_SEGMENT if available and if none of segment_column, strata_column, or subgroup is defined. To create an un-split report, explicitly pass the argument 'segment_column = NULL' |
strata_column |
variable name of a study variable to stratify the
report by, e.g. the study centers.
Both labels and |
strata_select |
character if given, the strata of strata_column are limited to the content of this vector. A character vector or a regular expression can be provided (e.g., "^a.*$"). This argument can not be used if no strata_column is provided |
selection_type |
character optional, can only be specified if a
|
segment_select |
character if given, the levels of segment_column are limited to the content of this vector. A character vector or a regular expression (e.g., ".*_EXAM$") can be provided. This argument can not be used if no segment_column is provided. |
segment_exclude |
character optional, can only be specified if a
|
strata_exclude |
character optional, can only be specified if a
|
subgroup |
character optional, to define subgroups of cases. Rules are
to be written as |
resp_vars |
variable the names of the measurement variables, if
missing or |
id_vars |
variable a vector containing the name/s of the variables
containing ids, to
be used to merge multiple data frames if provided
in |
advanced_options |
list options to set during report computation,
see |
storr_factory |
function |
amend |
logical if there is already data in. |
checkpoint_resumed |
logical if using a |
... |
arguments to be passed through to dq_report or dq_report2 |
output_dir |
character if given, the output is not returned but saved in this directory |
input_dir |
character if given, the study data files that have
no path and that are not URL are searched in
this directory. Also |
also_print |
logical if |
force_overwrite |
logical force to overwrite |
disable_plotly |
logical do not use |
view |
logical open the returned report |
meta_data |
data.frame old name for |
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
segment_level |
data.frame alias for |
dataframe_level |
data.frame alias for |
item_computation_level |
data.frame alias for
|
author |
character author for the report bundle's documents. |
title |
character optional argument to specify the title for the data quality report bundle |
subtitle |
character optional argument to specify a subtitle for the data quality report bundle |
user_info |
list additional info stored with the report bundle, e.g., comments, title, ... |
A named list of named lists of dq_report2 reports, returned
invisibly unless view = TRUE. If output_dir is given, the result
is still returned (invisibly), and optionally opened in a browser
(view = TRUE, also_print = TRUE).
## Not run:
# really long-running example.
prep_load_workbook_like_file("meta_data_v2")
rep <- dq_report_by("study_data", label_col = LABEL,
  strata_column = "CENTER_0")
rep <- dq_report_by("study_data", label_col = LABEL,
  strata_column = "CENTER_0", segment_column = NULL)
unlink("/tmp/testRep/", force = TRUE, recursive = TRUE)
dq_report_by("study_data", label_col = LABEL,
  strata_column = "CENTER_0", segment_column = STUDY_SEGMENT,
  output_dir = "/tmp/testRep")
unlink("/tmp/testRep/", force = TRUE, recursive = TRUE)
dq_report_by("study_data", label_col = LABEL,
  strata_column = "CENTER_0", segment_column = NULL,
  output_dir = "/tmp/testRep")
dq_report_by("study_data", label_col = LABEL,
  segment_column = STUDY_SEGMENT, output_dir = "/tmp/testRep")
dq_report_by("study_data", label_col = LABEL,
  segment_column = STUDY_SEGMENT, output_dir = "/tmp/testRep",
  also_print = TRUE)
dq_report_by(study_data = "study_data", meta_data_v2 = "meta_data_v2",
  advanced_options = list(dataquieR.study_data_cache_max = 0,
    dataquieR.study_data_cache_metrics = TRUE,
    dataquieR.study_data_cache_metrics_env = environment()),
  cores = NULL, dimensions = "int")
dq_report_by(study_data = "study_data", meta_data_v2 = "meta_data_v2",
  advanced_options = list(dataquieR.study_data_cache_max = 0),
  cores = NULL, dimensions = "int")
## End(Not run)
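The idea behind strata_column and strata_select can be sketched in base R: the study data are split by the values of one variable, and the resulting strata are optionally restricted by a regular expression. This is only an illustration of the concept under invented data; it does not reproduce how dq_report_by resolves labels, selection_type, or exclusions.

```r
# Invented study data; CENTER_0 plays the role of the strata_column.
study_data <- data.frame(
  CENTER_0 = c("Berlin", "Hamburg", "Berlin", "Bonn", "Hamburg"),
  SBP_0    = c(120, 135, 128, 117, 141),
  stringsAsFactors = FALSE
)

strata_select <- "^B.*$"  # keep only strata whose name starts with "B"

# one data frame per stratum
strata <- split(study_data, study_data$CENTER_0)

# restrict the strata to those matching the regular expression
selected <- strata[grepl(strata_select, names(strata))]
selected_names <- names(selected)
```

Each element of `selected` could then be passed to a report function on its own, which is conceptually what a stratified report bundle contains.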
Generate a full DQ report, v2
dq_report2(
  study_data,
  item_level = "item_level",
  label_col = LABEL,
  meta_data_segment = "segment_level",
  meta_data_dataframe = "dataframe_level",
  meta_data_cross_item = "cross-item_level",
  meta_data_item_computation = "item_computation_level",
  meta_data = item_level,
  meta_data_v2,
  ...,
  dimensions = c("Completeness", "Consistency"),
  cores = list(mode = "socket", logging = FALSE, cpus = util_detect_cores(), load.balancing = TRUE),
  ignore_empty_vars = getOption("dataquieR.ignore_empty_vars", dataquieR.ignore_empty_vars_default),
  specific_args = list(),
  advanced_options = list(),
  author = prep_get_user_name(),
  title = "Data quality report",
  subtitle = as.character(Sys.Date()),
  user_info = NULL,
  debug_parallel = FALSE,
  resp_vars = character(0),
  filter_indicator_functions = character(0),
  exclude_indicator_functions = character(0),
  filter_result_slots = c("^Summary", "^Segment", "^DataTypePlotList", "^ReportSummaryTable", "^Dataframe", "^Result", "^VariableGroup"),
  mode = c("default", "futures", "queue", "parallel"),
  mode_args = list(),
  notes_from_wrapper = list(),
  storr_factory = NULL,
  amend = FALSE,
  cross_item_level,
  `cross-item_level`,
  segment_level,
  dataframe_level,
  item_computation_level,
  .internal = rlang::env_inherits(rlang::caller_env(), parent.env(environment())),
  checkpoint_resumed = getOption("dataquieR.resume_checkpoint", dataquieR.resume_checkpoint_default),
  name_of_study_data,
  dt_adjust = as.logical(getOption("dataquieR.dt_adjust", dataquieR.dt_adjust_default)),
  output_dir = NULL,
  force_overwrite = FALSE
)
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_dataframe |
data.frame – optional: Data frame level metadata |
meta_data_cross_item |
data.frame – optional: Cross-item level metadata |
meta_data_item_computation |
data.frame optional. computation rules for computed variables. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
arguments to be passed to all called indicator functions if applicable. |
dimensions |
dimensions Vector of dimensions to address in the report. Allowed values in the vector are Completeness, Consistency, and Accuracy. The generated report will only cover the listed data quality dimensions. Accuracy is computationally expensive, so this dimension is not enabled by default. Completeness should be included if Consistency is included, and Consistency should be included if Accuracy is included, to avoid misleading detections of, e.g., missing codes as outliers. Please refer to the data quality concept for more details. Integrity is always included. If dimensions is equal to NULL or "all", all dimensions will be covered. |
cores |
integer number of CPU cores to use, or a named list with arguments for parallelMap::parallelStart, or NULL if parallel has already been started by the caller. Can also be a cluster. |
ignore_empty_vars |
enum TRUE | FALSE | auto. See dataquieR.ignore_empty_vars. |
specific_args |
list named list of arguments specifically for one of the called functions. The names of the list elements correspond to the indicator functions whose calls should be modified. The elements are lists of arguments. |
advanced_options |
list options to set during report computation,
see |
author |
character author for the report documents. |
title |
character optional argument to specify the title for the data quality report |
subtitle |
character optional argument to specify a subtitle for the data quality report |
user_info |
list additional info stored with the report, e.g., comments, title, ... |
debug_parallel |
logical print blocks currently evaluated in parallel |
resp_vars |
variable list the name of the measurement variables for the report. If missing, all variables will be used. Only item level indicator functions are filtered, so far. |
filter_indicator_functions |
character regular expressions, only if an indicator function's name matches one of these, it'll be used for the report. If of length zero, no filtering is performed. |
exclude_indicator_functions |
character regular expressions, if an indicator function's name matches one of these, it'll be excluded from the report. If of length zero, no filtering is performed. |
filter_result_slots |
character regular expressions, only if an indicator function's result's name matches one of these, it'll be used for the report. If of length zero, no filtering is performed. |
mode |
character work mode for parallel execution. default is "default", the values mean:
|
mode_args |
list of arguments for the selected |
notes_from_wrapper |
list a list containing notes about changed labels
by |
storr_factory |
function |
amend |
logical if there is already data in. |
cross_item_level |
data.frame alias for |
`cross-item_level` |
data.frame alias for |
segment_level |
data.frame alias for |
dataframe_level |
data.frame alias for |
item_computation_level |
data.frame alias for
|
.internal |
logical internal use, only. |
checkpoint_resumed |
logical if using a |
name_of_study_data |
character name for study data inside the report, internal use. |
dt_adjust |
logical whether to trust data types in the study data. if
|
output_dir |
character if |
force_overwrite |
logical force to overwrite |
See dq_report_by for a way to generate stratified or split reports easily.
a dataquieR_resultset2 that can be
printed, creating an HTML report.
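The arguments filter_indicator_functions and exclude_indicator_functions both take regular expressions that are matched against function names. A minimal base-R sketch of that selection logic follows. It assumes that exclusion is applied after inclusion, which the argument descriptions do not state explicitly, and the helper `matches_any()` is hypothetical.

```r
# Function names taken from this manual; the patterns are invented examples.
indicator_functions <- c("int_datatype_matrix", "int_duplicate_ids",
                         "int_duplicate_content", "int_encoding_errors",
                         "des_summary_continuous")
filter_indicator_functions  <- c("^int_duplicate", "^des_")
exclude_indicator_functions <- c("content$")

# Hypothetical helper: TRUE where a name matches at least one pattern.
matches_any <- function(patterns, x) {
  if (length(patterns) == 0) {
    return(rep(FALSE, length(x)))
  }
  Reduce(`|`, lapply(patterns, grepl, x = x))
}

# A filter of length zero means that no filtering is performed.
keep <- if (length(filter_indicator_functions) == 0) {
  rep(TRUE, length(indicator_functions))
} else {
  matches_any(filter_indicator_functions, indicator_functions)
}
keep <- keep & !matches_any(exclude_indicator_functions, indicator_functions)

selected <- indicator_functions[keep]
```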
Remove unused levels from ReportSummaryTable
## S3 method for class 'ReportSummaryTable'
droplevels(x, ...)
x |
|
... |
not used. |
ReportSummaryTable with all (NA or 0)-columns removed
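What "all (NA or 0)-columns removed" means can be shown on a plain data frame. The sketch below assumes a table with one identifying column followed by indicator columns; this layout and the column names are invented, and the real ReportSummaryTable class carries more structure than shown here.

```r
# Invented stand-in for a summary table: first column identifies the variable.
tab <- data.frame(
  Variables = c("v1", "v2"),
  ind_a = c(0, 2),    # has a non-zero entry, is kept
  ind_b = c(NA, 0),   # only NA or 0, is dropped
  ind_c = c(NA, NA),  # only NA, is dropped
  stringsAsFactors = FALSE
)

# TRUE for indicator columns that contain nothing but NA or 0
is_empty <- vapply(tab[-1], function(col) all(is.na(col) | col == 0),
                   logical(1))

# always keep the identifying column, drop the empty indicator columns
reduced <- tab[c(TRUE, !is_empty)]
```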
These S3/S7 methods make dq_lazy_ggplot/dq_lazy_ggplot_s7
objects work smoothly with
functions from ggplot2 and plotly. They simply materialize
the underlying ggplot object and then delegate to the respective
generic.
ggplotGrob.dq_lazy_ggplot(x, ...)
ggplotly.dq_lazy_ggplot_s7(p, ...)
plotly_build.dq_lazy_ggplot_s7(p, ...)
ggplotly.dq_lazy_ggplot(p, ...)
plotly_build.dq_lazy_ggplot(p, ...)
ggplotGrob.dq_lazy_ggplot_s7(x, ...)
x, p
|
A |
... |
Further arguments passed on to the underlying generic. |
The return value is the same as for the corresponding generic:
ggplotGrob() returns a gtable object.
ggplotly() returns a plotly object.
plotly_build() returns a plotly_proxy or similar.
ggplotGrob,
plotly::ggplotly
plotly::plotly_build
Defines the measurement variable to be used as a known gold standard. Only one variable can be defined as the gold standard.
GOLDSTANDARD
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
grid.draw method for util_pairs_ggplot_panels objects
## S3 method for class 'util_pairs_ggplot_panels'
grid.draw(x, ...)
x |
An object of class |
... |
Ignored. |
HTML Dependency for report headers in clipboard
html_dependency_clipboard()
the dependency
Generate all dependencies used in static dataquieR reports
html_dependency_dataquieR(iframe = FALSE)
iframe |
logical |
the dependency
Provides jsPDF for use in Shiny or RMarkdown via htmltools.
html_dependency_jspdf()
An htmltools::htmlDependency() object
HTML Dependency for report headers in DT::datatable
html_dependency_report_dt()
the dependency
HTML Dependency for tippy
html_dependency_tippy()
the dependency
HTML Dependency for vertical headers in DT::datatable
html_dependency_vert_dt()
the dependency
This function tests for unexpected elements and records, as well as duplicated identifiers and content. The unexpected element record check can be conducted by providing the number of expected records or an additional table with the expected records. It is possible to conduct the checks by study segments or to consider only selected segments.
int_all_datastructure_dataframe(
  meta_data_dataframe = "dataframe_level",
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  dataframe_level
)
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
a list with
DataframeTable: data frame with selected check results, used for the data quality report.
## Not run:
out_dataframe <- int_all_datastructure_dataframe(
  meta_data_dataframe = "meta_data_dataframe",
  meta_data = "ship_meta"
)
md0 <- prep_get_data_frame("ship_meta")
md0
md0$VAR_NAMES
md0$VAR_NAMES[[1]] <- "Id" # is this mismatch reported -- is the data frame
                           # also reported, if nothing is wrong with it
out_dataframe <- int_all_datastructure_dataframe(
  meta_data_dataframe = "meta_data_dataframe",
  meta_data = md0
)
# This is the "normal" procedure for inside pipeline
# but outside this function checktype is exact by default
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "subset_u")
lapply(setNames(nm = prep_get_data_frame("meta_data_dataframe")$DF_NAME),
  int_sts_element_dataframe, meta_data = md0)
md0$VAR_NAMES[[1]] <- "id" # is this mismatch reported -- is the data frame
                           # also reported, if nothing is wrong with it
lapply(setNames(nm = prep_get_data_frame("meta_data_dataframe")$DF_NAME),
  int_sts_element_dataframe, meta_data = md0)
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "exact")
## End(Not run)
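The example above renames one entry of VAR_NAMES to provoke an element mismatch. Why a change in letter case alone is enough can be seen with plain set operations in base R. The sketch below is not the package's check and does not model the different check types; the metadata vector is invented.

```r
study_data <- cars                # built-in data set with columns speed, dist
var_names  <- c("speed", "Dist")  # invented VAR_NAMES, note the capital "D"

# elements present in the data but not declared in the metadata
unexpected_in_data <- setdiff(names(study_data), var_names)

# elements declared in the metadata but absent from the data
missing_from_data <- setdiff(var_names, names(study_data))

# names are compared case-sensitively, so "dist" and "Dist" do not match
exact_match <- setequal(names(study_data), var_names)
```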
This function tests for unexpected elements and records, as well as duplicated identifiers and content. The unexpected element record check can be conducted by providing the number of expected records or an additional table with the expected records. It is possible to conduct the checks by study segments or to consider only selected segments.
int_all_datastructure_segment(
  study_data,
  label_col,
  item_level = "item_level",
  meta_data = item_level,
  meta_data_v2,
  segment_level,
  meta_data_segment = "segment_level"
)
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
meta_data_segment |
data.frame the data frame that contains the metadata for the segment level, mandatory |
a list with
SegmentTable: data frame with selected check results, used for the data quality report.
## Not run:
out_segment <- int_all_datastructure_segment(
  meta_data_segment = "meta_data_segment",
  study_data = "ship",
  meta_data = "ship_meta"
)
study_data <- cars
meta_data <- dataquieR::prep_create_meta(VAR_NAMES = c("speedx", "distx"),
  DATA_TYPE = c("integer", "integer"), MISSING_LIST = "|", JUMP_LIST = "|",
  STUDY_SEGMENT = c("Intro", "Ex"))
out_segment <- int_all_datastructure_segment(
  meta_data_segment = "meta_data_segment",
  study_data = study_data,
  meta_data = meta_data
)
## End(Not run)
Checks the data types of the study data against the data types declared in the metadata
int_datatype_matrix(
  resp_vars = NULL,
  study_data,
  label_col,
  item_level = "item_level",
  split_segments = FALSE,
  max_vars_per_plot = 20,
  threshold_value = 0,
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable the names of the measurement variables, if
missing or |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
split_segments |
logical return one matrix per study segment |
max_vars_per_plot |
integer from=0. The maximum number of variables per single plot. |
threshold_value |
numeric from=0 to=100. percentage of failing conversions allowed for a study variable to still be classified as convertible. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
This is a preparatory support function that compares study data with associated metadata. A prerequisite of this function is that the number of columns in the study data matches the number of rows in the metadata.
For each study variable, the function looks up the data type declared in the static metadata and returns a heatmap-like matrix indicating data type mismatches in the study data.
List function.
a list with:
SummaryTable: data frame containing data quality check for
"data type mismatch" (CLS_int_vfe_type,
PCT_int_vfe_type). The following categories are possible:
categories: "Non-matching datatype",
"Non-Matching datatype, convertible",
"Matching datatype"
SummaryData: data frame containing data quality check for
"data type mismatch" for a report
SummaryPlot: ggplot2::ggplot2 heatmap plot, graphical representation of
SummaryTable
DataTypePlotList: list of plots per (maybe artificial) segment
ReportSummaryTable: data frame underlying SummaryPlot
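The three categories and the role of threshold_value can be sketched for a single variable declared as integer. This is a simplified base-R illustration under stated assumptions, not the logic of int_datatype_matrix: the helper `classify_integer_column()` is hypothetical, only the integer case is covered, and a conversion is counted as failed when it yields NA or changes the value.

```r
# Hypothetical helper for a column whose declared DATA_TYPE is "integer".
classify_integer_column <- function(x, threshold_value = 0) {
  if (is.integer(x)) {
    return("Matching datatype")
  }
  non_missing <- x[!is.na(x)]
  as_text   <- trimws(as.character(non_missing))
  converted <- suppressWarnings(as.integer(as_text))
  # failed: not convertible at all, or value altered (e.g. "1.5" becomes 1)
  failed <- is.na(converted) | as.character(converted) != as_text
  pct_failed <- if (length(non_missing) == 0) 0 else 100 * mean(failed)
  if (pct_failed <= threshold_value) {
    "Non-Matching datatype, convertible"
  } else {
    "Non-matching datatype"
  }
}

res_int     <- classify_integer_column(c(1L, 2L, 3L))
res_text    <- classify_integer_column(c("1", "2", "3"))
res_strict  <- classify_integer_column(c("1", "x", "3", "4"))
res_lenient <- classify_integer_column(c("1", "x", "3", "4"),
                                       threshold_value = 30)
```

In the last two calls, one of four values (25 percent) cannot be converted, so the classification depends on whether threshold_value tolerates that share.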
This function tests for duplicate entries in the data set. It is possible to check duplicated entries by study segments or to consider only selected segments.
int_duplicate_content(
  level = c("dataframe", "segment"),
  study_data,
  item_level = "item_level",
  label_col,
  meta_data = item_level,
  meta_data_v2,
  ...
)
level |
character a character vector indicating whether the assessment should be conducted at the study level (level = "dataframe") or at the segment level (level = "segment"). |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
Depending on |
a list. Depending on level, see
util_int_duplicate_content_segment or
util_int_duplicate_content_dataframe for a description of the outputs.
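A duplicated content check in the sense described here, identical rows once the ID variables are set aside, can be sketched with base R's `duplicated()`. The data and the choice of ID variable are invented, and the sketch ignores segments.

```r
# Invented study data; ID is the only ID variable.
study_data <- data.frame(
  ID  = 1:5,
  AGE = c(34, 51, 34, 47, 51),
  SEX = c("f", "m", "f", "m", "f"),
  stringsAsFactors = FALSE
)
id_vars <- "ID"

# compare rows on content only, i.e. without the ID variables
content <- study_data[, setdiff(names(study_data), id_vars), drop = FALSE]

# flag every member of a group of identical rows, not only the repeats
is_dup <- duplicated(content) | duplicated(content, fromLast = TRUE)

dup_ids <- study_data$ID[is_dup]
```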
This function tests for duplicate entries in identifiers. It is possible to check duplicated identifiers by study segments or to consider only selected segments.
int_duplicate_ids(
  level = c("dataframe", "segment"),
  study_data,
  item_level = "item_level",
  label_col,
  meta_data = item_level,
  meta_data_v2,
  ...
)
level |
character a character vector indicating whether the assessment should be conducted at the study level (level = "dataframe") or at the segment level (level = "segment"). |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
Depending on |
a list. Depending on level, see
util_int_duplicate_ids_segment or
util_int_duplicate_ids_dataframe for a description of the outputs.
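Whether an identifier counts as duplicated depends on which variables form the key. The following base-R sketch contrasts a single ID variable with a combined key, using invented data; it is meant as an illustration of the concept, not of the package's output.

```r
# Invented visit data: participants can have several visits.
visits <- data.frame(
  ID    = c("a", "a", "b", "b", "c"),
  VISIT = c(1, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# single ID variable: every repeated participant ID counts as a duplicate
n_dup_single <- sum(duplicated(visits["ID"]))

# combined key: only repeated (ID, VISIT) pairs count as duplicates
n_dup_combined <- sum(duplicated(visits[c("ID", "VISIT")]))
```

Here participant "b" appears twice with the same visit number, which is the only duplicate under the combined key.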
Detects errors in the character encoding of string variables
int_encoding_errors(
  resp_vars = NULL,
  study_data,
  label_col,
  meta_data_dataframe = "dataframe_level",
  item_level = "item_level",
  ref_encs,
  meta_data = item_level,
  meta_data_v2,
  dataframe_level
)
resp_vars |
variable the names of the measurement variables, if
missing or |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
item_level |
data.frame the data frame that contains metadata attributes of study data |
ref_encs |
reference encodings (names are |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
Strings are stored based on code tables, nowadays, typically as UTF-8. However, other code systems are still in use, so, sometimes, strings from different systems are mixed in the data. This indicator checks for such problems and returns the count of entries per variable, that do not match the reference coding system, which is estimated from the study data (addition of metadata field is planned).
If the encoding is not specified in the metadata (column ENCODING at the item or data frame level), it is guessed from the data. Otherwise, it may be any
supported encoding as returned by iconvlist().
a list with:
SummaryTable: data.frame with information on such problems
SummaryData: data.frame human readable version of SummaryTable
FlaggedStudyData: data.frame tells for each entry in study data if
its encoding is OK. has the same dimensions as
study_data
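The idea of counting entries that do not fit a reference encoding can be sketched in base R. This is only an illustration under the assumption that UTF-8 is the reference encoding; `count_invalid_utf8` is a hypothetical helper and not how int_encoding_errors is implemented.

```r
# Hypothetical helper for illustration; not part of dataquieR.
# Counts, per character column, the non-missing entries that are not valid UTF-8.
count_invalid_utf8 <- function(df) {
  chr_cols <- names(df)[vapply(df, is.character, logical(1))]
  vapply(df[chr_cols], function(col) {
    sum(!validUTF8(col[!is.na(col)]))
  }, integer(1))
}

# "Mueller" with u-umlaut, encoded as Latin-1 bytes: not a valid UTF-8 byte sequence
latin1_name <- rawToChar(as.raw(c(0x4d, 0xfc, 0x6c, 0x6c, 0x65, 0x72)))

study <- data.frame(
  id = 1:3,
  name = c("Smith", latin1_name, "M\u00fcller"),
  stringsAsFactors = FALSE
)

invalid_counts <- count_invalid_utf8(study)
print(invalid_counts)
```

Only the Latin-1 entry is counted; the third name is the same word stored as UTF-8 and passes the check.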
For each participant, check, if an observation was expected, given the
PART_VARS from item-level metadata
int_part_vars_structure(label_col, study_data, item_level = "item_level", expected_observations = c("HIERARCHY", "SEGMENT"), disclose_problem_paprt_var_data = FALSE, meta_data = item_level, meta_data_v2)
label_col |
character mapping attribute |
study_data |
study_data must have all relevant |
item_level |
meta_data must be complete to avoid false positives on
non-existing |
expected_observations |
enum HIERARCHY | SEGMENT. How should
|
disclose_problem_paprt_var_data |
logical show the problematic data
( |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
empty list, so far – the function only warns.
Depends on dataquieR.ELEMENT_MISSMATCH_CHECKTYPE option, see there
int_sts_element_dataframe(item_level = "item_level", meta_data_dataframe = "dataframe_level", meta_data = item_level, meta_data_v2, check_type = getOption("dataquieR.ELEMENT_MISSMATCH_CHECKTYPE", dataquieR.ELEMENT_MISSMATCH_CHECKTYPE_default), dataframe_level)
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
check_type |
enum none | exact | subset_u | subset_m. See dataquieR.ELEMENT_MISSMATCH_CHECKTYPE |
dataframe_level |
data.frame alias for |
list with named slots:
DataframeData: data frame with the unexpected elements check results.
DataframeTable: data.frame table with all errors, used for the data quality report:
- PCT_int_sts_element: Percentage of element
mismatches
- NUM_int_sts_element: Number of element
mismatches
- resp_vars: affected element names
## Not run:
prep_load_workbook_like_file("~/tmp/df_level_test.xlsx")
meta_data_dataframe <- "dataframe_level"
meta_data <- "item_level"
## End(Not run)
Depends on the dataquieR.ELEMENT_MISSMATCH_CHECKTYPE option, see there.
int_sts_element_segment(study_data, item_level = "item_level", label_col, meta_data = item_level, meta_data_v2)
study_data |
data.frame the data frame that contains the measurements, mandatory. |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
a list with
SegmentData: data frame with the unexpected elements check results.
- Segment: name of the corresponding segment,
if applicable, ALL otherwise
SegmentTable: data frame with the unexpected elements check results, used for the data quality report.
- Segment: name of the corresponding segment,
if applicable, ALL otherwise
## Not run:
# Both variable names mismatch, two segments
study_data <- cars
meta_data <- dataquieR::prep_create_meta(VAR_NAMES = c("speedx", "distx"), DATA_TYPE = c("integer", "integer"), MISSING_LIST = "|", JUMP_LIST = "|", STUDY_SEGMENT = c("Intro", "Ex"))
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "none")
int_sts_element_segment(study_data, meta_data)
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "exact")
int_sts_element_segment(study_data, meta_data)

# Both variable names mismatch, one segment
study_data <- cars
meta_data <- dataquieR::prep_create_meta(VAR_NAMES = c("speedx", "distx"), DATA_TYPE = c("integer", "integer"), MISSING_LIST = "|", JUMP_LIST = "|", STUDY_SEGMENT = c("Intro", "Intro"))
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "none")
int_sts_element_segment(study_data, meta_data)
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "exact")
int_sts_element_segment(study_data, meta_data)

# One variable name mismatches, one segment
study_data <- cars
meta_data <- dataquieR::prep_create_meta(VAR_NAMES = c("speed", "distx"), DATA_TYPE = c("integer", "integer"), MISSING_LIST = "|", JUMP_LIST = "|", STUDY_SEGMENT = c("Intro", "Intro"))
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "none")
int_sts_element_segment(study_data, meta_data)
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "exact")
int_sts_element_segment(study_data, meta_data)
## End(Not run)
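What an element mismatch means can be illustrated without the package. The following base R sketch compares the element names declared in a made-up metadata vector with the column names of `cars`; it assumes a plain two-way set comparison and does not reproduce the function's check types.

```r
# Element names as they might be declared in metadata (illustrative values)
expected_elements <- c("speed", "distx")
# Element names actually present in the study data
observed_elements <- names(cars)

mismatch <- list(
  missing_in_data = setdiff(expected_elements, observed_elements),
  unexpected_in_data = setdiff(observed_elements, expected_elements)
)
n_mismatch <- length(unlist(mismatch))
print(mismatch)
```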
This function contrasts the expected element number in each study in the metadata with the actual element number in each study data frame.
int_unexp_elements(identifier_name_list, data_element_count, meta_data_dataframe = "dataframe_level", meta_data_v2, dataframe_level)
identifier_name_list |
character a character vector indicating the name of each study data frame, mandatory. |
data_element_count |
integer an integer vector with the number of expected data elements, mandatory. |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
a list with
DataframeData: data frame with the results of the quality check for unexpected data elements
DataframeTable: data frame with selected unexpected data elements check results, used for the data quality report.
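The comparison of expected and actual element numbers can be sketched in base R as follows. The expected counts and the column names of the result table are invented for illustration and are not the names used in DataframeData.

```r
# Expected number of data elements per data frame (illustrative values)
expected_count <- c(cars = 2L, iris = 4L)
# Actual number of columns in each data frame
observed_count <- c(cars = ncol(cars), iris = ncol(iris))

element_check <- data.frame(
  data_frame = names(expected_count),
  expected = unname(expected_count),
  observed = unname(observed_count),
  mismatch = unname(expected_count != observed_count),
  stringsAsFactors = FALSE
)
print(element_check)
```

The same pattern with `nrow()` instead of `ncol()` gives a record-count comparison.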
This function contrasts the expected record number in each study in the metadata with the actual record number in each study data frame.
int_unexp_records_dataframe(identifier_name_list, data_record_count, meta_data_dataframe = "dataframe_level", meta_data_v2, dataframe_level)
identifier_name_list |
character a character vector indicating the name of each study data frame, mandatory. |
data_record_count |
integer an integer vector with the number of expected data records per study data frame, mandatory. |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
a list with
DataframeData: data frame with the results of the quality check for unexpected data records
DataframeTable: data frame with selected unexpected data records check results, used for the data quality report.
This function contrasts the expected record number in each study segment in the metadata with the actual record number in each segment data frame.
int_unexp_records_segment(study_segment, study_data, label_col, item_level = "item_level", data_record_count, meta_data = item_level, meta_data_segment = "segment_level", meta_data_v2, segment_level)
study_segment |
character a character vector indicating the name of each study data frame, mandatory. |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
data_record_count |
integer an integer vector with the number of expected data records, mandatory. |
meta_data |
data.frame old name for |
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
The current implementation does not take jump or missing codes into account; instead, the function checks whether NAs are present in the study data.
a list with
SegmentData: data frame with the results of the quality check for unexpected data records
SegmentTable: data frame with selected unexpected data records check results, used for the data quality report.
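An NA-based record count per segment could look like the following base R sketch. It assumes that a record counts for a segment as soon as at least one of the segment's variables is not NA; this counting rule, the segment names and the expected counts are assumptions for illustration.

```r
study_data <- data.frame(
  id  = 1:5,
  sbp = c(120, NA, 130, NA, 125),
  dbp = c(80, NA, 85, NA, NA),
  q1  = c(1, 2, NA, NA, 3)
)

# Assignment of variables to study segments (illustrative)
segments <- list(EXAM = c("sbp", "dbp"), QUEST = "q1")
# Expected record counts per segment (illustrative)
expected <- c(EXAM = 3L, QUEST = 4L)

# A record is counted if any variable of the segment is observed (not NA)
observed <- vapply(segments, function(vars) {
  sum(rowSums(!is.na(study_data[, vars, drop = FALSE])) > 0)
}, integer(1))

record_mismatch <- observed != expected[names(observed)]
print(record_mismatch)
```

Because jump or missing codes are stored as regular values, they would count as observed under this rule.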
This function tests that the identifiers match a provided record set. It is possible to check for unexpected data record sets by study segments or to consider only selected segments.
int_unexp_records_set(level = c("dataframe", "segment"), study_data, item_level = "item_level", label_col, meta_data = item_level, meta_data_v2, ...)
level |
character a character vector indicating whether the assessment should be conducted at the study level (level = "dataframe") or at the segment level (level = "segment"). |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
Depending on |
a list. Depending on level, see
util_int_unexp_records_set_segment or
util_int_unexp_records_set_dataframe for a description of the outputs.
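Matching identifiers against a provided record set amounts to a set comparison. A minimal base R sketch with invented identifiers, not the internal implementation:

```r
# Identifiers that are expected according to a provided record set
expected_ids <- c("A1", "A2", "A3", "A4")
# Identifiers found in the study data
observed_ids <- c("A1", "A3", "A5")

id_check <- list(
  missing_records = setdiff(expected_ids, observed_ids),
  unexpected_records = setdiff(observed_ids, expected_ids)
)
print(id_check)
```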
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
TOTRESPT
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAXIMUM_LONG_STRING,
MISS_RESP,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
TOTRESPT
Select, whether to compute acc_mahalanobis.
MAHALANOBIS_THRESHOLD
You can leave the cell empty; the behavior then depends on the setting of the
option dataquieR.MULTIVARIATE_OUTLIER_CHECK. If this column is missing,
this is the same as having all cells empty and
dataquieR.MULTIVARIATE_OUTLIER_CHECK set to "auto".
See also MULTIVARIATE_OUTLIER_CHECKTYPE.
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAHALANOBIS_RATIO,
MISS_RESP,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
TOTRESPT
Variable level metadata.
further details on variable level metadata.
item_computation_level sheet
Computation rules TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_cross
cross-item_level sheet
Metadata describing groups of variables, e.g., for their multivariate distribution or for defining contradiction rules.
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation
meta_data_dataframe sheet
Metadata describing data delivered on one data frame/table sheet, e.g., a full questionnaire, not its items.
meta_data_segment sheet
Metadata describing study segments, e.g., a full questionnaire, not its items.
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
TOTRESPT
Name of the sheet with rules to introduce missing codes in the pipeline
MISSING_CODE_RULES
Select, whether to compute acc_multivariate_outlier.
MULTIVARIATE_OUTLIER_CHECK
You can leave the cell empty; the behavior then depends on the setting of the
option dataquieR.MULTIVARIATE_OUTLIER_CHECK. If this column is missing,
this is the same as having all cells empty and
dataquieR.MULTIVARIATE_OUTLIER_CHECK set to "auto".
See also MULTIVARIATE_OUTLIER_CHECKTYPE.
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Select, which outlier criteria to compute, see acc_multivariate_outlier.
MULTIVARIATE_OUTLIER_CHECKTYPE
You can leave the cell empty, then, all checks will apply. If you enter
a set of methods, the maximum for N_RULES changes. See also
UNIVARIATE_OUTLIER_CHECKTYPE.
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
names implementation for the class dataquieR_translated
dataquieR's translated texts featuring access to the language keys, still.
this function returns the language keys.
## S3 replacement method for class 'dataquieR_translated'
names(x) <- value
x |
|
value |
the names to assign |
only setNames(nm = x) is allowed for convenience. Any other assignment
would mean to change the language keys, so this is not allowed.
names of the underlying character vector
base::as.character
return the number of result slots in a report
nres(x)
x |
the |
the number of used result slots
Deprecated
pipeline_recursive_result(...)
... |
Deprecated |
Deprecated
Deprecated
pipeline_vectorized(...)
... |
Deprecated |
Deprecated
Plot a dataquieR summary
## S3 method for class 'dataquieR_summary'
plot(x, y, ..., filter, dont_plot = FALSE, stratify_by, vars_to_include = "study", disable_plotly = FALSE, hierarchy, folder_of_report = NULL, var_uniquenames = NULL)
x |
the |
y |
not yet used |
... |
not yet used |
filter |
if given, this filters the summary, e.g.,
|
dont_plot |
suppress the actual plotting, just return a printable
object derived from |
stratify_by |
column to stratify the summary, may be one string. |
vars_to_include |
character study | |
disable_plotly |
logical do not use |
hierarchy |
not yet defined, but if an argument is given, a
sunburst chart is displayed, currently, only |
folder_of_report |
a named vector with the location of variable and
|
var_uniquenames |
a data frame with the original variable names and the unique names in case of reports created with dq_report_by containing the same variable in several reports (e.g., creation of reports by sex) |
invisible html object
Data quality indicator checks "Unexpected location" with histograms and plots of empirical cumulative distributions for the subgroups.
prep_acc_distributions_with_ecdf(resp_vars = NULL, group_vars = NULL, study_data, label_col, item_level = "item_level", meta_data = item_level, meta_data_v2, n_group_max = getOption("dataquieR.max_group_var_levels_in_plot", dataquieR.max_group_var_levels_in_plot_default), n_obs_per_group_min = getOption("dataquieR.min_obs_per_group_var_in_plot", dataquieR.min_obs_per_group_var_in_plot_default))
resp_vars |
variable list the name of the measurement variable |
group_vars |
variable list the name of the observer, device or reader variable |
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
n_group_max |
maximum number of categories to be displayed individually
for the grouping variable ( |
n_obs_per_group_min |
minimum number of data points per group to create
a graph for an individual category of the |
A SummaryPlot.
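The empirical cumulative distributions per subgroup can be approximated in base R with `stats::ecdf()`. The sketch below uses simulated data and a hypothetical minimum group size to mimic the role of n_obs_per_group_min; it is not the plotting code of the package.

```r
set.seed(1)
d <- data.frame(
  value = c(rnorm(30, 120, 10), rnorm(30, 125, 10), rnorm(3, 150, 10)),
  observer = rep(c("A", "B", "C"), times = c(30, 30, 3))
)

min_obs_per_group <- 5  # hypothetical stand-in for n_obs_per_group_min

# Split the measurements by group and drop groups that are too small
groups <- split(d$value, d$observer)
groups <- groups[lengths(groups) >= min_obs_per_group]

# One empirical cumulative distribution function per remaining group
ecdfs <- lapply(groups, ecdf)

# Proportion of values <= 120 in each group
prop_below_120 <- vapply(ecdfs, function(f) f(120), numeric(1))
print(prop_below_120)
```

Groups whose curves are shifted against the others hint at an unexpected location for that observer, device or reader.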
The function has two working modes. If replace_meta_data is TRUE (the
default if cause_label_df contains a column
named resp_vars), then the missing/jump codes in
meta_data[, c(MISSING_CODES, JUMP_CODES)] will be overwritten; otherwise,
they will be labeled using the cause_label_df.
prep_add_cause_label_df(item_level = "item_level", cause_label_df, label_col = VAR_NAMES, assume_consistent_codes = TRUE, replace_meta_data = ("resp_vars" %in% colnames(cause_label_df)), meta_data = item_level, meta_data_v2)
item_level |
data.frame the data frame that contains metadata attributes of study data |
cause_label_df |
data.frame missing code table. If missing codes have labels the respective data frame can be specified here, see cause_label_df |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
assume_consistent_codes |
logical if TRUE and no labels are given and the same missing/jump code is used for more than one variable, the labels assigned for this code will be the same for all variables. |
replace_meta_data |
logical if |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
If a column resp_vars exists, then rows with a value in resp_vars will
only be used for the corresponding variable.
data.frame updated metadata including all the code labels in missing/jump lists
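The labeling mode can be pictured with a small base R sketch. It assumes a code table with the columns CODE_VALUE and CODE_LABEL and a pipe-separated code list in which labels are attached with "="; the CODE_LABEL column name, the codes and the output format are assumptions for illustration.

```r
# Missing code table (illustrative content)
cause_label_df <- data.frame(
  CODE_VALUE = c(99980, 99983),
  CODE_LABEL = c("Refused", "Not assessable"),
  stringsAsFactors = FALSE
)

# Unlabeled missing list as it might appear in one metadata cell
missing_list <- "99980 | 99983"

# Split the list into codes and look up a label for each code
codes <- trimws(strsplit(missing_list, "|", fixed = TRUE)[[1]])
labels <- cause_label_df$CODE_LABEL[match(as.numeric(codes), cause_label_df$CODE_VALUE)]

labeled_list <- paste(paste(codes, labels, sep = " = "), collapse = " | ")
print(labeled_list)
```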
Insert missing codes for NAs based on rules
prep_add_computed_variables(study_data, meta_data, label_col, rules, use_value_labels)
study_data |
data.frame the data frame that contains the measurements |
meta_data |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
rules |
data.frame with the columns:
|
use_value_labels |
logical In rules for factors, use the value labels,
not the codes. Defaults to |
a list with the entry:
ModifiedStudyData: Study data with the new variables
## Not run:
study_data <- prep_get_data_frame("ship")
prep_load_workbook_like_file("ship_meta_v2")
meta_data <- prep_get_data_frame("item_level")
rules <- tibble::tribble(
  ~VAR_NAMES, ~COMPUTATION_RULE,
  "BMI", '[BODY_WEIGHT_0]/(([BODY_HEIGHT_0]/100)^2)',
  "R", '[WAIST_CIRC_0]/2/[pi]', # in m^3
  "VOL_EST", '[pi]*([WAIST_CIRC_0]/2/[pi])^2*[BODY_HEIGHT_0] / 1000', # in l
)
r <- prep_add_computed_variables(study_data, meta_data, label_col = "LABEL", rules, use_value_labels = FALSE)
## End(Not run)
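To see what a rule such as the BMI rule computes, it can be evaluated by hand in base R. The sketch strips the square brackets and evaluates the expression within a toy data frame; this is a simplification for illustration and not the rule parser used by the package.

```r
rule <- "[BODY_WEIGHT_0]/(([BODY_HEIGHT_0]/100)^2)"

# Turn "[NAME]" references into plain variable names and parse the result
expr <- parse(text = gsub("\\[([^]]+)\\]", "\\1", rule))[[1]]

# Toy data: weight in kg, height in cm
toy_data <- data.frame(
  BODY_WEIGHT_0 = c(70, 90),
  BODY_HEIGHT_0 = c(175, 180)
)

# Evaluate the rule with the data frame columns as variables
toy_data$BMI <- eval(expr, toy_data)
print(toy_data)
```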
Data frames added here can then be referred to by their names: wherever dataquieR expects
a data.frame, just pass a character instead. If no data frame of that name is
found, dataquieR additionally looks for files with that name and for
URLs. You can also refer to specific sheets of a workbook or specific
objects from an RData file by appending a pipe symbol and the sheet or object name. A second
pipe symbol allows extracting certain columns from such sheets (the result
remains a data frame).
prep_add_data_frames(..., data_frame_list = list(), append = FALSE)
... |
data frames, if passed with names, these will be the names of these tables in the data frame environment. If not, then the names in the calling environment will be used. |
data_frame_list |
a named list with data frames. Also these will be
added and names will be handled as for the |
append |
logical if a data frame already exists in the cache (by name), extend the existing one |
data.frame invisible(the cache environment)
Other data-frame-cache:
prep_get_data_frame(),
prep_list_dataframes(),
prep_load_folder_with_metadata(),
prep_load_workbook_like_file(),
prep_purge_data_frame_cache(),
prep_remove_from_cache()
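The pipe notation for references can be illustrated by splitting such a string into its parts. `parse_reference` is a hypothetical helper written for this sketch, and the file, sheet and column names in the usage line are made up; the package resolves references internally and may treat edge cases differently.

```r
# Hypothetical helper: split "source|sheet|columns" into its components
parse_reference <- function(ref) {
  parts <- trimws(strsplit(ref, "|", fixed = TRUE)[[1]])
  list(
    source  = parts[1],
    sheet   = if (length(parts) >= 2) parts[2] else NA_character_,
    columns = if (length(parts) >= 3) parts[3] else NA_character_
  )
}

ref_full  <- parse_reference("my_metadata.xlsx | item_level | VAR_NAMES")
ref_plain <- parse_reference("item_level")
str(ref_full)
```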
Insert missing codes for NAs based on rules
prep_add_missing_codes(resp_vars, study_data, meta_data_v2, item_level = "item_level", label_col, rules, use_value_labels = NA, overwrite = FALSE, meta_data = item_level)
resp_vars |
variable list the name of the measurement variables to be
modified, all from |
study_data |
data.frame the data frame that contains the measurements |
meta_data_v2 |
character path to workbook like metadata file, see
|
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
rules |
data.frame with the columns:
|
use_value_labels |
logical In rules for factors, use the value labels,
not the codes. Defaults to |
overwrite |
logical Also insert missing codes, if the values are not
|
meta_data |
data.frame old name for |
a list with the entries:
ModifiedStudyData: Study data with NAs replaced by the CODE_VALUE
ModifiedMetaData: Metadata having the new codes amended in the columns
JUMP_LIST or MISSING_LIST, respectively
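The effect of such a rule can be shown in base R. The sketch replaces NAs in one variable by a code whenever a condition on another variable holds; the variables, the condition and the code value are invented, and the rule is written directly in R instead of the rule table the function expects.

```r
study_data <- data.frame(
  sex = c("m", "f", "f", "m"),
  pregnant = c(NA, 1, NA, NA),
  stringsAsFactors = FALSE
)

code_value <- 88880  # illustrative jump code

# Rule: the question does not apply to male participants
rule_applies <- study_data$sex == "m"

# Only NAs are replaced, existing values are kept
fill <- is.na(study_data$pregnant) & rule_applies
study_data$pregnant[fill] <- code_value
print(study_data)
```

The NA of the third participant stays untouched because the rule does not apply to her.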
adds an annotation to static metadata
prep_add_to_meta(VAR_NAMES, DATA_TYPE, LABEL, VALUE_LABELS, item_level = "item_level", meta_data = item_level, meta_data_v2, ...)
VAR_NAMES |
character Names of the Variables to add |
DATA_TYPE |
character Data type for the added variables |
LABEL |
character Labels for these variables |
VALUE_LABELS |
character Value labels for the values of the variables
as usually pipe separated and assigned with
|
item_level |
data.frame the metadata to extend |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
... |
Further defined variable attributes, see prep_create_meta |
Add metadata, e.g., of transformed or new variables. This function is not yet considered stable, but it is already exported because it can be helpful. Therefore, there are still some inconsistencies in the formals.
a data frame with amended metadata.
meta_data
Re-Code labels with their respective codes according to the meta_data
prep_apply_coding(study_data, meta_data_v2, item_level = "item_level", meta_data = item_level)
study_data |
data.frame the data frame that contains the measurements |
meta_data_v2 |
character path to workbook like metadata file, see
|
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
data.frame modified study data with labels replaced by the codes
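Replacing labels by their codes is essentially a lookup. The base R sketch below assumes value labels in a pipe-separated "code = label" form; that format and the example values are assumptions for illustration.

```r
# Value labels as they might be given in metadata (assumed format)
value_labels <- "1 = male | 2 = female"

# Split into "code = label" pairs, then into codes and labels
entries <- trimws(strsplit(value_labels, "|", fixed = TRUE)[[1]])
pairs <- strsplit(entries, "=", fixed = TRUE)
codes <- as.integer(trimws(vapply(pairs, `[`, character(1), 1)))
labels <- trimws(vapply(pairs, `[`, character(1), 2))

# Study data column that contains labels instead of codes
sex_labelled <- c("female", "male", NA, "female")

# Look up the code for each label; NA stays NA
sex_coded <- codes[match(sex_labelled, labels)]
print(sex_coded)
```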
Check for package updates
prep_check_for_dataquieR_updates(beta = FALSE, deps = TRUE, ask = interactive())
beta |
logical check for beta version too |
deps |
logical check for missing (optional) dependencies |
ask |
logical ask for updates |
invisible(NULL)
if possible, mismatching data types are converted ("true" becomes TRUE)
prep_check_meta_data_dataframe(meta_data_dataframe = "dataframe_level", meta_data_v2, dataframe_level)
meta_data_dataframe |
data.frame data frame or path/url of a metadata sheet for the data frame level |
meta_data_v2 |
character path to workbook like metadata file, see
|
dataframe_level |
data.frame alias for |
Missing columns are added and filled with NA where this is valid, i.e., not
for DF_NAME, which is the key column.
standardized metadata sheet as data frame
## Not run:
mds <- prep_check_meta_data_dataframe("ship_meta_dataframe|dataframe_level") # also converts
print(mds)
prep_check_meta_data_dataframe(mds)
mds1 <- mds
mds1$DF_RECORD_COUNT <- NULL
print(prep_check_meta_data_dataframe(mds1)) # fixes the missing column by NAs
mds1 <- mds
mds1$DF_UNIQUE_ROWS[[2]] <- "xxx" # not convertible
# print(prep_check_meta_data_dataframe(mds1)) # fail
mds1 <- mds
mds1$DF_UNIQUE_ID[[2]] <- 12
# print(prep_check_meta_data_dataframe(mds1)) # fail
## End(Not run)
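The two repairs described here, type conversion and adding missing columns, can be imitated in base R. The sketch reuses column names from the example above on a made-up sheet; the list of expected columns is an assumption, and the real function performs more checks.

```r
# A metadata sheet with a logical column stored as text and one column missing
mds <- data.frame(
  DF_NAME = c("study_a", "study_b"),
  DF_UNIQUE_ROWS = c("true", "false"),
  stringsAsFactors = FALSE
)

# "true"/"false" are converted to TRUE/FALSE
mds$DF_UNIQUE_ROWS <- as.logical(mds$DF_UNIQUE_ROWS)

# Columns that are expected but absent are added and filled with NA
expected_cols <- c("DF_NAME", "DF_UNIQUE_ROWS", "DF_RECORD_COUNT")  # assumed subset
for (col in setdiff(expected_cols, names(mds))) {
  mds[[col]] <- NA
}

# A value such as "xxx" cannot be converted to a logical
not_convertible <- is.na(as.logical("xxx"))
print(mds)
```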
if possible, mismatching data types are converted ("true" becomes TRUE)
prep_check_meta_data_segment(meta_data_segment = "segment_level", meta_data_v2, segment_level)
meta_data_segment |
data.frame data frame or path/url of a metadata sheet for the segment level |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
Missing columns are added and filled with NA where this is valid, i.e., not
for STUDY_SEGMENT, which is the key column.
standardized metadata sheet as data frame
## Not run:
mds <- prep_check_meta_data_segment("ship_meta_v2|segment_level") # also converts
print(mds)
prep_check_meta_data_segment(mds)
mds1 <- mds
mds1$SEGMENT_RECORD_COUNT <- NULL
print(prep_check_meta_data_segment(mds1)) # fixes the missing column by NAs
mds1 <- mds
mds1$SEGMENT_UNIQUE_ROWS[[2]] <- "xxx" # not convertible
# print(prep_check_meta_data_segment(mds1)) # fail
## End(Not run)
This function verifies whether a data frame complies with the metadata conventions and
provides the richness of meta information required by level.
prep_check_meta_names( item_level = "item_level", level, character.only = FALSE, meta_data = item_level, meta_data_v2 )
item_level |
data.frame the data frame that contains metadata attributes of study data |
level |
enum level of requirement (see also VARATT_REQUIRE_LEVELS).
set to |
character.only |
logical a logical indicating whether level can be assumed to be character strings. |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Note that only the given level is checked, even though the levels are somewhat hierarchical.
a logical with:
invisible(TRUE). In case of problems with the metadata, a condition is
raised (stop()).
## Not run:
prep_check_meta_names(data.frame(VAR_NAMES = 1, DATA_TYPE = 2, MISSING_LIST = 3))

prep_check_meta_names(
  data.frame(
    VAR_NAMES = 1, DATA_TYPE = 2, MISSING_LIST = 3,
    LABEL = "LABEL", VALUE_LABELS = "VALUE_LABELS",
    JUMP_LIST = "JUMP_LIST", HARD_LIMITS = "HARD_LIMITS",
    GROUP_VAR_OBSERVER = "GROUP_VAR_OBSERVER",
    GROUP_VAR_DEVICE = "GROUP_VAR_DEVICE",
    TIME_VAR = "TIME_VAR",
    PART_VAR = "PART_VAR",
    STUDY_SEGMENT = "STUDY_SEGMENT",
    LOCATION_RANGE = "LOCATION_RANGE",
    LOCATION_METRIC = "LOCATION_METRIC",
    PROPORTION_RANGE = "PROPORTION_RANGE",
    MISSING_LIST_TABLE = "MISSING_LIST_TABLE",
    CO_VARS = "CO_VARS",
    LONG_LABEL = "LONG_LABEL"
  ),
  RECOMMENDED
)

prep_check_meta_names(
  data.frame(
    VAR_NAMES = 1, DATA_TYPE = 2, MISSING_LIST = 3,
    LABEL = "LABEL", VALUE_LABELS = "VALUE_LABELS",
    JUMP_LIST = "JUMP_LIST", HARD_LIMITS = "HARD_LIMITS",
    GROUP_VAR_OBSERVER = "GROUP_VAR_OBSERVER",
    GROUP_VAR_DEVICE = "GROUP_VAR_DEVICE",
    TIME_VAR = "TIME_VAR",
    PART_VAR = "PART_VAR",
    STUDY_SEGMENT = "STUDY_SEGMENT",
    LOCATION_RANGE = "LOCATION_RANGE",
    LOCATION_METRIC = "LOCATION_METRIC",
    PROPORTION_RANGE = "PROPORTION_RANGE",
    DETECTION_LIMITS = "DETECTION_LIMITS",
    SOFT_LIMITS = "SOFT_LIMITS",
    CONTRADICTIONS = "CONTRADICTIONS",
    DISTRIBUTION = "DISTRIBUTION",
    DECIMALS = "DECIMALS",
    VARIABLE_ROLE = "VARIABLE_ROLE",
    DATA_ENTRY_TYPE = "DATA_ENTRY_TYPE",
    CO_VARS = "CO_VARS",
    END_DIGIT_CHECK = "END_DIGIT_CHECK",
    VARIABLE_ORDER = "VARIABLE_ORDER",
    LONG_LABEL = "LONG_LABEL",
    recode = "recode",
    MISSING_LIST_TABLE = "MISSING_LIST_TABLE"
  ),
  OPTIONAL
)

# Next one will fail
try(
  prep_check_meta_names(data.frame(VAR_NAMES = 1, DATA_TYPE = 2, MISSING_LIST = 3), TECHNICAL)
)
## End(Not run)
Adjust labels in meta_data to be valid variable names in formulas for
various R functions, such as glm or lme4::lmer.
prep_clean_labels( label_col, item_level = "item_level", no_dups = FALSE, meta_data = item_level, meta_data_v2 )
label_col |
character label attribute to adjust or character vector to
adjust, depending on |
item_level |
data.frame metadata data frame: If |
no_dups |
logical disallow duplicates in input or output vectors of
the function, then, prep_clean_labels would call
|
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
Hint: The following is still true, but the functions should now be capable of applying any needed fixes automatically on the fly, so you will likely not need this function any more.
Currently, labels as given by the label_col argument of most functions
are used directly in formulas, so that they naturally become part of the
outputs. Different models, however, expect differently strict syntax for such
formulas, especially for valid variable names. prep_clean_labels removes
all potentially inadmissible characters from variable names (there is no guarantee
that some exotic model will not still reject the names, but the number
of exotic characters is minimized). As a consequence, variable names are modified and may become
unreadable or indistinguishable from other variable names. For the latter
case, a stop call is possible, controlled by the no_dups argument.
A warning is emitted if modifications were necessary.
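The idea described above can be sketched in base R. This is a minimal illustration, not the implementation used by prep_clean_labels: the helper name clean_labels_sketch is hypothetical, and base R's make.names() is swapped in as the cleaning rule, so the exact output of the package function may differ.

```r
# Hypothetical helper (assumption): illustrates turning free-text labels into
# syntactically valid names. It is NOT the code behind prep_clean_labels.
clean_labels_sketch <- function(labels, no_dups = FALSE) {
  # make.names() replaces inadmissible characters by "." and prefixes
  # names that do not start with a letter or a dot
  cleaned <- make.names(labels, unique = FALSE)
  if (no_dups && anyDuplicated(cleaned) > 0) {
    stop("cleaning produced duplicated labels")
  }
  if (!identical(cleaned, labels)) {
    warning("labels were modified to be usable in formulas")
  }
  cleaned
}

labels <- c("syst. Blood pressure (mmHg) 1", "1st heart frequency in MHz")
cleaned <- suppressWarnings(clean_labels_sketch(labels))

# each cleaned label now parses as a single variable in a formula
f <- stats::reformulate(cleaned[[2]], response = cleaned[[1]])
```

Two different labels can collapse to the same cleaned name, which is the situation the no_dups switch is meant to catch.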
a data.frame with:
if meta_data is set, a list with:
modified meta_data[, label_col] column
if meta_data is not set, adjusted labels that then were directly given
in label_col
## Not run:
meta_data1 <- data.frame(
  LABEL = c(
    "syst. Blood pressure (mmHg) 1",
    "1st heart frequency in MHz",
    "body surface (\\u33A1)"
  )
)
print(meta_data1)
print(prep_clean_labels(meta_data1$LABEL))
meta_data1 <- prep_clean_labels("LABEL", meta_data1)
print(meta_data1)
## End(Not run)
Combine two report summaries
prep_combine_report_summaries(..., summaries_list, amend_segment_names = FALSE)
... |
objects returned by prep_extract_summary |
summaries_list |
if given, list of objects returned by prep_extract_summary |
amend_segment_names |
logical use names of the |
combined summaries
Other summary_functions:
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
Are the provided item-level meta_data plausible given study_data?
prep_compare_meta_with_study( study_data, label_col, item_level = "item_level", meta_data = item_level, meta_data_v2 )
study_data |
data.frame the data frame that contains the measurements |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
an invisible() list with the entries.
pred data.frame metadata predicted from study_data,
reduced to such metadata also available in the provided metadata
prov data.frame provided metadata,
reduced to such metadata also available in the provided study_data
ml_error character VAR_NAMES of variables with potentially wrong
MISSING_LIST
sl_error character VAR_NAMES of variables with potentially wrong
SCALE_LEVEL
dt_error character VAR_NAMES of variables with potentially wrong
DATA_TYPE
Create a metadata data frame and map names.
Generally, this function only creates a data.frame. However, by using
this constructor instead of calling
data.frame(..., stringsAsFactors = FALSE), it becomes possible to adapt
the metadata data.frame in later developments, e.g., if we decide to use
classes for the metadata, or if certain standard names of variable attributes
change. A validity check can also be implemented here.
prep_create_meta(..., stringsAsFactors = FALSE, level, character.only = FALSE)
... |
named column vectors; names will be mapped using WELL_KNOWN_META_VARIABLE_NAMES, if included in WELL_KNOWN_META_VARIABLE_NAMES. Can also be a data frame; then its column names will be mapped using WELL_KNOWN_META_VARIABLE_NAMES |
stringsAsFactors |
logical if the argument is a list of vectors, a
data frame will be
created. In this case, |
level |
enum level of requirement (see also VARATT_REQUIRE_LEVELS)
set to |
character.only |
logical a logical indicating whether level can be assumed to be character strings. |
For now, this calls data.frame, but it already renames variable attributes,
if they have a different name assigned in WELL_KNOWN_META_VARIABLE_NAMES,
e.g. WELL_KNOWN_META_VARIABLE_NAMES$RECODE maps to recode in lower case.
NB: dataquieR exports all names from WELL_KNOWN_META_VARIABLE_NAMES as
symbols, so RECODE also contains "recode".
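The renaming step can be sketched as a plain lookup. The lookup list and the helper map_meta_names_sketch below are hypothetical stand-ins; only the RECODE to recode mapping is taken from the text above, and the real WELL_KNOWN_META_VARIABLE_NAMES object is not reproduced here.

```r
# Hypothetical lookup table (assumption): only the RECODE entry comes from
# the documentation; the real WELL_KNOWN_META_VARIABLE_NAMES is larger.
well_known_sketch <- list(RECODE = "recode", VAR_NAMES = "VAR_NAMES")

# Rename every column whose name is a key of the lookup table
map_meta_names_sketch <- function(df, lookup) {
  hit <- names(df) %in% names(lookup)
  names(df)[hit] <- unlist(lookup[names(df)[hit]], use.names = FALSE)
  df
}

md <- data.frame(
  VAR_NAMES = c("v1", "v2"),
  RECODE = c(NA, "x"),
  stringsAsFactors = FALSE
)
md <- map_meta_names_sketch(md, well_known_sketch)
names(md)
```

Columns without an entry in the lookup table keep their names, so additional user-defined attributes would pass through unchanged in this sketch.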
a data frame with:
metadata attribute names mapped and
metadata checked using prep_check_meta_names, plus some more verification of conventions (such as a check for valid intervals in limits)
WELL_KNOWN_META_VARIABLE_NAMES
Instantiate a new metadata file
prep_create_meta_data_file( file_name, study_data, open = TRUE, overwrite = FALSE )
file_name |
character file path to write to |
study_data |
data.frame optional, study data to guess metadata from |
open |
logical open the file after creation |
overwrite |
logical overwrite |
invisible(NULL)
Create a factory function for storr objects for backing
a dataquieR_resultset2
prep_create_storr_factory(db_dir = tempfile(), namespace = "objects")
db_dir |
character path to the directory for the back-end, if one is created on the fly. |
namespace |
character namespace for the report, so that one back-end can back several reports the returned function will try to create a |
storr object or NULL, if package storr is not available
Get data types from data
prep_datatype_from_data( resp_vars = colnames(study_data), study_data, .dont_cast_off_cols = FALSE, guess_character = getOption("dataquieR.guess_character", default = dataquieR.guess_character_default) )
resp_vars |
variable names of the variables to fetch the data type from the data |
study_data |
data.frame the data frame that contains the measurements Hint: Only data frames supported, no URL or file names. |
.dont_cast_off_cols |
logical internal use, only |
guess_character |
logical guess a data type for character columns based on the values |
vector of data types
## Not run:
dataquieR::prep_datatype_from_data(cars)
## End(Not run)
Convert two vectors from a code-value-table to a key-value list
prep_deparse_assignments( codes, labels = codes, split_char = SPLIT_CHAR, mode = c("numeric_codes", "string_codes") )
codes |
codes, numeric or dates (as default, but string codes can be enabled using the option 'mode', see below) |
labels |
character labels, same length as codes |
split_char |
character split character character to split code assignments |
mode |
character one of two options to insist on numeric or datetime codes (default) or to allow for string codes |
a vector with assignment strings for each row of
cbind(codes, labels)
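A rough base-R sketch of the row-wise conversion follows. The helper deparse_assignments_sketch is hypothetical, the "code = label" format is inferred from assignment strings such as "99980 = NOOP" used elsewhere in this manual, and the "|" separator in the last line is an assumption rather than the documented value of SPLIT_CHAR.

```r
# Hypothetical helper (assumption): builds one "code = label" string per row.
# It only mimics the documented idea, not prep_deparse_assignments itself.
deparse_assignments_sketch <- function(codes, labels = codes) {
  stopifnot(length(codes) == length(labels))
  paste(codes, labels, sep = " = ")
}

rows <- deparse_assignments_sketch(c(99980, 99981), c("NOOP", "refused"))

# Joining the rows into a single list string; "|" is an assumed separator
joined <- paste(rows, collapse = " | ")
```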
De-register a hook function for progresses in computation/rendering
prep_deregister_progress_hook(handle, verbose = TRUE)
handle |
character the handle |
verbose |
logical message, if |
logical invisible(TRUE) on success
Get the dataquieR DATA_TYPE of x
prep_dq_data_type_of( x, guess_character = getOption("dataquieR.guess_character", default = dataquieR.guess_character_default) )
x |
object to define the dataquieR data type of |
guess_character |
logical guess a data type for character columns based on the values |
the dataquieR data type as listed in DATA_TYPES
Code labels are copied from other variables, if the code is the same and the label is set only for some variables
prep_expand_codes( item_level = "item_level", suppressWarnings = FALSE, mix_jumps_and_missings = FALSE, meta_data_v2, meta_data = item_level )
item_level |
data.frame the data frame that contains metadata attributes of study data |
suppressWarnings |
logical show warnings, if labels are expanded |
mix_jumps_and_missings |
logical ignore the class of the codes for label expansion, i.e., use missing code labels as jump code labels, if the values are the same. |
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data |
data.frame old name for |
data.frame an updated metadata data frame.
## Not run:
meta_data <- prep_get_data_frame("meta_data")
meta_data$JUMP_LIST[meta_data$VAR_NAMES == "v00003"] <- "99980 = NOOP"
md <- prep_expand_codes(meta_data)
md$JUMP_LIST
md$MISSING_LIST
md <- prep_expand_codes(meta_data, mix_jumps_and_missings = TRUE)
md$JUMP_LIST
md$MISSING_LIST
meta_data <- prep_get_data_frame("meta_data")
meta_data$MISSING_LIST[meta_data$VAR_NAMES == "v00003"] <- "99980 = NOOP"
md <- prep_expand_codes(meta_data)
md$JUMP_LIST
md$MISSING_LIST
## End(Not run)
Extract all missing/jump codes from metadata and export a cause-label-data-frame
prep_extract_cause_label_df( item_level = "item_level", label_col = VAR_NAMES, meta_data_v2, meta_data = item_level )
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data |
data.frame old name for |
list with the entries
meta_data data.frame a data frame that contains updated metadata –
you still need to add a column
MISSING_LIST_TABLE and add the
cause_label_df as such to the metadata
cache using prep_add_data_frames(), manually.
cause_label_df data.frame missing code table. If missing codes have
labels, the respective data frames are
specified here, see cause_label_df.
Extract old function based summary from data quality results
prep_extract_classes_by_functions(r)
r |
data.frame long format, compatible with prep_summary_to_classes()
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
Generic function, currently supports dq_report2 and dataquieR_result
prep_extract_summary(r, ...)
r |
dq_report2 or dataquieR_result object |
... |
further arguments, maybe needed for some implementations |
list with two slots Data and Table with data.frames
featuring all metrics columns
from the report or result in r,
the STUDY_SEGMENT and the VAR_NAMES.
In case of Data, the columns are formatted nicely but still
with the standardized column names – use
util_translate_indicator_metrics() to rename them nicely. In
case of Table, just as they are.
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
Extract report summary from reports
## S3 method for class 'dataquieR_result'
prep_extract_summary(r, ...)
r |
dataquieR_result a result from a dq_report2 report |
... |
not used |
list with two slots Data and Table with data.frames
featuring all metrics columns
from the report r, the STUDY_SEGMENT and the VAR_NAMES.
In case of Data, the columns are formatted nicely but still
with the standardized column names – use
util_translate_indicator_metrics() to rename them nicely. In
case of Table, just as they are.
prep_combine_report_summaries()
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
Extract report summary from reports
## S3 method for class 'dataquieR_resultset2'
prep_extract_summary(r, ...)
r |
dq_report2 a dq_report2 report |
... |
not used |
list with two slots Data and Table with data.frames
featuring all metrics columns
from the report r, the STUDY_SEGMENT and the VAR_NAMES.
In case of Data, the columns are formatted nicely but still
with the standardized column names – use
util_translate_indicator_metrics() to rename them nicely. In
case of Table, just as they are.
prep_combine_report_summaries()
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
If VAR_NAMES have duplicates, this may be because ID variables have been assigned
to different study segments multiple times (they should be in one "intro"
segment only), which is not the intended use of STUDY_SEGMENT.
Naturally, ID variables will be part of more than one data frame, so
such an entry qualifies as a mere duplicate, which can safely be
removed. By default, only ID variables are assumed to be allowed to have such
duplicates in the item-level metadata.
prep_fix_meta_id_dups( meta_data_segment = "segment_level", meta_data_dataframe = "dataframe_level", item_level = "item_level", meta_data = item_level, meta_data_v2, segment_level, dataframe_level )
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_dataframe |
data.frame the data frame that contains the metadata for the data frame level |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
dataframe_level |
data.frame alias for |
## Not run:
il <- prep_get_data_frame("item_level")
il <- rbind(il, il)
il$STUDY_SEGMENT[2] <- "X"
il2 <- prep_fix_meta_id_dups(meta_data_v2 = "meta_data_v2", item_level = il)
il2$STUDY_SEGMENT
il$STUDY_SEGMENT[3] <- "X"
il3 <- prep_fix_meta_id_dups(meta_data_v2 = "meta_data_v2", item_level = il)
il3$STUDY_SEGMENT
## End(Not run)
data_frame_name can be a file path or a URL. You can append a pipe and a
sheet name for Excel files, or an object name, e.g., for RData files. Numbers
may also work. All file formats supported by your rio installation will
work.
prep_get_data_frame( data_frame_name, .data_frame_list = .dataframe_environment(), keep_types = FALSE, column_names_only = FALSE )
data_frame_name |
character name of the data frame to read, see details |
.data_frame_list |
environment cache for loaded data frames |
keep_types |
logical keep types as possibly defined in a file, if the
data frame is loaded from one. set |
column_names_only |
logical if TRUE imports only headers (column names) of the data frame and no content (an empty data frame) |
The data frames will be cached automatically, you can define an alternative
environment for this using the argument .data_frame_list, and you can purge
the cache using prep_purge_data_frame_cache.
Use prep_add_data_frames to manually add data frames to the cache, e.g., if you have loaded them from more complex sources, before.
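The caching behaviour described above can be pictured with a plain environment. This is only a sketch under assumptions: get_data_frame_sketch and the loader function are hypothetical names, and the real cache used by prep_get_data_frame may be organized differently.

```r
# Hypothetical environment-based cache (assumption): shows why a data frame
# is read only once and how a purge empties the cache.
cache <- new.env(parent = emptyenv())
loads <- 0L

get_data_frame_sketch <- function(name, loader, cache_env = cache) {
  if (!exists(name, envir = cache_env, inherits = FALSE)) {
    # first request: read the data and remember it under its name
    assign(name, loader(name), envir = cache_env)
  }
  get(name, envir = cache_env, inherits = FALSE)
}

# Stand-in for an expensive file read; counts how often it is called
loader <- function(name) {
  loads <<- loads + 1L
  data.frame(x = 1:3)
}

a <- get_data_frame_sketch("demo", loader)
b <- get_data_frame_sketch("demo", loader) # served from the cache
loads_after_two_reads <- loads

# Purging the cache forces the next request to read again
rm(list = ls(cache), envir = cache)
```

Passing a different cache_env corresponds to the role of the .data_frame_list argument in this sketch.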
data.frame a data frame
Other data-frame-cache:
prep_add_data_frames(),
prep_list_dataframes(),
prep_load_folder_with_metadata(),
prep_load_workbook_like_file(),
prep_purge_data_frame_cache(),
prep_remove_from_cache()
## Not run:
bl <- as.factor(prep_get_data_frame(
  paste0("https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus",
    "/Projekte_RKI/COVID-19_Todesfaelle.xlsx?__blob=",
    "publicationFile|COVID_Todesfälle_BL|Bundesland"))[[1]])

n <- as.numeric(prep_get_data_frame(paste0(
  "https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/",
  "Projekte_RKI/COVID-19_Todesfaelle.xlsx?__blob=",
  "publicationFile|COVID_Todesfälle_BL|Anzahl verstorbene",
  " COVID-19 Fälle"))[[1]])
plot(bl, n)
# Working names would be to date (2022-10-21), e.g.:
#
# https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/ \
# Projekte_RKI/COVID-19_Todesfaelle.xlsx?__blob=publicationFile
# https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/ \
# Projekte_RKI/COVID-19_Todesfaelle.xlsx?__blob=publicationFile|2
# https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/ \
# Projekte_RKI/COVID-19_Todesfaelle.xlsx?__blob=publicationFile|name
# study_data
# ship
# meta_data
# ship_meta
# prep_get_data_frame("meta_data | meta_data")
## End(Not run)
Fetch a label for a variable based on its purpose
prep_get_labels( resp_vars, item_level = "item_level", label_col, max_len, label_class = c("SHORT", "LONG"), label_lang = getOption("dataquieR.lang", dataquieR.lang_default), resp_vars_are_var_names_only = FALSE, resp_vars_match_label_col_only = FALSE, meta_data = item_level, meta_data_v2, force_label_col = getOption("dataquieR.force_label_col", dataquieR.force_label_col_default) )
resp_vars |
variable list the variable names to fetch for |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
max_len |
integer the maximum label length to return, if not possible
w/o causing ambiguous labels, the labels may still
be longer. For |
label_class |
enum SHORT | LONG. which sort of label according to the metadata model should be returned |
label_lang |
character optional language suffix, if available in
the metadata. Can be controlled by the option
|
resp_vars_are_var_names_only |
logical If |
resp_vars_match_label_col_only |
logical If |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
force_label_col |
enum auto | FALSE | TRUE. if |
character suitable labels for each resp_vars, names of this
vector are VAR_NAMES
## Not run:
prep_load_workbook_like_file("meta_data_v2")
prep_get_labels("SEX_0", label_class = "SHORT", max_len = 2)
## End(Not run)
Get data frame for a given segment
prep_get_study_data_segment( segment, study_data, item_level = "item_level", meta_data = item_level, meta_data_v2, segment_level, meta_data_segment = "segment_level" )
segment |
character name of the segment to return data for |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
segment_level |
data.frame alias for |
meta_data_segment |
data.frame – optional: Segment level metadata |
data.frame the data for the segment
If whoami is not installed, the user name from
Sys.info() is returned.
prep_get_user_name()
Can be overridden by options or environment:
options(FULLNAME = "Stephan Struckmann")
Sys.setenv(FULLNAME = "Stephan Struckmann")
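One way such an override chain could look is sketched below. The function user_name_sketch is hypothetical, and the lookup order (option first, then environment variable, then Sys.info()) is an assumption; the text above only states that both overrides exist.

```r
# Hypothetical lookup chain (assumption about the order of precedence):
# option FULLNAME, then environment variable FULLNAME, then Sys.info()
user_name_sketch <- function() {
  from_option <- getOption("FULLNAME")
  if (!is.null(from_option)) {
    return(as.character(from_option))
  }
  from_env <- Sys.getenv("FULLNAME", unset = NA)
  if (!is.na(from_env)) {
    return(from_env)
  }
  # fallback: login name as reported by the operating system
  Sys.info()[["user"]]
}

old <- options(FULLNAME = "Jane Doe") # placeholder name for illustration
name_from_option <- user_name_sketch()
options(old)                          # restore the previous option value
```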
character the user's name
Guess encoding of text or text files
prep_guess_encoding(x, file)
x |
character string to guess encoding for |
file |
character file to guess encoding for |
encoding
Detect if an object is a dataquieR_translated object
prep_is_translated(x)
x |
the object to test |
TRUE, if x is a dataquieR_translated object.
Prepare a label as part of a link for RMD files
prep_link_escape(s, html = FALSE)
s |
the label |
html |
prepare the label for direct |
the escaped label
List Loaded Data Frames
prep_list_dataframes()
names of all loaded data frames
Other data-frame-cache:
prep_add_data_frames(),
prep_get_data_frame(),
prep_load_folder_with_metadata(),
prep_load_workbook_like_file(),
prep_purge_data_frame_cache(),
prep_remove_from_cache()
All valid voc: vocabularies
prep_list_voc()
character() all voc: suffixes allowed for
prep_get_data_frame().
## Not run:
prep_list_dataframes()
prep_list_voc()
prep_get_data_frame("<ICD10>")
my_voc <-
  tibble::tribble(
    ~ voc, ~ url,
    "test", "data:datasets|iris|Species+Sepal.Length")
prep_add_data_frames(`<>` = my_voc)
prep_list_dataframes()
prep_list_voc()
prep_get_data_frame("<test>")
prep_get_data_frame("<ICD10>")
my_voc <-
  tibble::tribble(
    ~ voc, ~ url,
    "ICD10", "data:datasets|iris|Species+Sepal.Length")
prep_add_data_frames(`<>` = my_voc)
prep_list_dataframes()
prep_list_voc()
prep_get_data_frame("<ICD10>")
## End(Not run)
The original purpose of this function is to load metadata, not study data.
If you want to load study data, you should keep them in a different folder.
You can then call this function once for the metadata and once for the study
data, the second time setting keep_types = TRUE to avoid all data being read
as character().
prep_load_folder_with_metadata(folder, keep_types = FALSE, append = FALSE, ...)
folder |
the folder name to load. |
keep_types |
logical keep types as possibly defined in the file.
set |
append |
logical if a data frame already exists in the cache (by name), extend the existing one |
... |
arguments passed to |
Note that once loaded to the data frame cache, a file won't be read again
unless you call prep_purge_data_frame_cache() or
prep_remove_from_cache(). That is, if you call this function first and
prep_get_data_frame() later, or if dataquieR wants to read a file, e.g.,
for dq_report2(), the file will come from the cache in the way it was
initially read in (keep_types may thus be used inadequately).
By default, this function does not work recursively, but you can change that by
passing arguments through ... to the initially running
list.files() function.
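The effect of a pass-through argument can be tried with list.files() alone. The sketch below uses only base R and a temporary folder; that recursive = TRUE is the argument one would hand over via ... is an assumption based on the sentence above, not something verified against the package.

```r
# Build a small temporary folder with one file at the top level and one
# file in a sub-folder
root <- tempfile("meta_folder_")
dir.create(file.path(root, "sub"), recursive = TRUE)
created <- file.create(
  file.path(root, "a.csv"),
  file.path(root, "sub", "b.csv")
)

# Default listing: entries of the top level only (files and directories)
flat <- list.files(root)

# Recursive listing: files in sub-folders are included with relative paths
deep <- list.files(root, recursive = TRUE)

# Assumed usage with the package (not run here):
# prep_load_folder_with_metadata(root, recursive = TRUE)
```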
These can thereafter be referred to by their names only. Such files are,
e.g., spreadsheet-workbooks or RData-files.
Note that this function, in contrast to prep_get_data_frame, does not support selecting specific sheets/columns from a file.
invisible(the cache environment)
Other data-frame-cache:
prep_add_data_frames(),
prep_get_data_frame(),
prep_list_dataframes(),
prep_load_workbook_like_file(),
prep_purge_data_frame_cache(),
prep_remove_from_cache()
Load a dq_report2
prep_load_report(file)
file |
character the file name to load from |
dataquieR_resultset2 the report
Load a report from a back-end
prep_load_report_from_backend(
  namespace = "objects",
  db_dir,
  storr_factory = prep_create_storr_factory(namespace = namespace, db_dir = db_dir)
)
namespace |
the namespace to read the report's results from |
db_dir |
character path to the directory for the back-end, if
a |
storr_factory |
a function returning a |
dataquieR_resultset2 the report
## Not run:
r <- dataquieR::dq_report2("study_data", meta_data_v2 = "meta_data_v2",
  dimensions = NULL)
storr_factory <- prep_create_storr_factory()
r_storr <- prep_set_backend(r, storr_factory)
r_restorr <- prep_set_backend(r_storr, NULL)
# pass the factory by name; the first positional argument is namespace
r_loaded <- prep_load_report_from_backend(storr_factory = storr_factory)
## End(Not run)
These can thereafter be referred to by their names only. Such files are,
e.g., spreadsheet-workbooks or RData-files.
prep_load_workbook_like_file(file, keep_types = FALSE, append = FALSE)
file |
the file name to load. |
keep_types |
logical keep types as possibly defined in the file.
set |
append |
logical if a data frame already exists in the cache (by name), extend the existing one |
Note that this function, in contrast to prep_get_data_frame, does not support selecting specific sheets or columns from a file.
invisible(the cache environment)
Other data-frame-cache:
prep_add_data_frames(),
prep_get_data_frame(),
prep_list_dataframes(),
prep_load_folder_with_metadata(),
prep_purge_data_frame_cache(),
prep_remove_from_cache()
Map variables to certain attributes, e.g. by default their labels.
prep_map_labels(
  x,
  item_level = "item_level",
  to = LABEL,
  from = VAR_NAMES,
  ifnotfound,
  warn_ambiguous = FALSE,
  meta_data_v2,
  meta_data = item_level
)
x |
character variable names, character vector, see parameter from |
item_level |
data.frame metadata data frame, if, as a |
to |
character variable attribute to map to |
from |
character variable identifier to map from |
ifnotfound |
list A list of values to be used if the item is not found: it will be coerced to a list if necessary. |
warn_ambiguous |
logical print a warning if mapping variables from
|
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data |
data.frame old name for |
This function basically calls colnames(study_data) <- meta_data$LABEL,
ensuring correct merging/joining of study data columns to the corresponding
metadata rows, even if the orders differ. If a variable/study_data-column
name is not found in meta_data[[from]] (default from = VAR_NAMES),
either stop is called or, if ifnotfound has been assigned a value, that
value is returned. See mget, which is internally used by this function.
The function does not only map to the LABEL column: to can be any
metadata variable attribute, so the function can also be used to get, e.g.,
all HARD_LIMITS from the metadata.
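The lookup described above can be sketched in base R with mget, which the text names as the internal workhorse. This is only an illustration under assumptions: `map_labels_sketch()` and the toy metadata are hypothetical and not the package's implementation.

```r
# Toy item-level metadata (values are illustrative assumptions)
meta_data <- data.frame(
  VAR_NAMES = c("ID", "SEX", "AGE"),
  LABEL     = c("Pseudo-ID", "Gender", "Age"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not dataquieR's prep_map_labels()
map_labels_sketch <- function(x, meta_data, to = "LABEL", from = "VAR_NAMES",
                              ifnotfound) {
  # One entry per metadata row: name = meta_data[[from]], value = meta_data[[to]]
  lookup <- list2env(as.list(setNames(meta_data[[to]], meta_data[[from]])))
  if (missing(ifnotfound)) {
    # mget() signals an error if a requested name is unknown
    res <- mget(x, envir = lookup)
  } else {
    res <- mget(x, envir = lookup, ifnotfound = list(ifnotfound))
  }
  unlist(res)
}

# The order of x drives the result, not the order of the metadata rows
mapped   <- map_labels_sketch(c("AGE", "ID"), meta_data)
fallback <- map_labels_sketch(c("AGE", "XYZ"), meta_data,
                              ifnotfound = NA_character_)
```

Because the mapping goes through names rather than positions, study data columns and metadata rows may be ordered differently without mislabelling anything.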
a character vector with:
mapped values
## Not run:
meta_data <- prep_create_meta(
  VAR_NAMES = c("ID", "SEX", "AGE", "DOE"),
  LABEL = c("Pseudo-ID", "Gender", "Age", "Examination Date"),
  DATA_TYPE = c(DATA_TYPES$INTEGER, DATA_TYPES$INTEGER, DATA_TYPES$INTEGER,
    DATA_TYPES$DATETIME),
  MISSING_LIST = ""
)
stopifnot(all(prep_map_labels(c("AGE", "DOE"), meta_data) ==
  c("Age", "Examination Date")))
## End(Not run)
Merge a list of study data frames to one (sparse) study data frame
prep_merge_study_data(study_data_list)
study_data_list |
list the list |
This function is idempotent.
prep_meta_data_v1_to_item_level_meta_data(
  item_level = "item_level",
  verbose = TRUE,
  label_col = LABEL,
  cause_label_df,
  meta_data = item_level
)
item_level |
data.frame the old item-level-metadata |
verbose |
logical display all estimated decisions, defaults to |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
cause_label_df |
data.frame missing code table, see cause_label_df. Optional. If this argument is given, you can add missing code tables. |
meta_data |
data.frame old name for |
The option options("dataquieR.force_item_specific_missing_codes") (default
FALSE) tells the system to always fill in res_vars columns to the
MISSING_LIST_TABLE, even if the column already exists but is empty.
data.frame the updated metadata
Utility function to subset data based on the minimum number of observations per level
prep_min_obs_level(study_data, group_vars, min_obs_in_subgroup)
study_data |
data.frame the data frame that contains the measurements |
group_vars |
variable list the name of the grouping variable |
min_obs_in_subgroup |
integer optional argument if a "group_var" is used. This argument specifies the minimum no. of observations that is required to include a subgroup (level) of the "group_var" in the analysis. Subgroups with fewer observations are excluded. The default is 30. |
This function removes observations that belong to levels of a group variable
with fewer than min_obs_in_subgroup observations, e.g., blood pressure
measurements performed by an examiner who has done fewer than, e.g., 50
measurements. It displays a warning if samples/rows are removed and returns
the modified study data frame.
a data frame with:
a subsample of original data
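The filtering idea can be sketched in base R as follows. `min_obs_level_sketch()` is a hypothetical helper written for this illustration only; it is not the code of prep_min_obs_level, and the toy data are made up.

```r
# Hypothetical helper illustrating the subgroup-size filter
min_obs_level_sketch <- function(study_data, group_var,
                                 min_obs_in_subgroup = 30) {
  # Count the observations per level of the grouping variable
  counts <- table(study_data[[group_var]])
  keep_levels <- names(counts)[counts >= min_obs_in_subgroup]
  # Rows with a missing group value are dropped as well
  keep <- as.character(study_data[[group_var]]) %in% keep_levels
  if (!all(keep)) {
    warning(sum(!keep), " rows removed because their subgroup is too small")
  }
  study_data[keep, , drop = FALSE]
}

# Examiner "B" has only two measurements and is removed for a threshold of 3
toy <- data.frame(
  examiner = c(rep("A", 5), rep("B", 2)),
  sbp = 121:127,
  stringsAsFactors = FALSE
)
reduced <- suppressWarnings(min_obs_level_sketch(toy, "examiner", 3))
```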
Open a data frame in Excel
prep_open_in_excel(dfr)
dfr |
the data frame |
If the file cannot be read on function exit, NULL will be returned.
potentially modified data frame after dialog was closed
pmap
parallel version of purrr::pmap
prep_pmap(.l, .f, ..., cores = 0)
.l |
data.frame with one call per line and one function argument per column |
.f |
|
... |
additional, static arguments for calling |
cores |
number of cpu cores to use or a (named) list with arguments for parallelMap::parallelStart or NULL, if parallel has already been started by the caller. Set to 0 to run without parallelization. |
list of results of the function calls
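The calling convention (one call per row, one function argument per column, plus static arguments) can be pictured with a sequential base R sketch. It uses mapply instead of prep_pmap and does not parallelize; `calls` and `add_scaled()` are hypothetical names for this illustration.

```r
# One call per row, one function argument per column
calls <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Example function; `scale` plays the role of a static argument
add_scaled <- function(x, y, scale) (x + y) * scale

# Sequential equivalent of the row-wise calls; the columns are matched to the
# function's arguments by name, so a column must not be called FUN or MoreArgs
results <- do.call(
  mapply,
  c(
    list(FUN = add_scaled),
    as.list(calls),
    list(MoreArgs = list(scale = 2), SIMPLIFY = FALSE)
  )
)
```

The result is a list with one element per row of `calls`, matching the "list of results of the function calls" described above.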
S Struckmann
purrr::pmap
This function ensures that a data frame ds1 with suitable variable
names, as well as study_data and meta_data, exist as base data.frames.
prep_prepare_dataframes(
  .study_data,
  .meta_data,
  .label_col,
  .replace_hard_limits,
  .replace_missings,
  .sm_code = NULL,
  .allow_empty = FALSE,
  .adjust_data_type = TRUE,
  .amend_scale_level = TRUE,
  .apply_factor_metadata = FALSE,
  .apply_factor_metadata_inadm = FALSE,
  .internal = rlang::env_inherits(rlang::caller_env(), parent.env(environment()))
)
.study_data |
if provided, use this data set as study_data |
.meta_data |
if provided, use this data set as meta_data |
.label_col |
if provided, use this as label_col |
.replace_hard_limits |
replace |
.replace_missings |
replace missing codes, defaults to |
.sm_code |
missing code for |
.allow_empty |
allow |
.adjust_data_type |
ensure that the data type of variables in the study data corresponds to their data type specified in the metadata |
.amend_scale_level |
ensure that |
.apply_factor_metadata |
logical convert categorical variables to labeled factors. |
.apply_factor_metadata_inadm |
logical convert categorical variables
to labeled factors keeping
inadmissible values. Implies, that
.apply_factor_metadata will be set
to |
.internal |
logical internally called, modify caller's environment. |
This function defines ds1 and modifies study_data and meta_data in the
environment of its caller (see eval.parent). It also defines or modifies
the object label_col in the calling environment. Almost all functions
exported by dataquieR call this function initially, so that aspects common
to all functions live here, e.g., testing whether an argument meta_data has been
given and really is a data.frame. It verifies the existence of
required metadata attributes (VARATT_REQUIRE_LEVELS). It can also replace
missing codes by NAs, and calls prep_study2meta to generate a minimum
set of metadata from the study data on the fly (should be amended, so
on-the-fly-calling is not recommended for an instructive use of dataquieR).
The function also detects tibbles, which are then converted to base-R
data.frames, which are expected by dataquieR.
If .internal is TRUE, differently from the other utility functions that
work in their caller's environment, this function modifies objects in the
calling function's environment. It defines a new object ds1, and
it modifies study_data and/or meta_data
and label_col.
ds1 the study data with mapped column names, invisible(), if
not .internal
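The pattern of a helper that defines objects in its caller's environment can be sketched in base R. This is a strongly simplified illustration: `prepare_sketch()` and `indicator_sketch()` are hypothetical, and the real function does far more (type adjustment, missing codes, metadata checks).

```r
# Hypothetical helper: reads study_data/meta_data from the caller and
# defines ds1 there, similar in spirit to the behaviour described above
prepare_sketch <- function() {
  caller <- parent.frame()
  study_data <- get("study_data", envir = caller)
  meta_data <- get("meta_data", envir = caller)
  ds1 <- as.data.frame(study_data)  # e.g. turns a tibble into a base data.frame
  # Rename columns from VAR_NAMES to LABEL, matching by name, not by position
  idx <- match(colnames(ds1), meta_data$VAR_NAMES)
  colnames(ds1)[!is.na(idx)] <- meta_data$LABEL[idx[!is.na(idx)]]
  assign("ds1", ds1, envir = caller)
  invisible(ds1)
}

# A toy "indicator function" that relies on the helper
indicator_sketch <- function(study_data, meta_data) {
  prepare_sketch()  # after this call, ds1 exists in this function's frame
  colnames(ds1)
}

labels_seen <- indicator_sketch(
  study_data = data.frame(AGE = 1:2, ID = 3:4),
  meta_data = data.frame(
    VAR_NAMES = c("ID", "AGE"),
    LABEL = c("Pseudo-ID", "Age"),
    stringsAsFactors = FALSE
  )
)
```

Writing into the caller's frame keeps every indicator function short, at the price of less obvious data flow; this is why the sketch marks the side effect with a comment.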
acc_margins
## Not run:
acc_test1 <- function(resp_variable, aux_variable, time_variable,
                      co_variables, group_vars, study_data, meta_data) {
  prep_prepare_dataframes()
  invisible(ds1)
}
acc_test2 <- function(resp_variable, aux_variable, time_variable,
                      co_variables, group_vars, study_data, meta_data,
                      label_col) {
  ds1 <- prep_prepare_dataframes(study_data, meta_data)
  invisible(ds1)
}
environment(acc_test1) <- asNamespace("dataquieR")
# perform this inside the package (not needed for functions that have been
# integrated with the package already)
environment(acc_test2) <- asNamespace("dataquieR")
# perform this inside the package (not needed for functions that have been
# integrated with the package already)
acc_test3 <- function(resp_variable, aux_variable, time_variable,
                      co_variables, group_vars, study_data, meta_data,
                      label_col) {
  prep_prepare_dataframes()
  invisible(ds1)
}
acc_test4 <- function(resp_variable, aux_variable, time_variable,
                      co_variables, group_vars, study_data, meta_data,
                      label_col) {
  ds1 <- prep_prepare_dataframes(study_data, meta_data)
  invisible(ds1)
}
environment(acc_test3) <- asNamespace("dataquieR")
# perform this inside the package (not needed for functions that have been
# integrated with the package already)
environment(acc_test4) <- asNamespace("dataquieR")
# perform this inside the package (not needed for functions that have been
# integrated with the package already)
meta_data <- prep_get_data_frame("meta_data")
study_data <- prep_get_data_frame("study_data")
try(acc_test1())
try(acc_test2())
acc_test1(study_data = study_data)
try(acc_test1(meta_data = meta_data))
try(acc_test2(study_data = 12, meta_data = meta_data))
print(head(acc_test1(study_data = study_data, meta_data = meta_data)))
print(head(acc_test2(study_data = study_data, meta_data = meta_data)))
print(head(acc_test3(study_data = study_data, meta_data = meta_data)))
print(head(acc_test3(study_data = study_data, meta_data = meta_data,
  label_col = LABEL)))
print(head(acc_test4(study_data = study_data, meta_data = meta_data)))
print(head(acc_test4(study_data = study_data, meta_data = meta_data,
  label_col = LABEL)))
try(acc_test2(study_data = NULL, meta_data = meta_data))
## End(Not run)
Clear data frame cache
prep_purge_data_frame_cache()
nothing
Other data-frame-cache:
prep_add_data_frames(),
prep_get_data_frame(),
prep_list_dataframes(),
prep_load_folder_with_metadata(),
prep_load_workbook_like_file(),
prep_remove_from_cache()
ggplot
Evaluate the stored expression in its lean environment and cache
the resulting ggplot object in the current R session, if enabled
using the option dataquieR.lazy_plots_cache.
prep_realize_ggplot(x)
x |
a |
A ggplot object.
The order in which hooks are called is not defined.
prep_register_progress_hook(type = c("progress", "init", "msg"), hook)
type |
character what event |
hook |
function hook function |
character a handle for de-registering, invisible
Remove a specified element from the data frame cache
prep_remove_from_cache(object_to_remove)
object_to_remove |
character name of the object to be removed as character string (quoted), or character vector containing the names of the objects to remove from the cache |
nothing
Other data-frame-cache:
prep_add_data_frames(),
prep_get_data_frame(),
prep_list_dataframes(),
prep_load_folder_with_metadata(),
prep_load_workbook_like_file(),
prep_purge_data_frame_cache()
## Not run:
prep_load_workbook_like_file("meta_data_v2") # load metadata in the cache
ls(.dataframe_environment()) # get the list of dataframes in the cache
# remove cross-item_level from the cache
prep_remove_from_cache("cross-item_level")
# remove dataframe_level and expected_id from the cache
prep_remove_from_cache(c("dataframe_level", "expected_id"))
# remove missing_table and segment_level from the cache
x <- c("missing_table", "segment_level")
prep_remove_from_cache(x)
## End(Not run)
Create a ggplot2 pie chart. Needs htmltools.
prep_render_pie_chart_from_summaryclasses_ggplot2(data, meta_data = "item_level")
data |
data as returned by |
meta_data |
a htmltools compatible object or NULL, if package is missing
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_plotly(),
prep_summary_to_classes()
Create a plotly pie chart
prep_render_pie_chart_from_summaryclasses_plotly(data, meta_data = "item_level")
data |
data as returned by |
meta_data |
a htmltools compatible object
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_summary_to_classes()
Guess the data type of a vector
prep_robust_guess_data_type(x, k = 50, it = 200)
x |
a vector with characters |
k |
numeric sample size, if less than |
it |
integer number of iterations when taking samples |
a guess of the data type of x. An attribute orig_type is also
attached to give the more detailed guess returned by readr::guess_parser().
This function takes x and tries to guess the data type of random subsets of
this vector using readr::guess_parser(). The RNG is initialized with a
constant, so the function stays deterministic. It performs such sub-sample
based checks it times; the data type detected most often determines the
guessed data type.
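The majority vote over random sub-samples can be sketched in base R. This is not the package's code: `guess_simple()` is a crude, hypothetical stand-in for readr::guess_parser(), the seed value is arbitrary, and guessing on the full vector when it has at most k elements is an assumption of this sketch.

```r
# Crude stand-in for readr::guess_parser(); an assumption of this sketch
guess_simple <- function(v) {
  v <- v[!is.na(v) & v != ""]
  if (length(v) == 0L) return("character")
  num <- suppressWarnings(as.numeric(v))
  if (anyNA(num)) return("character")
  if (all(num == round(num))) "integer" else "double"
}

# Hypothetical helper: majority vote over `it` random sub-samples of size `k`
robust_guess_sketch <- function(x, k = 50, it = 200, seed = 42) {
  if (length(x) <= k) return(guess_simple(x))
  # A constant seed makes the result reproducible (this resets the global RNG)
  set.seed(seed)
  votes <- vapply(
    seq_len(it),
    function(i) guess_simple(sample(x, k)),
    character(1)
  )
  # The type detected most often wins
  names(which.max(table(votes)))
}

# 500 integer-like strings and a single stray text value
x <- c(as.character(1:500), "abc")
whole_vector <- guess_simple(x)
robust <- robust_guess_sketch(x)
```

A guess on the whole vector is dominated by the single stray value, while most sub-samples do not contain it, so the vote is robust against a few malformed entries.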
Save a dq_report2
prep_save_report(report, file, compression_level = 3)
report |
dataquieR_resultset2 the report |
file |
character the file name to write to |
compression_level |
integer from=0 to=9. Compression level. 9 is very slow. |
invisible(NULL)
...if missing
prep_scalelevel_from_data_and_metadata(
  resp_vars = lifecycle::deprecated(),
  study_data,
  item_level = "item_level",
  label_col = LABEL,
  meta_data = item_level,
  meta_data_v2
)
resp_vars |
variable list deprecated, the function always addresses all variables. |
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
meta_data |
data.frame old name for |
meta_data_v2 |
character path to workbook like metadata file, see
|
data.frame modified metadata
## Not run:
prep_load_workbook_like_file("meta_data_v2")
prep_scalelevel_from_data_and_metadata(study_data = "study_data")
## End(Not run)
With this function, you can move a report from/to a storr storage.
prep_set_backend(r, storr_factory = NULL, amend = FALSE)
r |
dataquieR_resultset2 the report |
storr_factory |
|
amend |
logical if there is already data in. |
dataquieR_resultset2 but now with the desired back-end
Guess a minimum metadata data frame from study data. Minimum required variable attributes are:
prep_study2meta(
  study_data,
  level = c(VARATT_REQUIRE_LEVELS$REQUIRED, VARATT_REQUIRE_LEVELS$RECOMMENDED),
  cumulative = TRUE,
  convert_factors = FALSE,
  guess_missing_codes = getOption("dataquieR.guess_missing_codes",
    dataquieR.guess_missing_codes_default),
  guess_character = getOption("dataquieR.guess_character", default =
    dataquieR.guess_character_default)
)
study_data |
data.frame the data frame that contains the measurements |
level |
enum levels to provide (see also VARATT_REQUIRE_LEVELS) |
cumulative |
logical include attributes of all levels up to level |
convert_factors |
logical convert factor columns to coded integers. If selected, the study data will also be updated and returned. |
guess_missing_codes |
logical try to guess missing codes from the data |
guess_character |
logical guess a data type for character columns based on the values |
dataquieR:::util_get_var_att_names_of_level(VARATT_REQUIRE_LEVELS$REQUIRED)
#>            VAR_NAMES            DATA_TYPE   MISSING_LIST_TABLE
#>          "VAR_NAMES"          "DATA_TYPE" "MISSING_LIST_TABLE"
The function also tries to detect missing codes.
a meta_data data frame or a list with study data and metadata, if
convert_factors == TRUE.
## Not run:
dataquieR::prep_study2meta(Orange, convert_factors = FALSE)
## End(Not run)
Classify metrics from a report summary table
prep_summary_to_classes(report_summary)
report_summary |
|
data.frame classes for the report summary table, long format
Other summary_functions:
prep_combine_report_summaries(),
prep_extract_classes_by_functions(),
prep_extract_summary(),
prep_extract_summary.dataquieR_result(),
prep_extract_summary.dataquieR_resultset2(),
prep_render_pie_chart_from_summaryclasses_ggplot2(),
prep_render_pie_chart_from_summaryclasses_plotly()
Prepare a label as part of a title text for RMD files
prep_title_escape(s, html = FALSE)
s |
the label |
html |
prepare the label for direct |
the escaped label
New function: no warranty, so far.
prep_undisclose(x, cores)
x |
an object to un-disclose, a |
cores |
can be an integer with a number of cores to use. if not specified, the function uses the default cluster, if available and falls back to serial un-disclosing, otherwise. |
undisclosed object
Combine all missing and value lists to one big table
prep_unsplit_val_tabs(meta_data = "item_level", val_tab = NULL)
meta_data |
data.frame item level meta data to be used, defaults to
|
val_tab |
character name of the table being created: This table will
be added to the data frame cache (or overwritten). If |
data.frame the combined table
Detects factors and converts them to compatible metadata/study data.
prep_valuelabels_from_data(resp_vars = colnames(study_data), study_data)
resp_vars |
variable names of the variables to fetch the value labels from the data |
study_data |
data.frame the data frame that contains the measurements |
a list with:
VALUE_LABELS: vector of value labels and modified study data
ModifiedStudyData: study data with factors as integers
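The conversion of factor columns into integer codes plus value labels can be sketched in base R. This is an illustration under assumptions: `valuelabels_sketch()` is hypothetical, and the label format "1 = f | 2 = m" is chosen here for readability; the package's actual output format may differ.

```r
# Hypothetical helper, not dataquieR's prep_valuelabels_from_data()
valuelabels_sketch <- function(study_data) {
  is_fct <- vapply(study_data, is.factor, logical(1))
  labels <- character(0)
  if (any(is_fct)) {
    # One label string per factor column: "<code> = <level>" joined by " | "
    labels <- vapply(
      study_data[is_fct],
      function(f) {
        paste(seq_along(levels(f)), levels(f), sep = " = ", collapse = " | ")
      },
      character(1)
    )
    # Replace the factors by their integer codes
    study_data[is_fct] <- lapply(study_data[is_fct], as.integer)
  }
  list(VALUE_LABELS = labels, ModifiedStudyData = study_data)
}

toy <- data.frame(sex = factor(c("f", "m", "f")), age = c(30, 41, 55))
res <- valuelabels_sketch(toy)
```

Non-factor columns such as `age` pass through unchanged, so the returned study data keep their original shape.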
## Not run:
dataquieR::prep_datatype_from_data(iris)
## End(Not run)
Print a dataquieR result returned by dq_report2
## S3 method for class 'dataquieR_result'
print(x, ...)
x |
list a dataquieR result from dq_report2 or
|
... |
passed to print. Additionally, the argument |
see print
util_pretty_print()
Generate a RMarkdown-based report from a dataquieR report
## S3 method for class 'dataquieR_resultset'
print(...)
... |
deprecated |
deprecated
Generate a HTML-based report from a dataquieR report
## S3 method for class 'dataquieR_resultset2'
print(
  x,
  dir,
  view = TRUE,
  disable_plotly = FALSE,
  block_load_factor = getOption("dataquieR.print_block_load_factor",
    dataquieR.print_block_load_factor_default),
  advanced_options = list(),
  dashboard = NA,
  force_overwrite = FALSE,
  ...,
  cores = list(mode = "socket", logging = FALSE, cpus = util_detect_cores(),
    load.balancing = TRUE)
)
x |
|
dir |
character directory to store the rendered report's files, a temporary one, if omitted. Directory will be created, if missing |
view |
logical display the report |
disable_plotly |
logical do not use |
block_load_factor |
|
advanced_options |
list options to set during report computation,
see |
dashboard |
logical dashboard mode: |
force_overwrite |
logical force to overwrite |
... |
additional arguments: |
cores |
integer number of cpu cores to use or a named list with arguments for parallelMap::parallelStart or NULL, if parallel has already been started by the caller. Can also be a cluster. |
file names of the generated report's HTML files
Print a dataquieR summary
## S3 method for class 'dataquieR_summary'
print(
  x,
  ...,
  grouped_by = c("call_names", "indicator_metric"),
  dont_print = FALSE,
  folder_of_report = NULL,
  vars_to_include = c("study")
)
x |
the |
... |
not yet used |
grouped_by |
define the columns of the resulting matrix. It can be "call_names" (one column per function), "indicator_metric" (one column per indicator), or both, c("call_names", "indicator_metric"), which is the default |
dont_print |
suppress the actual printing, just return a printable
object derived from |
folder_of_report |
a named vector with the location of variable and
|
vars_to_include |
|
invisible html object
print implementation for the class dataquieR_translated
dataquieR's translated texts featuring access to the language keys, still.
## S3 method for class 'dataquieR_translated'
print(x, ...)
x |
|
... |
passed to base::print |
as print
base::print
Print a DataSlot object
## S3 method for class 'DataSlot'
print(x, ...)
x |
the object |
... |
not used |
see print
interval
Such objects, for now, only occur in REDCap rules, so this function
is mostly meant for internal use, for now.
## S3 method for class 'interval'
print(x, ...)
x |
|
... |
not used yet |
the printed object
base::print
Print a list of dataquieR_result objects
## S3 method for class 'list'
print(x, ...)
x |
|
... |
passed to other implementations |
undefined
Print a master_result object
## S3 method for class 'master_result'
print(x, template = "default", ...)
x |
the object |
template |
the template for the |
... |
not used |
invisible(NULL)
Print a number with unit
## S3 method for class 'numeric_with_unit'
print(x, ...)
x |
number with unit |
... |
not used |
invisible(x)
ReportSummaryTable
Use this function to print results objects of the class
ReportSummaryTable.
## S3 method for class 'ReportSummaryTable'
print(
  x,
  relative = lifecycle::deprecated(),
  dt = FALSE,
  fillContainer = FALSE,
  displayValues = FALSE,
  view = TRUE,
  drop = getOption("dataquieR.droplevels_ReportSummaryTable",
    dataquieR.droplevels_ReportSummaryTable_default),
  ...,
  flip_mode = "auto"
)
x |
|
relative |
deprecated |
dt |
logical use |
fillContainer |
logical if |
displayValues |
logical if |
view |
logical if |
drop |
logical if |
... |
not used, yet |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
the printed object
base::print
Print a Slot object: displays all warnings and stuff, then prints x.
## S3 method for class 'Slot'
print(x, ...)
x |
the object |
... |
not used |
calls the next print method
Print a StudyDataSlot object
## S3 method for class 'StudyDataSlot'
print(x, ...)
x |
the object |
... |
not used |
see print
Print a TableSlot object
## S3 method for class 'TableSlot'
print(x, ...)
x |
the object |
... |
not used |
see print
Print method for util_pairs_ggplot_panels objects
## S3 method for class 'util_pairs_ggplot_panels'
print(x, ...)
x |
An object of class |
... |
Ignored. |
The input object, invisibly.
Checks applicability of DQ functions based on study data and metadata characteristics
pro_applicability_matrix(
  study_data,
  item_level = "item_level",
  split_segments = FALSE,
  label_col,
  max_vars_per_plot = 20,
  meta_data_segment,
  meta_data_dataframe,
  flip_mode = "noflip",
  meta_data_v2,
  meta_data = item_level,
  segment_level,
  dataframe_level
)
study_data |
data.frame the data frame that contains the measurements |
item_level |
data.frame the data frame that contains metadata attributes of study data |
split_segments |
logical return one matrix per study segment |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
max_vars_per_plot |
integer from=0. The maximum number of variables per single plot. |
meta_data_segment |
data.frame – optional: Segment level metadata |
meta_data_dataframe |
data.frame – optional: Data frame level metadata |
flip_mode |
enum default | flip | noflip | auto. Should the plot be
in default orientation, flipped, not flipped or
auto-flipped. Not all options are always supported.
In general, this can be controlled by
setting the |
meta_data_v2 |
character path to workbook like metadata file, see
|
meta_data |
data.frame old name for |
segment_level |
data.frame alias for |
dataframe_level |
data.frame alias for |
This is a preparatory support function that compares study data with associated metadata. A prerequisite of this function is that the number of columns in the study data matches the number of rows in the metadata.
For each existing R implementation, the function searches for the necessary static metadata and returns a heatmap-like matrix indicating the applicability of each data quality implementation.
In addition, the data type defined in the metadata is compared with the observed data type in the study data.
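The prerequisite and the data type comparison described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the toy study_data and item_level frames, the DATA_TYPE column values, and the declared_to_class lookup are invented for this example.

```r
# Toy study data: one column per variable (hypothetical names).
study_data <- data.frame(
  v1 = c(1L, 2L, 3L),
  v2 = c("a", "b", "c"),
  v3 = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Toy item-level metadata: one row per study data column.
item_level <- data.frame(
  VAR_NAMES = c("v1", "v2", "v3"),
  DATA_TYPE = c("integer", "string", "string"),
  stringsAsFactors = FALSE
)

# Prerequisite: the number of columns in the study data equals
# the number of rows in the metadata.
stopifnot(ncol(study_data) == nrow(item_level))

# Assumed mapping from declared metadata types to R classes.
declared_to_class <- c(integer = "integer", float = "numeric", string = "character")

# Observed class of each study data column, in metadata order.
observed <- vapply(
  study_data[item_level$VAR_NAMES],
  function(x) class(x)[1],
  character(1)
)
expected <- unname(declared_to_class[item_level$DATA_TYPE])

# TRUE where the observed class matches the declared data type.
type_matches <- setNames(observed == expected, item_level$VAR_NAMES)
print(type_matches)
```

In this toy setup, v3 is declared as a string but stored as a number, so it is the only variable flagged as non-matching.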
a list with:
SummaryTable: data frame about the applicability of each indicator
function (each function in a column).
Its integer values fall into one of the following five
categories:
0. Non-matching datatype + incomplete metadata,
1. Non-matching datatype + complete metadata,
2. Matching datatype + incomplete metadata,
3. Matching datatype + complete metadata,
4. Not applicable according to data type
ApplicabilityPlot: ggplot2::ggplot2 heatmap plot, graphical representation of
SummaryTable
ApplicabilityPlotList: list of plots per (maybe artificial) segment
ReportSummaryTable: data frame underlying ApplicabilityPlot
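To read the integer codes of SummaryTable more easily, they can be mapped back to the category labels listed above. The following base R sketch assumes a hypothetical stand-in table; the column names Variables, indicator_a, and indicator_b are invented and will differ from a real result.

```r
# Labels for the integer categories listed above.
applicability_labels <- c(
  "0" = "Non-matching datatype + incomplete metadata",
  "1" = "Non-matching datatype + complete metadata",
  "2" = "Matching datatype + incomplete metadata",
  "3" = "Matching datatype + complete metadata",
  "4" = "Not applicable according to data type"
)

# Hypothetical stand-in for SummaryTable: one row per variable,
# one column per indicator function (names invented here).
summary_table <- data.frame(
  Variables = c("v1", "v2", "v3"),
  indicator_a = c(3L, 2L, 4L),
  indicator_b = c(0L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Translate integer codes to their labels.
decode <- function(codes) unname(applicability_labels[as.character(codes)])

decoded <- summary_table
decoded[-1] <- lapply(summary_table[-1], decode)

# Count variable/function combinations that are fully applicable (code 3).
n_fully_applicable <- sum(as.matrix(summary_table[-1]) == 3L)

print(decoded)
print(n_fully_applicable)
```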
Has one argument, n, reporting the number of steps in the current
job. Needed, e.g., by packages such as progressr.
TODO
Other options:
dataquieR,
dataquieR.CONDITIONS_LEVEL_TRHESHOLD,
dataquieR.CONDITIONS_WITH_STACKTRACE,
dataquieR.ELEMENT_MISSMATCH_CHECKTYPE,
dataquieR.ERRORS_WITH_CALLER,
dataquieR.GAM_for_LOESS,
dataquieR.MAHALANOBIS_THRESHOLD,
dataquieR.MAX_LABEL_LEN,
dataquieR.MAX_LONG_LABEL_LEN,
dataquieR.MAX_VALUE_LABEL_LEN,
dataquieR.MESSAGES_WITH_CALLER,
dataquieR.MULTIVARIATE_OUTLIER_CHECK,
dataquieR.VALUE_LABELS_htmlescaped,
dataquieR.WARNINGS_WITH_CALLER,
dataquieR.acc_loess.exclude_constant_subgroups,
dataquieR.acc_loess.mark_time_points,
dataquieR.acc_loess.min_bw,
dataquieR.acc_loess.min_obs_in_subgroup,
dataquieR.acc_loess.min_proportion,
dataquieR.acc_loess.plot_format,
dataquieR.acc_loess.plot_observations,
dataquieR.acc_margins_num,
dataquieR.acc_margins_sort,
dataquieR.acc_multivariate_outlier.scale,
dataquieR.acc_shape_or_scale_ci,
dataquieR.col_con_con_empirical,
dataquieR.col_con_con_logical,
dataquieR.convert_to_list_for_lapply,
dataquieR.debug,
dataquieR.des_summary_hard_lim_remove,
dataquieR.dontwrapresults,
dataquieR.droplevels_ReportSummaryTable,
dataquieR.dt_adjust,
dataquieR.fix_column_type_on_read,
dataquieR.flip_mode,
dataquieR.force_item_specific_missing_codes,
dataquieR.force_label_col,
dataquieR.grading_formats,
dataquieR.grading_rulesets,
dataquieR.guess_character,
dataquieR.guess_missing_codes,
dataquieR.ignore_empty_vars,
dataquieR.lang,
dataquieR.lazy_plots,
dataquieR.lazy_plots_cache,
dataquieR.lazy_plots_gg_compatibility,
dataquieR.locale,
dataquieR.max_cat_resp_var_levels_in_plot,
dataquieR.max_group_var_levels_in_plot,
dataquieR.max_group_var_levels_with_violins,
dataquieR.min_obs_per_group_var_in_plot,
dataquieR.min_time_points_for_cat_resp_var,
dataquieR.no_geom_count_in_bin,
dataquieR.no_overall_in_bin,
dataquieR.non_disclosure,
dataquieR.old_factor_handling,
dataquieR.old_type_adjust,
dataquieR.precomputeStudyData,
dataquieR.print_block_load_factor,
dataquieR.progress_fkt_default,
dataquieR.progress_msg_fkt_default,
dataquieR.resume_checkpoint,
dataquieR.resume_print,
dataquieR.scale_level_heuristics_control_binaryrecodelimit,
dataquieR.scale_level_heuristics_control_metriclevels,
dataquieR.study_data_cache_max,
dataquieR.study_data_cache_metrics,
dataquieR.study_data_cache_metrics_env,
dataquieR.study_data_cache_quick_fill,
dataquieR.study_data_colnames_case_sensitive,
dataquieR.testdebug,
dataquieR.traceback,
dataquieR.type_adjust_parallel
Combine ReportSummaryTable outputs
Using this rbind implementation, you can combine different
heatmap-like results of the class ReportSummaryTable.
## S3 method for class 'ReportSummaryTable'
rbind(...)
... |
|
Specifies the type of reliability or validity analysis. The string specifies the analysis algorithm to be used, and can be either "inter-class" or "intra-class".
REL_VAL
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
TODO
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
RESPT_PER_ITEM,
TOTRESPT
Return names of result slots (e.g., 3rd dimension of dataquieR results)
resnames(x)
x |
the objects |
character vector with names
Return names of result slots (e.g., 3rd dimension of dataquieR results)
## S3 method for class 'dataquieR_resultset2'
resnames(x)
x |
the objects |
character vector with names
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
RELCOMPL_SPEED,
TOTRESPT
Cross-item level metadata attribute name TODO
SCALE_ACRONYM
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Stevens's Typology
In the metadata, the following entries are allowed for the variable attribute SCALE_LEVEL:
SCALE_LEVELS
nominal for categorical variables
ordinal for ordinal variables (i.e., comparison of values is possible)
interval for interval scales, i.e., distances are meaningful
ratio for ratio scales, i.e., ratios are meaningful
na for variables that contain, e.g., unstructured texts, json,
xml, ..., to distinguish them from variables that still need to
have the SCALE_LEVEL estimated by
prep_scalelevel_from_data_and_metadata()
sex, eye color – nominal
income group, education level – ordinal
temperature in degree Celsius – interval
body weight, temperature in Kelvin – ratio
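The allowed entries and the examples above can be checked with a few lines of base R. This is a sketch only: the item_level frame and its variable names are hypothetical, and the vector allowed_scale_levels is a local copy of the values listed above rather than the package constant.

```r
# Local copy of the allowed entries listed above.
allowed_scale_levels <- c("nominal", "ordinal", "interval", "ratio", "na")

# Hypothetical item-level metadata using the examples from the text.
item_level <- data.frame(
  VAR_NAMES = c("sex", "education", "temp_celsius", "body_weight", "comment"),
  SCALE_LEVEL = c("nominal", "ordinal", "interval", "ratio", "na"),
  stringsAsFactors = FALSE
)

# Variables whose SCALE_LEVEL is not one of the allowed entries.
invalid <- item_level$VAR_NAMES[!(item_level$SCALE_LEVEL %in% allowed_scale_levels)]

# Comparison of values is possible from ordinal upwards.
supports_ordering <- item_level$SCALE_LEVEL %in% c("ordinal", "interval", "ratio")

# Ratios are meaningful for ratio scales only.
supports_ratios <- item_level$SCALE_LEVEL == "ratio"

print(invalid)
print(data.frame(item_level, supports_ordering, supports_ratios))
```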
Cross-item level metadata attribute name TODO
SCALE_NAME
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
TOTRESPT,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
The name of the data frame containing the reference IDs to be compared with the IDs in the targeted segment.
SEGMENT_ID_REF_TABLE
The name of the data frame containing the reference IDs to be compared with the IDs in the targeted segment.
SEGMENT_ID_TABLE
Please use SEGMENT_ID_REF_TABLE
All variables that are to be used as one single ID variable (combined key) in a segment.
SEGMENT_ID_VARS
true or false to suppress crude segment missingness output
(Completeness/Misg. Segments in the report). Defaults to compute
the output, if more than one segment is available in the item-level
metadata.
SEGMENT_MISS
The name of the segment participation status variable
SEGMENT_PART_VARS
The type of check to be conducted when comparing the reference ID table with the IDs in a segment.
SEGMENT_RECORD_CHECK
Number of expected data records in each segment (numeric). The check is only conducted if a number is entered.
SEGMENT_RECORD_COUNT
Segment level metadata attribute name
SEGMENT_UNIQUE_ID
Specifies whether identical data is permitted across rows in a segment (excluding ID variables)
SEGMENT_UNIQUE_ROWS
According to our metadata concept, this single character is "|".
SPLIT_CHAR
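A separator like this is typically used to split metadata cells that hold several entries, such as a list of variable names. The base R sketch below uses a local variable split_char as a stand-in for the constant, and the entry in variable_list is a hypothetical example.

```r
# Local stand-in for the separator described above.
split_char <- "|"

# Hypothetical metadata cell holding three variable names.
variable_list <- "SBP_0 | DBP_0 | AGE_0"

# fixed = TRUE is needed because "|" is a regular expression metacharacter.
vars <- trimws(strsplit(variable_list, split_char, fixed = TRUE)[[1]])
print(vars)

# Joining the names again yields the original cell content.
joined <- paste(vars, collapse = paste0(" ", split_char, " "))
print(joined)
```

Without fixed = TRUE, strsplit() would interpret "|" as a regular expression alternation and split between every character.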
Study data is expected in wide format. It should contain all variables for all segments in one large table, even if some variables are not measured for all observational units (study participants).
Deprecated
## S3 method for class 'dataquieR_resultset'
summary(...)
... |
Deprecated |
Deprecated
Generate a report summary table
## S3 method for class 'dataquieR_resultset2'
summary(
  object,
  aspect = c("applicability", "error", "anamat", "indicator_or_descriptor"),
  FUN,
  collapse = "\n<br />\n",
  ...
)
object |
a square result set |
aspect |
an aspect/problem category of results |
FUN |
function to apply to the cells of the result table |
collapse |
passed to |
... |
not used |
a summary of a dataquieR report
## Not run:
util_html_table(summary(report),
  filter = "top",
  options = list(scrollCollapse = TRUE, scrollY = "75vh"),
  is_matrix_table = TRUE,
  rotate_headers = TRUE
)
## End(Not run)
Internally used point-range
to_basic.GeomPointrangeRobust(data, prestats_data, layout, params, p, ...)
data |
the data returned by |
prestats_data |
the data before statistics are computed. |
layout |
the panel layout. |
params |
parameters for the geom, statistic, and 'constant' aesthetics |
p |
a ggplot2 object (the conversion may depend on scales, for instance). |
... |
currently ignored |
TODO
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
VARIABLE_LIST,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Other SSI:
COMPUTED_VARIABLE_ROLES,
IRV,
MAHALANOBIS_RATIO,
MAXIMUM_LONG_STRING,
MISS_RESP,
RELCOMPL_SPEED,
RESPT_PER_ITEM
units::valid_udunits()
see column def, therein
like %, ppt, ppm
Other UNITS:
UNITS,
UNIT_PREFIXES,
UNIT_PREFIX_FACTORS,
UNIT_SOURCES,
WELL_KNOWN_META_VARIABLE_NAMES
units::valid_udunits_prefixes()
named numeric vector
translates k, m, M, c, ... to 1000, 0.001, ...
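The idea of such a lookup can be illustrated with a small named numeric vector. This is a sketch under assumptions: prefix_factors is a hand-written excerpt using standard SI values, not the package object, and to_base_unit is a hypothetical helper.

```r
# Hand-written excerpt of a prefix-to-factor lookup (standard SI values).
prefix_factors <- c(k = 1e3, m = 1e-3, M = 1e6, c = 1e-2)

# Hypothetical helper: convert a value given with a prefixed unit
# to the corresponding base unit.
to_base_unit <- function(value, prefix) {
  stopifnot(prefix %in% names(prefix_factors))
  value * prefix_factors[[prefix]]
}

km_in_m <- to_base_unit(2.5, "k")  # 2.5 km expressed in m
cm_in_m <- to_base_unit(170, "c")  # 170 cm expressed in m

print(km_in_m)
print(cm_in_m)
```

Names are case-sensitive here, which matters because m (milli) and M (mega) differ by nine orders of magnitude.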
Other UNITS:
UNITS,
UNIT_IS_COUNT,
UNIT_PREFIXES,
UNIT_SOURCES,
WELL_KNOWN_META_VARIABLE_NAMES
units::valid_udunits()
see column source_xml therein, i.e., base, derived, accepted, or common
Other UNITS:
UNITS,
UNIT_IS_COUNT,
UNIT_PREFIXES,
UNIT_PREFIX_FACTORS,
WELL_KNOWN_META_VARIABLE_NAMES
data.frame with the following columns:
CODE_VALUE: numeric | DATETIME Missing or categorical code
(the number or date representing a
missing/category)
CODE_LABEL: character a label for the missing code or category
CODE_CLASS: enum JUMP | MISSING. For missing lists: Class of the
missing code.
CODE_INTERPRET: enum I | P | PL | R | BO | NC | O | UH | UO | NE.
For missing lists: Class of the missing code
according to
AAPOR.
resp_vars: character For missing lists: optional. If a missing code
is specific to some
variables, it is listed for each such variable
with one entry in resp_vars. If NA, the
code is assumed to be shared among all variables.
For v1.0 metadata, you need to refer to
VAR_NAMES here.
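A table with the columns described above could look like the following base R sketch. All code values, labels, and the variable name v00014 are invented for illustration, and codes_for is a hypothetical helper showing how shared codes (NA in resp_vars) and variable-specific codes combine.

```r
# Hypothetical missing/jump code table with the columns described above.
missing_table <- data.frame(
  CODE_VALUE = c(99980, 99981, 99990),
  CODE_LABEL = c("Refusal", "Not reached", "Not applicable (jump)"),
  CODE_CLASS = c("MISSING", "MISSING", "JUMP"),
  CODE_INTERPRET = c("R", "NC", "NE"),
  resp_vars = c(NA, "v00014", NA),
  stringsAsFactors = FALSE
)

# Codes shared among all variables have NA in resp_vars.
shared_codes <- missing_table$CODE_VALUE[is.na(missing_table$resp_vars)]

# Hypothetical helper: codes applying to one variable are the shared
# codes plus the codes listed specifically for that variable.
codes_for <- function(var) {
  keep <- is.na(missing_table$resp_vars) | missing_table$resp_vars == var
  missing_table$CODE_VALUE[keep]
}

print(shared_codes)
print(codes_for("v00014"))
print(codes_for("v00001"))
```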
com_qualified_item_missingness()
com_qualified_segment_missingness()
con_inadmissible_categorical()
These levels are cumulatively used by the function prep_create_meta and
related in the argument level therein.
VARATT_REQUIRE_LEVELS
currently available:
'COMPATIBILITY' = "compatibility"
'REQUIRED' = "required"
'RECOMMENDED' = "recommended"
'OPTIONAL' = "optional"
'TECHNICAL' = "technical"
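Cumulative use of these levels can be pictured as follows. The sketch assumes that the levels are ordered as listed above and that selecting a level includes all levels listed before it; both the ordering and the attribute-to-level assignment in attribute_levels are assumptions made for this example, not taken from the package.

```r
# Levels in the order listed above (the ordering is an assumption of this sketch).
require_levels <- c("compatibility", "required", "recommended", "optional", "technical")

# Hypothetical assignment of metadata attributes to levels.
attribute_levels <- c(
  VAR_NAMES = "required",
  LABEL = "recommended",
  LONG_LABEL = "optional"
)

# Cumulative selection: a level includes all levels listed before it.
attributes_up_to <- function(level) {
  stopifnot(level %in% require_levels)
  included <- require_levels[seq_len(match(level, require_levels))]
  names(attribute_levels)[attribute_levels %in% included]
}

print(attributes_up_to("required"))
print(attributes_up_to("recommended"))
print(attributes_up_to("technical"))
```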
Specifies a group of variables for multivariate analyses. Separated
by |, please use variable names from VAR_NAMES or
a label as specified in label_col, usually LABEL or LONG_LABEL.
VARIABLE_LIST
if missing, dataquieR will create such IDs from CONTRADICTION_TERM,
if specified.
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST_ORDER,
meta_data_computation,
meta_data_cross
Cross-item level metadata attribute name TODO internal use, only
VARIABLE_LIST_ORDER
Other meta_data_cross:
ASSOCIATION_DIRECTION,
ASSOCIATION_FORM,
ASSOCIATION_METRIC,
ASSOCIATION_RANGE,
CHECK_ID,
CHECK_LABEL,
COMPUTED_VARIABLE_ROLES,
CONTRADICTION_TERM,
CONTRADICTION_TYPE,
DATA_PREPARATION,
GOLDSTANDARD,
IRV,
MAHALANOBIS_RATIO,
MAHALANOBIS_THRESHOLD,
MAXIMUM_LONG_STRING,
MISS_RESP,
MULTIVARIATE_OUTLIER_CHECK,
MULTIVARIATE_OUTLIER_CHECKTYPE,
RELCOMPL_SPEED,
REL_VAL,
RESPT_PER_ITEM,
SCALE_ACRONYM,
SCALE_NAME,
TOTRESPT,
VARIABLE_LIST,
meta_data_computation,
meta_data_cross
intro a variable holding consent-data
primary a primary outcome variable
secondary a secondary outcome variable
process a variable describing the measurement process
suppress a variable added on the fly computing sub-reports, i.e., by
dq_report_by to have all referred variables available,
even if they are not part of the currently processed segment.
But they will only be fully assessed in their real segment's
report.
VARIABLE_ROLES
names of the variable attributes in the metadata frame holding the names of the respective observers, devices, lower limits for plausible values, upper limits for plausible values, lower limits for allowed values, upper limits for allowed values, the variable name (column name, e.g. v0020349) used in the study data, the variable name used for processing (readable name, e.g. RR_DIAST_1) and in parameters of the QA-Functions, the variable label, variable long label, variable short label, variable data type (see also DATA_TYPES), re-code for definition of lists of event categories, missing lists and jump lists as CSV strings. For valid units see UNITS.
WELL_KNOWN_META_VARIABLE_NAMES
all entries of this list will be mapped to the package's exported NAMESPACE environment directly, i.e. they are available directly by their names too:
meta_data_segment for STUDY_SEGMENT
Other UNITS:
UNITS,
UNIT_IS_COUNT,
UNIT_PREFIXES,
UNIT_PREFIX_FACTORS,
UNIT_SOURCES
print(WELL_KNOWN_META_VARIABLE_NAMES$VAR_NAMES)
# print(VAR_NAMES) # should usually also work