winbuilder fails.
MultinomialCI package. Confidence intervals used by acc_shape_or_scale() and acc_end_digits() are now computed internally. The method can be selected with the option dataquieR.acc_shape_or_scale_ci.
dq_report_by() overview and subgroup handling, including more informative error messages for invalid subgroup rules.
LOESS plots and sunburst charts.
geom_count are no longer shown. This can be controlled with the new no_overall_in_bin and no_geom_count_in_bin arguments and options.
JavaScript containing variable labels with quotes
JavaScript and css in output
css in output
2.8.3.
jsPDF HTML dependency; visNetwork is now suggested instead
S7 rendering
options()
dq_report_by() no longer errors if plotly is missing but disables plotly functionality gracefully
ggplot2 objects are now returned as lightweight promises (dq_lazy_ggplot) to avoid serializing large S7 objects
prep_realize_ggplot(p)
S7 compatibility can be reduced via options(dataquieR.lazy_plots_gg_compatibility = "FALSE"), which may require more frequent calls to prep_realize_ggplot()
If you use saved report objects created with older ggplot2 / patchwork versions, they may stop working. Either recompute them or temporarily downgrade:
remotes::install_version("patchwork", version = "1.3.0")
remotes::install_version("ggplot2", version = "3.5.2")
This workaround is only intended for restoring compatibility with previously saved report objects. No guarantee is given and you use it at your own risk. For long-term use, we recommend recomputing the reports with current package versions.
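Before deciding whether to downgrade, it may help to check whether the installed versions are actually newer than the ones named in the commands above. The following is a minimal sketch in base R only. `is_newer_than()` is a hypothetical helper introduced here for illustration and is not part of dataquieR, and treating "newer than the pinned version" as the trigger for a downgrade is an assumption.

```r
# Hypothetical helper (not part of dataquieR): TRUE if `installed` is a
# strictly newer version than `pinned`.
is_newer_than <- function(installed, pinned) {
  utils::compareVersion(as.character(installed), pinned) > 0
}

# Versions taken from the downgrade commands above.
pinned <- c(patchwork = "1.3.0", ggplot2 = "3.5.2")

for (pkg in names(pinned)) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    installed <- utils::packageVersion(pkg)
    relation <- if (is_newer_than(installed, pinned[[pkg]])) {
      " is newer than "
    } else {
      " is not newer than "
    }
    message(pkg, " ", as.character(installed), relation, pinned[[pkg]])
  } else {
    message(pkg, " is not installed")
  }
}
```

If neither package is reported as newer, the downgrade is probably not what is preventing old report objects from loading.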
character first (controlled via option dataquieR.old_factor_handling)
cross-item_level metadata
HARD_LIMITS, SOFT_LIMITS, DETECTION_LIMITS).
PCT_acc_ud_loc from acc_margins() and grading rulesets
FLG_acc_ud_loc from acc_margins()
des_summary():
  "No. categories/Freq. table" split into "No. categories (incl. NAs)" and "Level_freq"
  "Variables" renamed "Variable_names"
plain_label)
GGally
VALUE_LABELS
prep_init_parallel_print() removed (no longer needed)

News

threshold_value from acc_varcomp()
loess and margins plot slightly improved

Amendment to 2.5.0 news

threshold_value from acc_varcomp()

New features

dq_report2() can store results on disk instead of in RAM with the
new argument storr_factory. This can reduce memory consumption, but we
suggest using fast SSDs or NVMe drives.
options(dataquieR.dontwrapresults = TRUE). With options(dataquieR.testdebug = TRUE), you can switch off this
behavior.
dataquieR can provision your function arguments from the metadata.
In order to enable lapply and Vectorize(SIMPLIFY = FALSE) with
indicator functions, the first argument is now always
resp_vars for item level functions.
dataquieR tries to guess whether a function that features both resp_vars and
study_data as its first arguments was called without resp_vars but with
study_data as its first unnamed argument. If that is the case, it sets
resp_vars to its default (typically all variables).
With options(dataquieR.testdebug = TRUE), you can switch off this
behavior if needed.
dq_report_by, in which it is possible to specify:
  resp_vars)
  id_vars)
int_encoding_errors checking invalid characters present in the text with respect to the expected character encoding / code page, e.g., a code point from the latin1 table is used although the encoding is utf8, resulting in damaged text output
Item-level data quality dashboard, usable to customize data summaries
CODE_LIST_TABLE in the metadata, where it is possible to state both value label tables and missing list tables all in one table.
item_computation_level in the metadata, where it is possible to state variables to be computed from the provided study data.

Breaking changes

If you use prep_get_data_frame("ship") or prep_get_data_frame("study_data") in your code to access example data, no change is needed. If you are still accessing example data using system.file() (e.g. using load(system.file("extdata", "study_data.RData", package = "dataquieR"))), you need to switch to prep_get_data_frame(), i.e.:
load(system.file("extdata", "study_data.RData", package = "dataquieR"))
would become
study_data <- prep_get_data_frame("study_data")
SummaryData in ResultData (functions: acc_shape_or_scale, acc_margins, com_segment_missingness)
GRADING from SummaryData outputs. SummaryTable outputs still feature the column, since these are meant to be a machine-readable interface
con_contradictions_redcap used to return a result named SummaryTable, while the documentation referred to SummaryData. It should have been VariableGroupTable in both cases. If you relied on SummaryTable in the results of con_contradictions_redcap, you need to change your code to use the correct output name VariableGroupTable. Also, the table has been slightly modified.
VariableGroupData as returned by con_contradictions_redcap is a version optimized for human readers.
VariableGroupTable as returned by con_contradictions_redcap the column category has been renamed to CONTRADICTION_TYPE
con_contradictions_redcap, if summarize_categories is selected the result will now be in a sub-list named Other
prep_add_computed_variables, the column resp_vars is now named VAR_NAMES, to be more in line with other data frames.

Reporting

plotly's interactive figures
[.dataquieR_resultset2 and [[.dataquieR_result and related functions have changed slightly. For a report (r <- dq_report2(...)), you can now call, e.g.,
r[, "com_item_missingness", "ReportSummaryTable"] to get a balloon plot or
r[, "com_item_missingness", "SummaryData"] to get a table, for all variables that were assessed with com_item_missingness() in the report r
dataquieR_result objects, these will be combined, but due to restrictions in R, this only works if you call print() explicitly on this list, not with "auto-printing" (see https://stackoverflow.com/a/53983005), for example:
a <- lapply(c("v00001", "v00004", "v00005", "v00006"), acc_loess, meta_data_v2 = "meta_data_v2", study_data = "study_data")
print(a) works, but typing a alone does not. You have to call print() or put lapply() in parentheses: (lapply())

(Indicator) Functions related

acc_distributions() was split into acc_distributions() and acc_distributions_ecdf() (prep_acc_distributions_with_ecdf() creates the original plot)
acc_cat_distributions()
meta_data_v2 argument
item_level, as a synonym for meta_data, new argument segment_level, as a synonym for meta_data_segment, new argument dataframe_level, as a synonym for meta_data_dataframe, new argument cross-item_level, as a synonym for meta_data_cross_item, new argument item_computation_level, as a synonym for meta_data_item_computation
label_col, the label_col will now default to LABEL, unless you set the option options(dataquieR.testdebug = TRUE) or options(dataquieR.dontwrapresults = TRUE)
resp_vars in prep_scalelevel_from_data_and_metadata() never worked correctly and was not used either, so it has been deprecated. It is not functional and never was.
des_summary is still present, but you can now get results for continuous or categorical variables only, using des_summary_continuous and des_summary_categorical, respectively
con_contradictions_redcap plot colors vary depending on CONTRADICTION_TYPES
acc_loess() uses lowess instead of loess (both from the stats package)

General

prep_check_for_dataquieR_updates(), so you may need to manually install the latest beta release using devtools::install_gitlab("libreumg/dataquieR", auth_token = NULL)
options(dataquieR.ELEMENT_MISSMATCH_CHECKTYPE = "subset_u") is now the default assuming a one-fits-all-metadata-file (see ? dataquieR.ELEMENT_MISSMATCH_CHECKTYPE)
rlang or withr, most prominently a faster prep_prepare_dataframes() and rlang compatible condition (error) handling.
dataquieR_result class, which is now also applied to results outside a pipeline.
SEGMENT_ID_TABLE to SEGMENT_ID_REF_TABLE in segment level metadata
dq_report_by files structure
HTML reports
CODE_INTERPRET changed to be in line with the AAPOR definitions, so the following translation: PP -> P; P -> I; OH -> UO
prep_save_report and prep_load_report
HTML/JS output for Firefox
plot.ly-plots
gginnards installed; removed dependency from gginnards.
robustbase about doScale
dq_report2 reports
summarytools are included in dq_report2 reports, if installed.
HTML generation prepared
dq_report2 using a queue improves speed
VARIABLE_ROLES in dq_report2 and suppressing helper variable outputs in dq_report_by
dq_report2 and not directly by the user
dq_report2 because it is not so useful in its current implementation
dq_report_by for large reports (can write and optionally render results to disk rather than returning them)
dq_report_by causing DATA_PROCESS not to work
TODO's in dq_report_by and add dependent variables on the fly but with VARIABLE_ROLE suppress:
dq_report_by
dq_report_by
filter_result_slots in dq_report2)
JS-table prevented controlling the table
VARIABLE_ROLES filtered items
UNIVARIATE_OUTLIER_CHECKTYPE and MULTIVARIATE_OUTLIER_CHECKTYPE
REDCap syntax: strictly_successive_dates and successive_dates
REDCap rules and NA handling and DATA_PROCESS.
use_value_labels is not supported anymore. You can specify the behavior on the rules level in the new cross-item-level metadata column DATA_PREPARATION
END_DIGIT_CHECK in dq_report2, (DATA_ENTRY_TYPE is still supported and auto-converted). If missing, END_DIGIT_CHECK defaults to FALSE
NA were in the data
JUMP_LIST could be added to the item-level metadata if missing, but causing this type of failing rules
Windows and uncommon variable names
prep_load_workbook_like_file and meta_data_v2 = formal in dq_report2) supporting http and https URLs (e.g., Excel or OpenOffice workbooks)
dq_report2 replaces dq_report. Please use dq_report2 from now on.
htmltools and supports plotly)
data.frame, and cross-item levels). No required action by user, previous version still supported
REDCap rules for contradictions (cross-item level metadata), previous contradictions function still supported
data.frame-level metadata)
AAPOR concept
acc_univariate_outlier and acc_multivariate_outlier now allow selecting the methods used to flag outliers
whoami is installed, reports now show a more suitable user name
~ from the ggplot2 updates causing acc_margins to fail for categorical variables
dq_report reports with wrong brackets
ggplot2 3.4.0
ORCIDs for two authors
CITATION file
README.md file adding the funding sources.
NEWS.md file
sigmagap and made missing guessing more robust.
logical.
acc_margins.
GRADING columns.
rbind.ReportSummaryTable since these are not needed anyway and the inherited documentation for those arguments rbind from base contains an invalid URL triggering a NOTE.
int_datatype_matrix.
prep_study2meta can now also convert factors to dataquieR compatible meta_data/study_data
com_item_missingness for textual response variables.
DT JS is always loaded when a dq_report report is rendered
com_segment_missingness with strata_vars / group_vars did not work
label_col was set to something other than LABEL, strata_vars did not work for com_unit_missingness
dq_report.
cowplot to patchwork in acc_margins yielding figures that can be manipulated more easily. Please note that this change could break existing output manipulations, since the structure of the margins plots has changed internally. However, output manipulations were hardly possible for margins plots before, so it is unlikely that any pipelines are affected.
acc_loess function.
prep_create_meta handling length-0 arguments by ignoring these variable attributes at all.
con_inadmissible_categorical (one resp_var only and value-limits all the same for all resp_vars)
README-File
pandoc-less systems
dataquieR function was called by a generated function f that lives in an environment directly inheriting from the empty environment, e.g. environment(f) <- new.env(parent = emptyenv()).
dontrun, because they sometimes caused NOTEs on rhub.
SummaryTable entry of a result within a dq_report output, the summary and also print generic did not work on the report.
devtools::check(cran = TRUE, env_vars = c(NOT_CRAN = "false")) takes 2:22 minutes now.
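
The entry above about a generated function f living in an environment that inherits directly from the empty environment describes an unusual setup. As a minimal base-R sketch of what such an environment implies (no dataquieR calls are made, and the body of f is made up for illustration), even basic operators such as `+` cannot be resolved from inside f:

```r
# A function whose enclosing environment has the empty environment as parent.
# The body is purely illustrative.
f <- function(x) x + 1
environment(f) <- new.env(parent = emptyenv())

# Calling f() signals an error, because `+` cannot be found when the lookup
# chain ends in the empty environment.
res <- tryCatch(f(1), error = function(e) e)
print(inherits(res, "error"))
print(conditionMessage(res))
```

Any code called from such a function therefore cannot assume that base functions are reachable through the caller's environment chain.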