Kota Hattori, thank you for your feedback and for motivating me for this deep update.
New features
- pool() function for combining results from multiply imputed datasets (Rubin’s rules, Barnard-Rubin df adjustment). Works with lm, glm, and other models that support coef() and vcov(). Validated against mice.
- print and summary methods for pooled results.
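
For readers unfamiliar with the pooling step named in the pool() entry above, the following is a minimal sketch of Rubin's rules with the Barnard-Rubin degrees-of-freedom adjustment for a single coefficient. It is written in Python purely for illustration and is not the package's implementation; the function name pool_scalar, its arguments, and the returned keys are hypothetical.

```python
from statistics import mean, variance


def pool_scalar(estimates, variances, df_com):
    """Pool one coefficient across m imputed datasets (hypothetical helper).

    estimates: per-imputation point estimates of the coefficient
    variances: per-imputation squared standard errors
    df_com:    residual degrees of freedom of the complete-data model
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # total variance

    # Proportion of the total variance attributable to missingness.
    lam = (1 + 1 / m) * b / t

    # Barnard-Rubin: combine the classic Rubin df with an observed-data df.
    df_obs = (df_com + 1) / (df_com + 3) * df_com * (1 - lam)
    if lam == 0:
        df = df_obs                # no between-imputation variance
    else:
        df_old = (m - 1) / lam ** 2
        df = df_old * df_obs / (df_old + df_obs)

    return {"estimate": q_bar, "total_var": t, "lambda": lam, "df": df}


if __name__ == "__main__":
    print(pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], df_com=100))
```

The adjusted df is always smaller than both the classic Rubin df and the observed-data df, which is what keeps intervals from being too narrow in small samples.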
Bug fixes
- fixed residual variance estimator in lm_noise and lm_bayes stochastic models: divisor changed from n-p-1 to n-p, where p already counts the intercept column supplied by the user. The previous formula over-corrected by one degree of freedom.
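
The effect of the divisor change can be seen on a toy regression. The sketch below, in Python for illustration only, fits y = a + b*x in closed form and compares the two divisors; here p = 2 because the design matrix has an intercept column and one predictor. The function name residual_variances and the data are hypothetical and not taken from the package.

```python
def residual_variances(x, y):
    """Return (RSS / (n - p), RSS / (n - p - 1)) for y = a + b * x.

    p counts every column of the design matrix, intercept included,
    so p = 2 in this simple regression.
    """
    n = len(x)
    p = 2
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    corrected = rss / (n - p)        # divisor after the fix
    previous = rss / (n - p - 1)     # divisor before the fix, one df too few
    return corrected, previous


if __name__ == "__main__":
    new, old = residual_variances([0, 1, 2, 3], [0, 1, 1, 3])
    print(new, old)
```

With only four observations the old divisor doubles the variance estimate, so the noise added by a stochastic model would have been noticeably too large in small samples.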
Documentation and internals
- new vignette on missing data mechanisms (MCAR/MAR/MNAR) and MI workflows.
- refactored introduction vignette with pool() examples.
- improved README with MI section and benchmark table.
- test suite for pool(), including comparison against mice::pool().
- new weighted regression validation test against lm.wfit().
- refactored C++ source code for clarity.
- fixed typos in error messages and documentation.
- regenerated performance benchmarks on R 4.4.3, macOS M3 Pro.
- CRAN-related update: OMP_THREAD_LIMIT.
- fixed CRAN Notes.
- style the cpp code.
- VIF() should be more stable.
- simplified naive_fill_NA; it is a regular sampling imputation now.
- fixed dontrun examples.
- replace ggplot2::aes_string with ggplot2::aes, as the former is deprecated.
- regenerate performance benchmarks on R 4.2.1.
- styler over the code.
- improve documentation.
- tinyverse world, fewer dependencies.
- fixed imputations for character variables under linear models.
- speed up the pmm model.
- more tests, higher covr.
- rerun performance tests.
- update URL inside README.
- improve coverage.
- use drop = FALSE when subsetting the data.frame
- healthy DESCRIPTION file, fix spaces.
- more input validation.
- update broken vignette links
- solve broken UpSetR::upset reference links
- upset_NA based on UpSetR::upset plot function
- compare_imp plot function
- new logo
- remove times argument
- R CRAN r-oldrel-windows-ix86+x86_64 problems
- fill_NA_N has a new model which is pmm - predictive mean matching
- fast PMM - presorting and binary search
- naive_fill_NA - auto function for data.frames - bayes mean and lda
- ridge argument for lm models - adding small disturbance to diag of X’X
- lm_bayes provide more disturbance
- new tests
- codecov
- remove old URLs from vignettes
- providing a more comfortable environment for data.table/dplyr users
- expand vignette and documentation
- updated performance benchmarks
- fix a glitch - e.g. lack of correct warning for a lda model with zero variance variables
- data.table problem - jump to R 3.5.0
- valgrind - a lot of optimizations - problem with arma::exp and arma::randn
- optimize a lot of code
- methods/functions resistant to glitches
- fix imputations with a grouping variable - error if there is precisely one NA in any group
- add data.table to benchmarks - model with a grouping variable
- add R functions (fill_NA_N, fill_NA, VIF) which could be used by a data.table user
- add impute_N method - optimized multiple imputations
- add vif method - Variance inflation factors
- vignette,readme,description,todo
- adjust to solaris
- reference - set a grouping variable by reference, but as a numeric vector - an integer vector does not work (randomly lost pointer)
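
The "fast PMM - presorting and binary search" entry above describes a general technique that can be sketched as follows. This is an illustrative Python version under simple assumptions, not the package's C++ code: the function name pmm_impute, its arguments, and the tie-breaking rule are hypothetical. Observed predictions are sorted once, then each missing case locates its neighbourhood with a binary search instead of scanning all donors.

```python
import bisect
import random


def pmm_impute(pred_obs, y_obs, pred_mis, k=1, rng=None):
    """Predictive mean matching with presorted donors (hypothetical helper).

    pred_obs: model predictions for cases with observed y
    y_obs:    the observed y values (donors)
    pred_mis: model predictions for cases with missing y
    k:        number of nearest donors to sample from
    """
    if not pred_obs:
        raise ValueError("no donors available")
    rng = rng or random.Random()

    # Presort donors by predicted value: done once, O(n log n).
    order = sorted(range(len(pred_obs)), key=lambda i: pred_obs[i])
    preds = [pred_obs[i] for i in order]
    donors = [y_obs[i] for i in order]
    k = min(k, len(preds))

    imputed = []
    for target in pred_mis:
        # Binary search for the insertion point: O(log n) per missing case.
        right = bisect.bisect_left(preds, target)
        left = right - 1
        chosen = []
        # Expand outwards, always taking the closer of the two candidates.
        while len(chosen) < k:
            take_left = right >= len(preds) or (
                left >= 0 and target - preds[left] <= preds[right] - target
            )
            if take_left:
                chosen.append(left)
                left -= 1
            else:
                chosen.append(right)
                right += 1
        # Draw one donor at random among the k nearest.
        imputed.append(donors[rng.choice(chosen)])
    return imputed


if __name__ == "__main__":
    print(pmm_impute([3.0, 1.0, 2.0], [30, 10, 20], [1.9, 2.6, 0.0], k=1))
```

With k = 1 the result is deterministic (the single nearest donor); larger k trades a little matching accuracy for more variability between imputations.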