VIF (variance inflation factor) measures how much the variance of the estimated regression coefficients is inflated. It helps to identify predictor variables that are linearly related. You have to decide which variable should be deleted. As a rule of thumb, values higher than about 10 indicate a collinearity problem.
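To make the idea of "inflation" concrete, here is a minimal base R sketch of the textbook definition for a single numeric predictor: regress that predictor on the other predictors and take 1 / (1 - R^2). This illustrates the general concept only; it is not taken from this package's implementation, and the choice of `Temp`, `Solar.R` and `Wind` from the built-in `airquality` data is an assumption made for the example.

```r
# Textbook VIF for one numeric predictor (illustration only, base R).
# Keep complete cases so the auxiliary regression uses a consistent sample.
aq <- airquality[complete.cases(airquality), ]

# Auxiliary regression: the predictor of interest on the remaining predictors.
aux <- lm(Temp ~ Solar.R + Wind, data = aq)
r2 <- summary(aux)$r.squared

# The closer r2 is to 1, the larger the inflation of the coefficient variance.
vif_temp <- 1 / (1 - r2)
vif_temp
```

A value close to 1 means the predictor is nearly uncorrelated with the others; the value grows without bound as the predictor approaches an exact linear combination of them.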
VIF(x, posit_y, posit_x, correct = FALSE)
# S3 method for data.frame
VIF(x, posit_y, posit_x, correct = FALSE)
# S3 method for data.table
VIF(x, posit_y, posit_x, correct = FALSE)
# S3 method for matrix
VIF(x, posit_y, posit_x, correct = FALSE)
x: a numeric matrix or data.frame/data.table (factor/character/numeric) - variables
posit_y: an integer/character - a position/name of the dependent variable. This variable is taken into account only for getting complete cases.
posit_x: an integer/character vector - positions/names of the independent variables
correct: a boolean - basic (FALSE) or corrected (TRUE) VIF - Default: FALSE
A numeric vector with the VIF values for all variables provided by posit_x
VIF(data.frame)
VIF(data.table)
VIF(matrix)
vif_corrected = vif_basic^(1/(2*df))
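The correction above can be tried out with plain arithmetic. The sketch below uses made-up basic VIF values and assumes that `df` is the number of coefficients a term contributes to the model (1 for a numeric variable, number of levels minus 1 for a factor). That reading of `df` is an assumption, not something stated on this page.

```r
# Hypothetical basic VIF values, for illustration only.
vif_basic <- c(Temp = 12.5, Month = 4.0)

# Assumed degrees of freedom: a numeric variable contributes 1 coefficient,
# a factor with 5 levels contributes 4.
df <- c(Temp = 1, Month = 4)

# Apply the correction term by term.
vif_corrected <- vif_basic^(1 / (2 * df))
vif_corrected
```

Under this assumption, a term with `df = 1` has a corrected value equal to the square root of the basic value, which puts single-coefficient and multi-coefficient terms on a comparable scale.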
if (FALSE) {
library(miceFast)
library(data.table)
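# Work on a copy of the built-in airquality data
airquality2 <- airquality
# Add a squared term, which is strongly collinear with Temp
airquality2$Temp2 <- airquality2$Temp^2
# Treat Month as a categorical predictor
airquality2$Month <- factor(airquality2$Month)
data_DT <- data.table(airquality2)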
# Basic VIF, variables selected by name
data_DT[, .(vifs = VIF(
x = .SD,
posit_y = "Ozone",
posit_x = c("Solar.R", "Wind", "Temp", "Month", "Day", "Temp2"),
correct = FALSE
))][["vifs.V1"]]
# Corrected VIF, the same variables selected by position
data_DT[, .(vifs = VIF(
x = .SD,
posit_y = 1,
posit_x = c(2, 3, 4, 5, 6, 7),
correct = TRUE
))][["vifs.V1"]]
}