Several versions of y can be generated, and the residuals can be rescaled. The regression can also be fitted by a sparse matrix algorithm (see References).

Usage

IpsoExtra(
  y,
  x = NULL,
  ensureIntercept = TRUE,
  returnParts = FALSE,
  nRep = 1,
  resScale = NULL,
  digits = 9,
  rmse = NULL,
  sparseLimit = 500,
  printInc = TRUE
)

Arguments

y

Matrix of confidential variables

x

Matrix of non-confidential variables

ensureIntercept

Whether to ensure that a constant term is included. A non-NULL x is passed through EnsureIntercept.

returnParts

When TRUE, the output is instead two matrices: yHat (fitted values) and yRes (generated residuals).

nRep

Integer. When greater than 1, several versions of y are generated and returned as extra columns in the output.

resScale

The generated residuals are scaled by resScale.

digits

Number of digits used to detect a perfect fit (which occurs when fitted values are supplied as input). This check is made only when rmse is supplied. In the case of a perfect fit, rmse is used instead of resScale.

rmse

Desired root mean square error (residual standard error). Used when resScale is NULL or cannot be used (see parameter digits). This parameter forces the rmse value for one y variable only (the first).

sparseLimit

Limit for the number of rows of a reduced x-matrix within the algorithm. When this limit is exceeded, a sparse algorithm is used (see References).
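
The roles of resScale, rmse and digits can be illustrated with a small base R sketch. This is not the package's implementation: the helper names rmse_of and is_perfect are hypothetical, and the rounding-based perfect-fit check is an assumption about how a digits-style test could work.

```r
# Toy data: a design matrix with an intercept and one response (made-up values)
x <- cbind(1, 1:5)
y <- c(170, 290, 458, 342, 479)

fit <- lm.fit(x, y)          # ordinary least squares in base R
res <- fit$residuals
df  <- fit$df.residual       # residual degrees of freedom (n minus rank of x)

# Residual standard error as reported by summary(lm(...))
rmse_of <- function(r, df) sqrt(sum(r^2) / df)

# resScale-style rescaling: multiply the residuals by a constant
res_scaled <- 1e-05 * res

# rmse-style rescaling: force the residual standard error to a target value
target   <- 0.25
res_rmse <- res * target / rmse_of(res, df)
rmse_of(res_rmse, df)

# Hypothetical perfect-fit check: residuals vanish after rounding to 'digits'
is_perfect <- function(y, yHat, digits = 9) all(round(y - yHat, digits) == 0)

yHat <- fit$fitted.values
is_perfect(y, yHat)                            # ordinary data
is_perfect(yHat, lm.fit(x, yHat)$fitted.values)  # fitted values used as input
```

When the input is already a set of fitted values, the residuals are numerically zero, so multiplying them by any resScale changes nothing. That is why a target rmse is the only way to introduce noise in that situation.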

Value

Generated version of y
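
As a rough illustration of what a "generated version of y" can mean, the base R sketch below builds synthetic values as fitted values plus new random residuals that are orthogonal to x and have the same cross-product matrix as the original residuals. It is a sketch of the general idea under the assumption that the original residual matrix has full column rank. It is not taken from the package source, and all variable names and data values are illustrative.

```r
set.seed(42)                       # only to make this illustration reproducible
n <- 5
x <- cbind(1, 1:n)                 # design matrix with intercept
y <- cbind(y1 = c(170, 290, 458, 342, 479),
           y2 = c(536, 476, 499, 399, 700))

qx   <- qr(x)
yHat <- qr.fitted(qx, y)           # fitted values
yRes <- qr.resid(qx, y)            # original residuals

# Random noise with the part explained by x removed
e <- qr.resid(qx, matrix(rnorm(n * ncol(y)), n, ncol(y)))

# Orthonormal basis of the noise, still orthogonal to the columns of x
q <- qr.Q(qr(e))

# Re-colour so that crossprod(yResNew) equals crossprod(yRes)
r <- chol(crossprod(yRes))         # assumes full column rank of yRes
yResNew <- q %*% r

ySynth <- yHat + yResNew

# Regression coefficients and residual cross-products are preserved
all.equal(qr.coef(qx, ySynth), qr.coef(qx, y), check.attributes = FALSE)
all.equal(crossprod(ySynth - yHat), crossprod(yRes), check.attributes = FALSE)
```

Because the new residuals are orthogonal to x, refitting the regression on ySynth reproduces the original coefficients, while the individual values differ from y.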

References

Douglas Bates and R Development Core Team (2022), Comparing Least Squares Calculations, R Vignette, vignette("Comparisons", package="Matrix").
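
The vignette cited above compares ways of computing least squares fits, including one based on sparse matrices. A minimal sketch of that kind of calculation with the Matrix package follows. The grouping factor g and the simulated data are made up for illustration, and whether IpsoExtra performs exactly these steps internally is an assumption.

```r
library(Matrix)

set.seed(1)                                   # reproducible toy data
n <- 200
g <- factor(sample(letters[1:20], n, replace = TRUE))  # hypothetical grouping factor
y <- rnorm(n)

# Sparse dummy-variable design matrix: one column per observed level of g
X <- sparse.model.matrix(~ 0 + g)

# Least squares via the normal equations, using sparse cross-products
beta_sparse <- solve(crossprod(X), crossprod(X, y))

# Dense reference computation for comparison
beta_dense <- qr.coef(qr(as.matrix(X)), y)

all.equal(as.vector(beta_sparse), unname(beta_dense))
```

With many dummy variables most entries of the design matrix are zero, so the sparse cross-products avoid storing and multiplying those zeros.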

Author

Øyvind Langsrud

Examples

x <- matrix(1:5, 5, 1)
y <- matrix(10 * (sample(7:39, 15) + 4 * (1:15)), 5, 3)
colnames(y) <- paste("y", 1:3, sep = "")
y1 <- y[, 1, drop = FALSE]

IpsoExtra(y, x)  # Same as RegSDCipso(y, x)
#>            y1       y2       y3
#> [1,] 170.6112 536.1386 601.1863
#> [2,] 289.6568 475.8284 799.4067
#> [3,] 458.1592 499.4642 920.2322
#> [4,] 342.2664 399.0320 696.5705
#> [5,] 479.3064 699.5368 862.6043

IpsoExtra(y, x, resScale = 0)  # Fitted values (whole numbers in this case)
#>       y1  y2  y3
#> [1,] 214 472 692
#> [2,] 281 497 734
#> [3,] 348 522 776
#> [4,] 415 547 818
#> [5,] 482 572 860
IpsoExtra(y, x, nRep = 2, resScale = 1e-05)  # Downscaled residuals
#>            y1       y2       y3       y1       y2       y3
#> [1,] 213.9996 472.0002 691.9991 213.9997 472.0004 691.9999
#> [2,] 281.0010 497.0005 734.0018 280.9998 496.9987 733.9993
#> [3,] 347.9995 521.9983 775.9995 348.0005 522.0015 776.0005
#> [4,] 414.9994 547.0011 817.9993 415.0009 546.9995 818.0016
#> [5,] 482.0004 571.9999 860.0003 481.9991 572.0000 859.9987

ySynth <- IpsoExtra(y1, x, nRep = 2, rmse = 0.25)  # Residuals rescaled so that rmse = 0.25
summary(lm(ySynth ~ x))  # Identical regression results with Residual standard error: 0.25
#> Response y1 :
#> 
#> Call:
#> lm(formula = y1 ~ x)
#> 
#> Residuals:
#>        1        2        3        4        5 
#> -0.24471  0.34076 -0.03998  0.03651 -0.09258 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 147.00000    0.26220   560.6 1.25e-08 ***
#> x            67.00000    0.07906   847.5 3.62e-09 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.25 on 3 degrees of freedom
#> Multiple R-squared:      1,	Adjusted R-squared:      1 
#> F-statistic: 7.182e+05 on 1 and 3 DF,  p-value: 3.623e-09
#> 
#> 
#> Response y1 :
#> 
#> Call:
#> lm(formula = y1 ~ x)
#> 
#> Residuals:
#>        1        2        3        4        5 
#>  0.09374 -0.30613  0.18605  0.17133 -0.14499 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 147.00000    0.26220   560.6 1.25e-08 ***
#> x            67.00000    0.07906   847.5 3.62e-09 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.25 on 3 degrees of freedom
#> Multiple R-squared:      1,	Adjusted R-squared:      1 
#> F-statistic: 7.182e+05 on 1 and 3 DF,  p-value: 3.623e-09
#> 
#> 

IpsoExtra(fitted(lm(y1 ~ x)), x, nRep = 2, resScale = 0.1)  # resScale has no effect since the fit is perfect
#>   [,1] [,2]
#> 1  214  214
#> 2  281  281
#> 3  348  348
#> 4  415  415
#> 5  482  482
IpsoExtra(fitted(lm(y1 ~ x)), x, nRep = 2, resScale = 0.1, rmse = 2)  # with warning
#> Warning: rmse used instead of resScal since perfect fit.
#>       [,1]     [,2]
#> 1 215.2762 212.9805
#> 2 280.2639 281.6310
#> 3 347.7186 347.8100
#> 4 412.6665 417.5651
#> 5 484.0749 480.0134

# Using data in the paper
IpsoExtra(RegSDCdata("sec7y"), RegSDCdata("sec7x"))  # Similar to Y*
#>                freq
#> row1_col1 -1.097376
#> row2_col1 10.752402
#> row3_col1 12.000000
#> row4_col1 12.344973
#> row1_col2 15.097376
#> row2_col2  4.902624
#> row3_col2 22.000000
#> row4_col2 19.000000
#> row1_col3 32.000000
#> row2_col3  7.344973
#> row3_col3  7.655027
#> row4_col3 16.000000
#> row1_col4 30.000000
#> row2_col4  8.000000
#> row3_col4 -3.655027
#> row4_col4  8.655027
IpsoExtra(RegSDCdata("sec7y"), RegSDCdata("sec7x"), rmse = 1)
#>                freq
#> row1_col1  4.423115
#> row2_col1  3.772169
#> row3_col1 12.000000
#> row4_col1 13.804716
#> row1_col2  9.576885
#> row2_col2 10.423115
#> row3_col2 22.000000
#> row4_col2 19.000000
#> row1_col3 32.000000
#> row2_col3  8.804716
#> row3_col3  6.195284
#> row4_col3 16.000000
#> row1_col4 30.000000
#> row2_col4  8.000000
#> row3_col4 -2.195284
#> row4_col4  7.195284