Commit 453a9e18 authored by Keyser, Johannes

Add R² (coeff of determination) to output

parent b91d8ae4
Tags: 1.3.0.0
@@ -48,6 +48,7 @@ Note how a few "unlucky" outliers can bias the least squares estimate (LS), but
 - October 2014 by Z. Danziger: Original version.
 - September 2015 by Z. Danziger: Updated help, speed increase for 2D case
 - March 2022 by J. Keyser: Adjusted formatting, added documentation, improved example and added plot, replaced `nanmedian(X)` with `median(X, 'omitnan')`, removed 2D special case, restructured input and output parameters.
+- August 2022 by J. Keyser: Explicit omission of identical x coordinates, added coefficient of determination (unadjusted R²) as optional output.
 ## Contributing and project status
...
-function coef = TheilSen(X, y)
+function [coef, rsqrd] = TheilSen(X, y)
 % THEILSEN performs Theil-Sen robust simple linear regression(s) of X on y.
 %
 % For convenience, the input X may contain more than one predictor variable;
@@ -18,13 +18,14 @@ function coef = TheilSen(X, y)
 % y: A column vector containing the observations of the response variable.
 %
 % OUTPUT
-% coef: Estimated regression coefficients for each predictor column in X, with
-%       respect to the response variable y. Each column in coef corresponds
-%       to one predictor in X, i.e. it will have as many columns as X.
+% coef: Estimated regression coefficients for each predictor column in X,
+%       with respect to the response variable y. Each column in coef
+%       corresponds to one predictor in X, i.e. it has as many columns as X.
 %       The first row, i.e. coef(1, :), contains the estimated offset(s).
 %       The second row, i.e. coef(2, :), contains the estimated slope(s).
 %       (This output format was chosen to avoid confusion, e.g. with previous
 %       versions of this code.)
+% rsqrd: Ordinary R² (coefficient of determination) per predictor column in X.
 %
 % EXAMPLE
 %   See accompanying file example.m.
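A minimal usage sketch of the updated signature, on toy data made up for illustration (not taken from the repository):

x = (1:20)';                          % single predictor column
y = 3 + 0.5 * x + 0.1 * randn(20, 1); % response with a linear trend plus noise
coef = TheilSen(x, y);                % row 1: offset(s), row 2: slope(s)
[coef, rsqrd] = TheilSen(x, y);       % also request the new ordinary R² output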
@@ -109,4 +110,17 @@ if any(isnan(coef))
     warning('TheilSen:NaNoutput', ...
         'Output contains NaN; check for degenerate inputs.')
 end
+% If requested, output the ordinary/unadjusted R², per predictor column in X.
+if nargout > 1
+    % compute total sum of squares relative to the mean
+    sum_sq_total = sum(power(y - mean(y), 2));
+    sums_sq_total = repmat(sum_sq_total, 1, Num_Pred);
+    % compute the residual sum(s) of squares relative to the regression line(s)
+    ys_model = b0s + b1s .* X;
+    ys_data = repmat(y, 1, Num_Pred);
+    sums_sq_resid = sum(power(ys_data - ys_model, 2));
+    % compute R² as 1 - SSresid / SStotal (i.e., the unadjusted version of R²)
+    rsqrd = 1 - sums_sq_resid ./ sums_sq_total;
+end
 end
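For context on the new block: if b0s and b1s are 1-by-P row vectors of per-predictor offsets and slopes (an assumption; they are defined in code not shown in this diff) and X is N-by-P, then b0s + b1s .* X expands to an N-by-P matrix of fitted values, one column per predictor. A self-contained sketch of the same R² arithmetic on made-up numbers:

X = [1 10; 2 20; 3 30; 4 40];            % toy design: N = 4 observations, P = 2 predictors
y = [2.1; 3.9; 6.2; 7.8];                % toy response
b0s = [0.1, 0.1];                        % assumed per-column offsets (1-by-P)
b1s = [2.0, 0.2];                        % assumed per-column slopes  (1-by-P)
ys_model = b0s + b1s .* X;               % N-by-P fitted values (implicit expansion)
ys_data  = repmat(y, 1, size(X, 2));     % replicate y so the subtraction is per column
ss_resid = sum((ys_data - ys_model).^2); % 1-by-P residual sums of squares
ss_total = sum((y - mean(y)).^2);        % scalar, identical for every column
r2 = 1 - ss_resid ./ ss_total            % 1-by-P ordinary (unadjusted) R²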
@@ -30,7 +30,7 @@ data_y(outlr_idx) = outlr_y;
 est_ls = [ones(N_total, 1), data_x] \ data_y;
 % Estimate Theil-Sen parameters.
-est_ts = TheilSen(data_x, data_y);
+[est_ts, ~] = TheilSen(data_x, data_y);
 % Plot everything and add comparison of estimates to title.
 figure()
@@ -52,9 +52,9 @@ data_x1 = data_x;
 data_x2 = +10 - 2 * data_x1 + randn(size(data_x1)) * SDx_usual;
 data_x3 = -50 + 5 * data_x1 + randn(size(data_x1)) * SDx_usual;
 X = [data_x1, data_x2, data_x3];
-est_ts = TheilSen(X, data_y);
-% plot both simple regressions; note mainly the difference between x-axes
+[est_ts, r_sqrd] = TheilSen(X, data_y);
+% plot all simple regressions; note mainly the difference between x-axes
 num_pred = size(X, 2);
 figure()
 for pp = 1:num_pred
...
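The body of the plotting loop is truncated above. Purely as a hypothetical illustration (not part of this commit), the new per-predictor r_sqrd output could be surfaced in the example by reporting it in each subplot title, for instance:

% Hypothetical sketch: annotate each simple regression with its ordinary R².
for pp = 1:num_pred
    subplot(1, num_pred, pp)
    plot(X(:, pp), data_y, 'k.')
    hold on
    x_rng = [min(X(:, pp)), max(X(:, pp))];
    plot(x_rng, est_ts(1, pp) + est_ts(2, pp) * x_rng, 'r-')  % offset + slope * x
    title(sprintf('Predictor %d: R^2 = %.2f', pp, r_sqrd(pp)))
end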