The calculation of Ridge Regression is not coming in correctly. #342

Open
sr13579 opened this issue Jul 30, 2024 · 0 comments
sr13579 commented Jul 30, 2024

Please take a look at this code (this is the Ridge class):

<?php

namespace Rubix\ML\Regressors;

use Tensor\Matrix;
use Tensor\Vector;
use Rubix\ML\Learner;
use Rubix\ML\DataType;
use Rubix\ML\Estimator;
use Rubix\ML\Persistable;
use Rubix\ML\RanksFeatures;
use Rubix\ML\EstimatorType;
use Rubix\ML\Helpers\Params;
use Rubix\ML\Datasets\Dataset;
use Rubix\ML\Traits\AutotrackRevisions;
use Rubix\ML\Specifications\DatasetIsLabeled;
use Rubix\ML\Specifications\DatasetIsNotEmpty;
use Rubix\ML\Specifications\SpecificationChain;
use Rubix\ML\Specifications\DatasetHasDimensionality;
use Rubix\ML\Specifications\LabelsAreCompatibleWithLearner;
use Rubix\ML\Specifications\SamplesAreCompatibleWithEstimator;
use Rubix\ML\Exceptions\InvalidArgumentException;
use Rubix\ML\Exceptions\RuntimeException;

use function is_null;

/**
 * Ridge
 *
 * L2 regularized least squares linear model solved using a closed-form solution. The addition
 * of regularization, controlled by the *l2Penalty* parameter, makes Ridge less prone to overfitting
 * than ordinary linear regression.
 *
 * @category    Machine Learning
 * @package     Rubix/ML
 * @author      Andrew DalPino
 */
class Ridge implements Estimator, Learner, RanksFeatures, Persistable
{
    use AutotrackRevisions;

    /**
     * The strength of the L2 regularization penalty.
     *
     * @var float
     */
    protected float $l2Penalty;

    /**
     * The y intercept i.e. the bias added to the decision function.
     *
     * @var float|null
     */
    protected ?float $bias = null;

    /**
     * The computed coefficients of the regression line.
     *
     * @var Vector|null
     */
    protected ?Vector $coefficients = null;

    /**
     * @param float $l2Penalty
     * @throws InvalidArgumentException
     */
    public function __construct(float $l2Penalty = 1.0)
    {
        if ($l2Penalty < 0.0) {
            throw new InvalidArgumentException('L2 penalty must be'
                . " greater than or equal to 0, $l2Penalty given.");
        }

        $this->l2Penalty = $l2Penalty;
    }

    /**
     * Return the estimator type.
     *
     * @internal
     *
     * @return EstimatorType
     */
    public function type() : EstimatorType
    {
        return EstimatorType::regressor();
    }

    /**
     * Return the data types that the estimator is compatible with.
     *
     * @internal
     *
     * @return list<\Rubix\ML\DataType>
     */
    public function compatibility() : array
    {
        return [
            DataType::continuous(),
        ];
    }

    /**
     * Return the settings of the hyper-parameters in an associative array.
     *
     * @internal
     *
     * @return mixed[]
     */
    public function params() : array
    {
        return [
            'l2 penalty' => $this->l2Penalty,
        ];
    }

    /**
     * Has the learner been trained?
     *
     * @return bool
     */
    public function trained() : bool
    {
        return $this->coefficients and isset($this->bias);
    }

    /**
     * Return the weights of features in the decision function.
     *
     * @return (int|float)[]|null
     */
    public function coefficients() : ?array
    {
        return $this->coefficients ? $this->coefficients->asArray() : null;
    }

    /**
     * Return the bias added to the decision function.
     *
     * @return float|null
     */
    public function bias() : ?float
    {
        return $this->bias;
    }

    /**
     * Train the learner with a dataset.
     *
     * @param \Rubix\ML\Datasets\Labeled $dataset
     */
    public function train(Dataset $dataset) : void
    {
        SpecificationChain::with([
            new DatasetIsLabeled($dataset),
            new DatasetIsNotEmpty($dataset),
            new SamplesAreCompatibleWithEstimator($dataset, $this),
            new LabelsAreCompatibleWithLearner($dataset, $this),
        ])->check();

        $biases = Matrix::ones($dataset->numSamples(), 1);

        $x = Matrix::build($dataset->samples())->augmentLeft($biases);
        $y = Vector::build($dataset->labels());

        /** @var int<0,max> $nHat */
        $nHat = $x->n() - 1;

        $penalties = array_fill(0, $nHat, $this->l2Penalty);
        
        array_unshift($penalties, 0.0);

        $penalties = Matrix::diagonal($penalties);

        $xT = $x->transpose();

        print_r($penalties); // debug: the diagonal penalty matrix

        print_r($xT->matmul($x)->add($penalties)->inverse()); // debug: the inverse in question

        $coefficients = $xT->matmul($x)
            ->add($penalties)
            ->inverse()
            ->dot($xT->dot($y))
            ->asArray();

        $this->bias = (float) array_shift($coefficients);
        $this->coefficients = Vector::quick($coefficients);
    }

    /**
     * Make a prediction based on the line calculated from the training data.
     *
     * @param Dataset $dataset
     * @throws RuntimeException
     * @return list<int|float>
     */
    public function predict(Dataset $dataset) : array
    {
        if (!$this->coefficients or is_null($this->bias)) {
            throw new RuntimeException('Estimator has not been trained.');
        }

        DatasetHasDimensionality::with($dataset, count($this->coefficients))->check();

        return Matrix::build($dataset->samples())
            ->dot($this->coefficients)
            ->add($this->bias)
            ->asArray();
    }

    /**
     * Return the importance scores of each feature column of the training set.
     *
     * @throws RuntimeException
     * @return float[]
     */
    public function featureImportances() : array
    {
        if (is_null($this->coefficients)) {
            throw new RuntimeException('Learner has not been trained.');
        }

        return $this->coefficients->abs()->asArray();
    }

    /**
     * Return the string representation of the object.
     *
     * @internal
     *
     * @return string
     */
    public function __toString() : string
    {
        return 'Ridge (' . Params::stringify($this->params()) . ')';
    }
}
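For reference, the `train()` method above implements the closed-form ridge solution w = (XᵀX + P)⁻¹Xᵀy, where X is the sample matrix augmented with a leading bias column and P is a diagonal matrix with 0 in the bias position and `l2Penalty` everywhere else. Here is a minimal NumPy sketch of the same computation on the sample data from this issue, useful as a library-independent cross-check; it assumes the constructor default `l2Penalty` of 1.0, since the issue doesn't say which value was used:

```python
import numpy as np

# Sample data from this issue.
x = np.array([[1, 2], [2, 3], [3, 4], [4, 5]], dtype=float)
y = np.array([5, 7, 9, 11], dtype=float)

# Prepend a column of ones for the bias term, mirroring
# Matrix::ones(...)->augmentLeft(...) in train() above.
xa = np.hstack([np.ones((x.shape[0], 1)), x])

# Diagonal penalty matrix: 0 for the bias, l2Penalty for each feature.
l2_penalty = 1.0  # assumption: the constructor default; substitute the value you used
p = np.diag([0.0] + [l2_penalty] * x.shape[1])

a = xa.T @ xa + p        # the matrix whose inverse is printed in train()
a_inv = np.linalg.inv(a)

w = a_inv @ xa.T @ y     # [bias, coefficient_1, coefficient_2]
print(a_inv)
print(w)                 # with l2_penalty = 1.0: approx. [2.545, 0.909, 0.909]
```

Note that the result depends strongly on `l2_penalty`, so any comparison between tools only makes sense with the same penalty on both sides.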

The inverse function of this code is not giving me the expected result for this matrix:

```php
$x = [
    [1, 2],
    [2, 3],
    [3, 4],
    [4, 5],
];
$y = [5, 7, 9, 11];
```

The expected output should be:

```
 4.1287603545996221412  2.7612265659061182893 -3.110013079494259548
 2.7612265659061182888  3.7509082982124691088 -3.4878651358814125766
-3.1100130794942595477 -3.4878651358814125767  3.4021217846243278509
```

But I am getting this instead:

```
 7.0321782178217  4.7029702970296 -5.2970297029702
 4.7029702970296  5.0495049504949 -4.9504950495048
-5.2970297029702 -4.9504950495048  5.0495049504949
```

As you can see, the values don't match. I tried to follow the ridge math and work this out by hand, but I can't reproduce the library's result either. Am I missing something?
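One quick way to decide which of two candidate matrices is the genuine inverse is to multiply each candidate back against XᵀX + P: the true inverse must yield the identity matrix. A short NumPy sketch of that check, using an assumed `l2Penalty` of 1.0 (substitute whatever penalty the failing run actually used before plugging in either of the matrices quoted above):

```python
import numpy as np

def is_inverse(m: np.ndarray, candidate: np.ndarray, tol: float = 1e-9) -> bool:
    """A claimed inverse is easy to verify: m @ candidate must be the identity."""
    return np.allclose(m @ candidate, np.eye(m.shape[0]), atol=tol)

# Rebuild XᵀX + P for the sample data (bias column of ones prepended).
xa = np.array([[1, 1, 2], [1, 2, 3], [1, 3, 4], [1, 4, 5]], dtype=float)
l2_penalty = 1.0  # assumption: replace with the penalty used in the failing run
a = xa.T @ xa + np.diag([0.0, l2_penalty, l2_penalty])

print(is_inverse(a, np.linalg.inv(a)))  # sanity check against NumPy's own inverse
```

Running the two matrices from the issue through `is_inverse` (with the matching penalty) would show immediately whether the discrepancy is in the library's `inverse()` or in the hand calculation.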