Skip to content

Commit

Permalink
Merge remote-tracking branch 'origin/rel-3.46.0' into gh-15810
Browse files Browse the repository at this point in the history
  • Loading branch information
krasinski committed Oct 6, 2024
2 parents d130a65 + 733c496 commit 2cb3105
Show file tree
Hide file tree
Showing 80 changed files with 2,904 additions and 1,653 deletions.
54 changes: 54 additions & 0 deletions Changes.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,60 @@

## H2O

### 3.46.0.5 - 8/28/2024

Download at: <a href='http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/5/index.html'>http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/5/index.html</a>

#### Bug
- [[#16328]](https://github.com/h2oai/h2o-3/issues/16328) - Updated how ModelSelection handles categorical predictors to preserve the best categorical predictor when the best categorical level performs well relative to other predictors.
- [[#16120]](https://github.com/h2oai/h2o-3/issues/16120) - Resolved that MOJO is working for Isolation Forest and Extended Isolation forest for implemented versions.

#### New Feature
- [[#16327]](https://github.com/h2oai/h2o-3/issues/16327) - Ensured H2O-3 can load data from Snowflake using JDBC connector.

#### Docs
- [[#16215]](https://github.com/h2oai/h2o-3/issues/16215) - Updated the following user guide pages to adhere to style guide updates: Algorithms, Supported data types, Quantiles, and Early stopping.
- [[#16207]](https://github.com/h2oai/h2o-3/issues/16207) - Updated the Starting H2O user guide page to adhere to style guide updates.
- [[#15989]](https://github.com/h2oai/h2o-3/issues/15989) - Updated Python documentation for Decision Tree algorithm.

#### Security
- [[#16349]](https://github.com/h2oai/h2o-3/issues/16349) - Addressed sonatype-2024-0171 by upgrading jackson-databind to 2.17.2.
- [[#16342]](https://github.com/h2oai/h2o-3/issues/16342) - Addressed SNYK-JAVA-DNSJAVA-7547403, SNYK-JAVA-DNSJAVA-7547404, SNYK-JAVA-DNSJAVA-7547405, and CVE-2024-25638 by upgrading dnsjava to 3.6.0.

### 3.46.0.4 - 7/9/2024

Download at: <a href='http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/4/index.html'>http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/4/index.html</a>

#### Docs
- [[#16212]](https://github.com/h2oai/h2o-3/issues/16212) - Updating user guide - H2O Clients.
- [[#16214]](https://github.com/h2oai/h2o-3/issues/16214) - Updating user guide - Data Manipulation.
- [[#16213]](https://github.com/h2oai/h2o-3/issues/16213) - Updating user guide - Getting data into your H2O cluster.

#### Security
- [[#15748]](https://github.com/h2oai/h2o-3/issues/15748) - Addressed PRISMA-2023-0067 by upgrading jackson-databind.

### 3.46.0.3 - 6/11/2024

Download at: <a href='http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/3/index.html'>http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/3/index.html</a>

#### Bug Fix
- [[#16274]](https://github.com/h2oai/h2o-3/issues/16274) - Fixed plotting for H2O Explainabilty by resolving issue in the matplotlib wrapper.
- [[#16192]](https://github.com/h2oai/h2o-3/issues/16192) - Fixed `h2o.findSynonyms` failing if the `word` parameter is unknown to the Word2Vec model.
- [[#15947]](https://github.com/h2oai/h2o-3/issues/15947) - Fixed `skipped_columns` error caused by mismatch during the call to `parse_setup` when constructing an `H2OFrame`.

#### Improvement
- [[#16278]](https://github.com/h2oai/h2o-3/issues/16278) - Added flag to enable `use_multi_thread` automatically when using `as_data_frame`.

#### New Feature
- [[#16284]](https://github.com/h2oai/h2o-3/issues/16284) - Added support for Websockets to steam.jar.

#### Docs
- [[#16189]](https://github.com/h2oai/h2o-3/issues/16189) - Updating user guide - Downloading & Installing H2O.
- [[#16288]](https://github.com/h2oai/h2o-3/issues/16288) - Fixed GBM Python example in user guide.
- [[#16188]](https://github.com/h2oai/h2o-3/issues/16188) - Updated API-related changes page to adhere to style guide requirements.
- [[#16016]](https://github.com/h2oai/h2o-3/issues/16016) - Added examples to Python documentation for Uplift DRF.
- [[#15988]](https://github.com/h2oai/h2o-3/issues/15988) - Added examples to Python documentation for Isotonic Regression.

### 3.46.0.2 - 5/13/2024

Download at: <a href='http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/2/index.html'>http://h2o-release.s3.amazonaws.com/h2o/rel-3.46.0/2/index.html</a>
Expand Down
2 changes: 0 additions & 2 deletions docker/prisma/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,4 @@ RUN for dir in $DIRECTORIES; do \
chown -R 2117:2117 /$dir; \
done

RUN npm install snyk -g

CMD ["/bin/bash"]
6 changes: 4 additions & 2 deletions gradle/s3sync.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -103,9 +103,11 @@ def syncData(subdir) {

println("Going to download ${downloadList.size} files...")
downloadList.each {
def fpath = 'https://h2o-public-test-data.s3.amazonaws.com/'+it.Key
def key = it.Key.toString().replaceAll(" ", "+") // handle filenames with spaces
def fpath = 'https://h2o-public-test-data.s3.amazonaws.com/'+key
def localDestPath = localDestDir + it.Key.text().substring(trimLength).replaceAll("/", Matcher.quoteReplacement(File.separator))
try {
try {
println "Downloading " + fpath
download {
src fpath
dest localDestPath
Expand Down
5 changes: 2 additions & 3 deletions h2o-algos/src/main/java/hex/glm/ComputationState.java
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,6 @@ public final class ComputationState {
int _totalBetaLength; // actual coefficient length without taking into account active columns only
int _betaLengthPerClass;
public boolean _noReg;
public boolean _hasConstraints;
public ConstrainedGLMUtils.ConstraintGLMStates _csGLMState;

public ComputationState(Job job, GLMParameters parms, DataInfo dinfo, BetaConstraint bc, GLM.BetaInfo bi){
Expand Down Expand Up @@ -1414,7 +1413,7 @@ protected GramXY computeNewGram(DataInfo activeData, double [] beta, GLMParamete
double obj_reg = _parms._obj_reg;
if(_glmw == null) _glmw = new GLMModel.GLMWeightsFun(_parms);
GLMTask.GLMIterationTask gt = new GLMTask.GLMIterationTask(_job._key, activeData, _glmw, beta,
_activeClass, _hasConstraints).doAll(activeData._adaptedFrame);
_activeClass).doAll(activeData._adaptedFrame);
gt._gram.mul(obj_reg);
if (_parms._glmType.equals(GLMParameters.GLMType.gam)) { // add contribution from GAM smoothness factor
Integer[] activeCols=null;
Expand Down Expand Up @@ -1463,7 +1462,7 @@ protected GramGrad computeGram(double [] beta, GLMGradientInfo gradientInfo){
double obj_reg = _parms._obj_reg;
if(_glmw == null) _glmw = new GLMModel.GLMWeightsFun(_parms);
GLMTask.GLMIterationTask gt = new GLMTask.GLMIterationTask(_job._key, activeData, _glmw, beta,
_activeClass, _hasConstraints).doAll(activeData._adaptedFrame);
_activeClass).doAll(activeData._adaptedFrame);
double[][] fullGram = gt._gram.getXX(); // only extract gram matrix
mult(fullGram, obj_reg);
if (_gramEqual != null)
Expand Down
75 changes: 45 additions & 30 deletions h2o-algos/src/main/java/hex/glm/ConstrainedGLMUtils.java
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ public static class ConstraintGLMStates {
double _ckCS;
double _ckCSHalf; // = ck/2
double _epsilonkCS;
double _epsilonkCSSquare;
public double _epsilonkCSSquare;
double _etakCS;
double _etakCSSquare;
double _epsilon0;
Expand Down Expand Up @@ -137,30 +137,35 @@ public static int[] extractBetaConstraints(ComputationState state, String[] coef
List<LinearConstraints> equalityC = new ArrayList<>();
List<LinearConstraints> lessThanEqualToC = new ArrayList<>();
List<Integer> betaIndexOnOff = new ArrayList<>();
if (betaC._betaLB != null) {
int numCons = betaC._betaLB.length-1;
for (int index=0; index<numCons; index++) {
if (!Double.isInfinite(betaC._betaUB[index]) && (betaC._betaLB[index] == betaC._betaUB[index])) { // equality constraint
addBCEqualityConstraint(equalityC, betaC, coefNames, index);
betaIndexOnOff.add(1);
} else if (!Double.isInfinite(betaC._betaUB[index]) && !Double.isInfinite(betaC._betaLB[index]) &&
(betaC._betaLB[index] < betaC._betaUB[index])) { // low < beta < high, generate two lessThanEqualTo constraints
addBCGreaterThanConstraint(lessThanEqualToC, betaC, coefNames, index);
addBCLessThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
betaIndexOnOff.add(0);
} else if (Double.isInfinite(betaC._betaUB[index]) && !Double.isInfinite(betaC._betaLB[index])) { // low < beta < infinity
addBCGreaterThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
} else if (!Double.isInfinite(betaC._betaUB[index]) && Double.isInfinite(betaC._betaLB[index])) { // -infinity < beta < high
addBCLessThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
}
boolean bothEndsPresent = (betaC._betaUB != null) && (betaC._betaLB != null);
boolean lowerEndPresentOnly = (betaC._betaUB == null) && (betaC._betaLB != null);
boolean upperEndPresentOnly = (betaC._betaUB != null) && (betaC._betaLB == null);

int numCons = betaC._betaLB != null ? betaC._betaLB.length - 1 : betaC._betaUB.length - 1;
for (int index = 0; index < numCons; index++) {
if (bothEndsPresent && !Double.isInfinite(betaC._betaUB[index]) && (betaC._betaLB[index] == betaC._betaUB[index])) { // equality constraint
addBCEqualityConstraint(equalityC, betaC, coefNames, index);
betaIndexOnOff.add(1);
} else if (bothEndsPresent && !Double.isInfinite(betaC._betaUB[index]) && !Double.isInfinite(betaC._betaLB[index]) &&
(betaC._betaLB[index] < betaC._betaUB[index])) { // low < beta < high, generate two lessThanEqualTo constraints
addBCGreaterThanConstraint(lessThanEqualToC, betaC, coefNames, index);
addBCLessThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
betaIndexOnOff.add(0);
} else if ((lowerEndPresentOnly || (betaC._betaUB != null && Double.isInfinite(betaC._betaUB[index]))) &&
betaC._betaLB != null && !Double.isInfinite(betaC._betaLB[index])) { // low < beta < infinity
addBCGreaterThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
} else if ((upperEndPresentOnly || (betaC._betaLB != null && Double.isInfinite(betaC._betaLB[index]))) &&
betaC._betaUB != null && !Double.isInfinite(betaC._betaUB[index])) { // -infinity < beta < high
addBCLessThanConstraint(lessThanEqualToC, betaC, coefNames, index);
betaIndexOnOff.add(1);
}
}

state.setLinearConstraints(equalityC.toArray(new LinearConstraints[0]),
lessThanEqualToC.toArray(new LinearConstraints[0]), true);
return betaIndexOnOff.size()==0 ? null : betaIndexOnOff.stream().mapToInt(x->x).toArray();
return betaIndexOnOff.size() == 0 ? null : betaIndexOnOff.stream().mapToInt(x -> x).toArray();
}

/***
Expand Down Expand Up @@ -506,11 +511,10 @@ public static boolean extractCoeffNames(List<String> coeffList, LinearConstraint

public static List<String> foundRedundantConstraints(ComputationState state, final double[][] initConstraintMatrix) {
Matrix constMatrix = new Matrix(initConstraintMatrix);
Matrix constMatrixLessConstant = constMatrix.getMatrix(0, constMatrix.getRowDimension() -1, 1, constMatrix.getColumnDimension()-1);
Matrix constMatrixTConstMatrix = constMatrixLessConstant.times(constMatrixLessConstant.transpose());
int rank = constMatrixLessConstant.rank();
Matrix matrixSquare = constMatrix.times(constMatrix.transpose());
int rank = matrixSquare.rank();
if (rank < constMatrix.getRowDimension()) { // redundant constraints are specified
double[][] rMatVal = constMatrixTConstMatrix.qr().getR().getArray();
double[][] rMatVal = matrixSquare.qr().getR().getArray();
List<Double> diag = IntStream.range(0, rMatVal.length).mapToDouble(x->Math.abs(rMatVal[x][x])).boxed().collect(Collectors.toList());
int[] sortedIndices = IntStream.range(0, diag.size()).boxed().sorted((i, j) -> diag.get(i).compareTo(diag.get(j))).mapToInt(ele->ele).toArray();
List<Integer> duplicatedEleIndice = IntStream.range(0, diag.size()-rank).map(x -> sortedIndices[x]).boxed().collect(Collectors.toList());
Expand Down Expand Up @@ -645,6 +649,16 @@ public static void genInitialLambda(Random randObj, LinearConstraints[] constrai
}
}

public static void adjustLambda(LinearConstraints[] constraints, double[] lambda) {
int numC = constraints.length;
LinearConstraints oneC;
for (int index=0; index<numC; index++) {
oneC = constraints[index];
if (!oneC._active)
lambda[index]=0.0;
}
}


public static double[][] sumGramConstribution(ConstraintsGram[] gramConstraints, int numCoefs) {
if (gramConstraints == null)
Expand Down Expand Up @@ -714,16 +728,16 @@ public static void updateConstraintParameters(ComputationState state, double[] l
LinearConstraints[] equalConst, LinearConstraints[] lessThanConst,
GLMModel.GLMParameters parms) {
// calculate ||h(beta)|| square, ||gradient|| square
double hBetaMag = state._csGLMState._constraintMagSquare;
if (hBetaMag <= state._csGLMState._etakCSSquare) { // implement line 26 to line 29 of Algorithm 19.1
double hBetaMagSquare = state._csGLMState._constraintMagSquare;
if (hBetaMagSquare <= state._csGLMState._etakCSSquare) { // implement line 26 to line 29 of Algorithm 19.1
if (equalConst != null)
updateLambda(lambdaEqual, state._csGLMState._ckCS, equalConst);
if (lessThanConst != null)
updateLambda(lambdaLessThan, state._csGLMState._ckCS, lessThanConst);
state._csGLMState._epsilonkCS = state._csGLMState._epsilonkCS/state._csGLMState._ckCS;
state ._csGLMState._etakCS = state._csGLMState._etakCS/Math.pow(state._csGLMState._ckCS, parms._constraint_beta);
} else { // implement line 31 to 34 of Algorithm 19.1
state._csGLMState._ckCS = state._csGLMState._ckCS*parms._constraint_tau;
state._csGLMState._ckCS = state._csGLMState._ckCS*parms._constraint_tau; // tau belongs to [4,10]
state._csGLMState._ckCSHalf = state._csGLMState._ckCS*0.5;
state._csGLMState._epsilonkCS = state._csGLMState._epsilon0/state._csGLMState._ckCS;
state._csGLMState._etakCS = parms._constraint_eta0/Math.pow(state._csGLMState._ckCS, parms._constraint_alpha);
Expand Down Expand Up @@ -813,13 +827,14 @@ public static void updateConstraintValues(double[] betaCnd, List<String> coefNam
if (equalityConstraints != null) // equality constraints
Arrays.stream(equalityConstraints).forEach(constraint -> {
evalOneConstraint(constraint, betaCnd, coefNames);
constraint._active = (Math.abs(constraint._constraintsVal) > EPS2);
// constraint._active = (Math.abs(constraint._constraintsVal) > EPS2);
constraint._active = true;
});

if (lessThanEqualToConstraints != null) // less than or equal to constraints
Arrays.stream(lessThanEqualToConstraints).forEach(constraint -> {
evalOneConstraint(constraint, betaCnd, coefNames);
constraint._active = constraint._constraintsVal > 0;
constraint._active = constraint._constraintsVal >= 0;
});
}

Expand Down
Loading

0 comments on commit 2cb3105

Please sign in to comment.