- Support for key lookups returning a subset of a table's columns with fewer
  lookup restrictions via `RecordRetriever.getColumnsByKey()`
- Support for key lookups returning records from a specified offset
- Failback to a primary cluster after failing over to a secondary cluster
- Upgraded Avro library to 1.11.4
- Error message for bad URLs with auto-discovery disabled
- Potential resource leaks upon connection errors
- Issue with looking up the server version when auto-discovery is disabled
- Modified server version extractor to handle all 5 version components
- Made the full set of system properties for the active cluster available
- Modified POM for publishing to Maven Central Repository
- Upgraded Jackson core library to 2.17.1
- Downgraded Logback library to 1.3.14
- Upgraded SLF4j library to 2.0.13
- Issue with dependent JDBC `fullshaded` driver terminating with no linked Snappy library
- JavaDoc generation warnings
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- OAuth2 authentication support
- Publishing to Maven Central Repository
- Snappy error for `fullshaded` JDBC JAR
- Out-of-memory error when downloading large files
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Lowered default server connection timeout to 5 seconds
- Made server connection timeout (user-specified or default) govern connection timeouts in all cases of initially connecting to a server
- Deprecated `isKineticaRunning()` in favor of `isSystemRunning()`
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Support for unsigned long types and null values in arrays
- Concurrency issue with the use of `BulkInserter.insert(List)`
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Increased connection timeout from ~1 to 20 seconds to account for connections over high-traffic and public networks
- Upgraded Snappy library from 1.1.10.4 to 1.1.10.5
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Upgraded Apache HTTPClient5 library from 5.3 to 5.3.1
- Support for Array, JSON and Vector data
- `query()` & `execute()` methods for more easily running SQL statements
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Snappy error for `fullshaded` JDBC JAR
- Publishing to Maven Central Repository
- Several security-related dependency updates
- Out-of-memory error when downloading large files
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Lowered default server connection timeout to 5 seconds
- Made server connection timeout (user-specified or default) govern connection timeouts in all cases of initially connecting to a server
- Deprecated `isKineticaRunning()` in favor of `isSystemRunning()`
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Concurrency issue with the use of `BulkInserter.insert(List)`
- Increased connection timeout from ~1 to 20 seconds to account for connections over high-traffic and public networks
- Upgraded Snappy library from 1.1.10.4 to 1.1.10.5
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Upgraded Apache HTTPClient5 library from 5.3 to 5.3.1
- Aligned read timeouts in cloud & non-cloud environments
- Upgraded Logback library from 1.2.10 to 1.2.13
- `GPUdbSqlIterator` class for easily looping over the records of a SQL result set (see the sketch after this list)
- Upgraded Avro library from 1.11.1 to 1.11.3
- Upgraded Snappy library from 1.1.10.1 to 1.1.10.4
- Upgraded JSON library from 20230227 to 20231013
- Auto-disabled Snappy compression if not available on the host system
- Deadlock in multi-head ingest
- Bug in thread-safety during multi-head ingest HA failover
- Bug in multi-head ingest HA failover
- Bug in head node JSON ingest
- Bug in file download API
- `getRecordsJson()` method & overloads for direct egress of data as JSON strings
- `insertRecordsFromJson()` method overloads
- `BulkInserter` constructor overloads
- Error handling for JSON support in `BulkInserter`
- Support for large file downloads
- Upgraded Snappy library from 1.1.8.4 to 1.1.10.1
- Support for ULONG column type
- Support for UUID column type & primary key
- Upgraded JSON library from 20220924 to 20230227
- Support for routing all disabled multi-head operations through the head node
- Multi-head ingestion support for JSON-formatted data
- Support for HA failover when user-specified connection URLs don't match the server-known URLs; multi-head operations will still be disabled
- Removed N+1 features & references
- Examples of secure/unsecure connections; improved SSL failure error message
- Fixed handling of timeout units
- Fixed application of SSL handshake timeout to read requests
- Improved error logging format
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
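A minimal sketch of looping over a SQL result set with `GPUdbSqlIterator`, referenced above. The connection URL and table are placeholders, and the constructor/iteration shape (a connection plus a SQL string, iterable over `Record`) is an assumption to verify against the API reference.

```java
import com.gpudb.GPUdb;
import com.gpudb.GPUdbSqlIterator;
import com.gpudb.Record;

public class SqlIterationSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // Assumed shape: construct from a connection and a SQL string,
        // then iterate the resulting Record objects.
        GPUdbSqlIterator<Record> rows =
                new GPUdbSqlIterator<>(gpudb, "SELECT id, name FROM example.events LIMIT 10");

        for (Record row : rows) {
            System.out.println(row.get("id") + " -> " + row.get("name"));
        }
    }
}
```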
- JSON support in `BulkInserter`
- Upgraded Apache HTTP client library from 4.5.13 to 5.2.1
- Fixed OOM errors during large file uploads
- Improved handling of non-retryable `BulkInserter` errors
- Improved usability of file handler example
- Unneeded class references in logging output
- Retry handler for `org.apache.http.conn.ConnectionTimeoutException` errors
- Fixed cluster hostname matching check
- Support for boolean type
- `insertRecordsFromJson()` method for direct ingest of data as JSON strings
- Improved reporting of SSL certificate errors
- Updated Jackson databind version to 2.14.1
- Improved reporting of permissions errors during `BulkInserter` insert
- Table permissions check on `BulkInserter` instantiation
- Updated Jackson databind version to 2.14.0
- Stopped/suspended clusters are reported as such
- `getWarnings()` method on `BulkInserter`
- `BulkInserter.getErrors()` now only returns errors and not warnings -- use `getWarnings()` for warnings (see the sketch after this list)
- Improved performance of `BulkInserter` & `RecordRetriever` key generation
- Retry handler for all requests
- Logging during timed flush executions
- Reduced idle connection check interval to 100ms
- Improved reporting of `BulkInserter` errors
- Reduced logging during setting of timed flush
- Simplified dependency inclusion
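A minimal ingest sketch illustrating the `getErrors()`/`getWarnings()` split noted above. The table name and columns are placeholders; the 5-argument `BulkInserter` constructor shape and the assumption that both methods return collections should be verified against the API reference.

```java
import java.util.Arrays;

import com.gpudb.BulkInserter;
import com.gpudb.GPUdb;
import com.gpudb.GenericRecord;
import com.gpudb.Type;

public class BulkInsertSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // Derive the record Type from an existing table (table name is a placeholder)
        Type type = Type.fromTable(gpudb, "example.events");

        // Queue up to 1000 records per batch; null uses default insert options
        BulkInserter<GenericRecord> inserter =
                new BulkInserter<>(gpudb, "example.events", type, 1000, null);

        GenericRecord record = new GenericRecord(type);
        record.put("id", 1);
        record.put("name", "alpha");
        inserter.insert(Arrays.asList(record));

        // Push any queued records to the database
        inserter.flush();

        // As of this release, getErrors() reports only errors;
        // warnings are available separately from getWarnings()
        System.out.println("errors:   " + inserter.getErrors().size());
        System.out.println("warnings: " + inserter.getWarnings().size());
    }
}
```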
- Support for boolean type
- `BulkInserter` will do a default of 3 local retries for any non-data failure before attempting to fail over to another cluster
- Timed flush mechanism can be set or reset after `BulkInserter` construction
- Fully relocated dependencies to avoid library conflicts
- Java bytecode version mismatch when compiling on Java9+ and running on Java8
- Added logging during timed flush executions
- Reduced logging during setting of timed flush
- `BulkInserter` will do a default of 3 local retries for any non-data failure before attempting to fail over to another cluster
- Timed flush mechanism can be set or reset after `BulkInserter` construction
- No failover will be attempted if only one cluster is found
- Targeted Java 8 runtime
- Updated Avro version to 1.11.1
- Method to query a `BulkInserter`'s multi-head status
- Switched ordering of flush & thread shutdown sequence
- Disabled client intra-cluster failover if failover is disabled on the server
- Improved error reporting of environment-related issues during inserts
- Capability to pass in self-signed certificates & passwords as options
- Updated dependencies to more secure versions
- Fixed unhandled exception when an Avro encoding error occurs
- Fixed error where a type's column order would not match a table created from it
- Removed client-side primary key check, to improve performance and make returned errors more consistently delivered
- Logging changed from using Log4j 1.x to SLF4J with a default Logback stdout logger
  - The breaking change was due to security issues with both Log4j 1.x and 2.x; SLF4J
    was chosen since it is the successor to Log4j and is more flexible
  - See the README.md for directions on using an SLF4J logger and configuring an
    alternative backend to Logback (a minimal configuration sketch follows this list)
  - Note that it is possible to continue to use Log4j 1.x through SLF4J, though this
    is not recommended
- Removed `GPUdb.Options.getLoggingLevel()` & `GPUdb.Options.setLoggingLevel()`;
  instead, set the `com.gpudb` log level with a user-supplied `logback.xml` resource
  or with the newly added static `GPUdbLogger.setLoggingLevel()`, which can be called
  at any time
- Option for automatically flushing the BulkInserter at a given delay
- Added support for retrieving errant records from `BulkInserter` ingest
- Added support for automatically flushing the `BulkInserter` and cleaning up service
  objects upon `BulkInserter` shutdown
- Improved SSL cert bypass
- Minor KiFS API usage improvements
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
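For context on the logging change above: with `slf4j-api` and a Logback backend on the classpath, application code logs through the same SLF4J facade the `com.gpudb` classes now use. This is a minimal sketch; the `com.gpudb` log level itself is set via a user-supplied `logback.xml` or the static `GPUdbLogger.setLoggingLevel()` mentioned above (see the README for its exact arguments).

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingSketch {
    // Obtain an SLF4J logger; Logback (the API's default backend) will handle output
    private static final Logger LOG = LoggerFactory.getLogger(LoggingSketch.class);

    public static void main(String[] args) {
        // Application-side logging flows through the same facade as com.gpudb;
        // adjust the com.gpudb logger level in logback.xml (or via
        // GPUdbLogger.setLoggingLevel() -- see the README for its argument form).
        LOG.info("Connecting to Kinetica...");
    }
}
```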
- Introduced a new API for facilitating uploading and downloading of files to and
  from KiFS. The class encapsulating the API is `GPUdbFileHandler`. A complete
  example has been given in the `gpudb-api-example` project in the class
  `GPUdbFileHandlerExample` (see also the sketch after this list).
- Introduced the capability to upload Parquet files.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
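A rough sketch of the KiFS file API described above. The package name, constructor, and the `upload()`/`download()` call shapes shown here are hypothetical simplifications; `GPUdbFileHandlerExample` in the `gpudb-api-example` project is the authoritative usage.

```java
import com.gpudb.GPUdb;
import com.gpudb.filesystem.GPUdbFileHandler;

public class KifsSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // Hypothetical call shapes: local path, KiFS directory, and elided options --
        // see GPUdbFileHandlerExample for the real signatures.
        GPUdbFileHandler fileHandler = new GPUdbFileHandler(gpudb);
        fileHandler.upload("/tmp/sales.parquet", "my-kifs-dir", null);
        fileHandler.download("my-kifs-dir/sales.parquet", "/tmp/restored", null);
    }
}
```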
- Added option in `GPUdbBase` class to pass in a custom `SSLConnectionSocketFactory`
  to facilitate passing in a user-supplied truststore file along with the password
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Class `GPUdb.GPUdbVersion` that represents the Kinetica server's version (the one
  the API is connected to).
- Method `GPUdb.getServerVersion()` (see the sketch after this list).
- Added support for multi-head key lookup for replicated tables.
- Updated the following dependency package versions to eliminate known
security risks and other issues:
- org.apache.avro 1.8.1 -> 1.10.1
- commons-codec: 1.10 -> 1.13
- httpclient: 4.5.11 -> 4.5.13
- maven-shade-plugin: 2.1 -> 3.2.4
- Due to the dependency updates, applications using this API may start getting a
  warning log from SLF4J saying:
  `Failed to load class org.slf4j.impl.StaticLoggerBinder`
  This is an innocuous warning. Please see the README file for more details.
- An issue with `BulkInserter` flush when retryCount > 0
- Converted the `BulkInserter` flushing mechanism from single-threaded to parallel-threaded.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
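A minimal sketch of the version check added above; the URL is a placeholder and the printout relies on `GPUdbVersion`'s string form.

```java
import com.gpudb.GPUdb;

public class ServerVersionSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // GPUdb.GPUdbVersion describes the server this connection points at
        GPUdb.GPUdbVersion version = gpudb.getServerVersion();
        System.out.println("Connected to Kinetica server version " + version);
    }
}
```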
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Another `ping()` method that takes in a timeout as a second parameter.
- `GPUdb` constructor behavior such that if the server at the user-given IP address
  responds with public IPs that the client application environment cannot access,
  the `GPUdb` object will be created with the user-given IP addresses, completely
  disregarding the public addresses given by the server. The side effect of this is
  that the API's failover mechanism must be disabled; this is logged as a warning.
- Support for intra-cluster, also known as N+1, failover.
- Support for logging.
- `GPUdb.Options` options (solely handled by getters and setters; a configuration sketch follows this list):
- clusterReconnectCount -- The number of times the API tries to reconnect to the same cluster (when a failover event has been triggered), before actually failing over to any available backup cluster. Does not apply when only a single cluster is available. Default is 1.
- disableFailover -- Indicates whether to disable failover upon failures (both high availability--or inter-cluster--failover and N+1--or intra-cluster--failover). Default false.
- disableAutoDiscovery -- Indicates whether to disable automatic discovery of backup clusters or worker rank URLs. If set to true, then the GPUdb object will not connect to the database at initialization time, and will only work with the URLs given. Default is false.
- haFailoverOrder -- The order of choosing backup clusters in the event of high availability failover. Default is GPUdb.HAFailoverOrder.RANDOM.
- hostnameRegex -- A regular expression to apply to all automatically discovered URLs for the Kinetica servers. No default.
- initialConnectionAttemptTimeout -- The timeout used when trying to establish a connection to the database at GPUdb initialization. The value is given in milliseconds. The default is 0, which prevents any retry and stores the user given URLs as is.
- intraClusterFailoverRetryCount -- The number of times the API tries to recover during an intra-cluster (N+1) failover scenario. This positive integer determines how many times all known ranks will be queried before giving up (in the first of two stages of the recovery process). The default is 3.
- intraClusterFailoverTimeout -- The amount of time the API tries to recover during an intra-cluster (N+1) failover scenario. Given in milliseconds. Default is 0 (infinite). This time interval spans both stages of the N+1 failover recovery process.
- loggingLevel -- The logging level to use for the API. By default, logging is turned off. If logging properties are set up by the user (via log4j.properties etc.), then that will be honored only if the default logging level is used. Otherwise, the programmatically set level will be used.
- Added class `GPUdb.ClusterAddressInfo`, which contains information about a given
  Kinetica cluster, including rank URLs and hostnames.
- `GPUdb` methods: `getHARingInfo()`, `getHARingSize()`, `getPrimaryHostName()`
- Changed the `BulkInserter` default retry count to 1 (from 0).
- Deprecated the `GPUdb.setHostManagerPort(int)` method. The user must set the host
  manager at `GPUdb` initialization; changing the host manager port will not be
  permitted post-initialization. The method is now a no-op (until removed in 7.2 or
  a later version).
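A configuration sketch for the failover-related options listed above. Only the option names and defaults come from this changelog; the bean-style setter names are assumptions, and the values shown are placeholders.

```java
import com.gpudb.GPUdb;

public class FailoverOptionsSketch {
    public static void main(String[] args) throws Exception {
        GPUdb.Options options = new GPUdb.Options();

        // Setter names below follow the usual bean pattern for the documented
        // option names and are assumptions -- verify against the API reference.
        options.setClusterReconnectCount(2);               // retry the same cluster twice before failing over
        options.setHaFailoverOrder(GPUdb.HAFailoverOrder.RANDOM);
        options.setHostnameRegex("internal-.*");           // filter auto-discovered URLs
        options.setInitialConnectionAttemptTimeout(30000); // 30s to connect at startup

        GPUdb gpudb = new GPUdb("http://localhost:9191", options);
        System.out.println("Connected to " + gpudb.getURL());
    }
}
```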
- Upgraded Avro library from 1.11.1 to 1.11.3
- Upgraded Snappy library from 1.1.1.3 to 1.1.10.4
- Upgraded Logback library from 1.2.10 to 1.2.13
- Improved reporting of permissions errors during `BulkInserter` insert
- Updated Avro version to 1.11.1
- Improved error reporting of environment-related issues during inserts
- Logger initialization issue
- Issue with thread over-accumulation when inserting data
- Updated the following dependency package version to eliminate known
security risks and other issues:
- org.apache.avro 1.10.1 -> 1.11.0
- Class `GPUdb.GPUdbVersion` that represents the Kinetica server's version (the one
  the API is connected to).
- Method `GPUdb.getServerVersion()`.
- Added support for multi-head key lookup for replicated tables.
- Updated the following dependency package versions to eliminate known
security risks and other issues:
- org.apache.avro 1.8.1 -> 1.10.1
- commons-codec: 1.10 -> 1.13
- httpclient: 4.5.11 -> 4.5.13
- maven-shade-plugin: 2.1 -> 3.2.4
- Due to the dependency updates, applications using this API may start getting a
  warning log from SLF4J saying:
  `Failed to load class org.slf4j.impl.StaticLoggerBinder`
  This is an innocuous warning. Please see the README file for more details.
- An issue with `BulkInserter` flush when retryCount > 0
- Converted the `BulkInserter` flushing mechanism from single-threaded to parallel-threaded.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Class `GPUdb.GPUdbVersion` that represents the Kinetica server's version (the one
  the API is connected to).
- Method `GPUdb.getServerVersion()`.
- Added support for multi-head key lookup for replicated tables.
- An issue with `BulkInserter` flush when retryCount > 0
- Converted the `BulkInserter` flushing mechanism from single-threaded to parallel-threaded.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- `GPUdb.Options` member `connectionInactivityValidationTimeout`, which controls the
  period of inactivity after which a connection would be checked for staleness before
  being leased to a client. The value is given in milliseconds. The default value is
  200 ms. Note that this is for fine-tuning the connection manager and should be used
  with a deep understanding of how connections are managed. The default value would
  likely suffice for most users; we're just letting the user have the control, if they
  want it. (A connection-tuning sketch follows this list.)
- Changed the default value of `GPUdb.Options` member `serverConnectionTimeout` to
  `10000` (equivalent to 10 seconds).
- Changed the default value of `GPUdb.Options` member `maxConnectionsPerHost` to `10`.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
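A connection-tuning sketch for the members described above. The member names and defaults come from this changelog; the setter names are assumptions (each member is documented as having a getter and setter), and the values are placeholders.

```java
import com.gpudb.GPUdb;

public class ConnectionTuningSketch {
    public static void main(String[] args) throws Exception {
        GPUdb.Options options = new GPUdb.Options();

        // Assumed bean-style setters for the documented members -- verify the names
        // against the API reference before relying on them.
        options.setServerConnectionTimeout(10000);             // 10s to establish a connection
        options.setConnectionInactivityValidationTimeout(200); // validate idle connections after 200ms
        options.setMaxConnectionsPerHost(10);                  // per-host connection pool size

        GPUdb gpudb = new GPUdb("http://localhost:9191", options);
        System.out.println("Connected with tuned connection settings to " + gpudb.getURL());
    }
}
```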
- `GPUdb.Options` member `serverConnectionTimeout`, which controls the server
  connection timeout (not the request timeout). The value is given in milliseconds.
  The default value is 3 seconds (3000).
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Socket connection timeout -- now checks IP/hostname availability for 1 second
  (instead of applying the user-given timeout -- default infinite -- which resulted
  in a few minutes of hanging for bad addresses).
- Set the host manager endpoint retry count to 3 (not configurable) so that the API
  does not go into an infinite loop for a bad user-given host manager port.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Options for configuring the maximum allowed number of connections:
  - `GPUdb.Options.maxTotalConnections` (across all hosts; default 40)
  - `GPUdb.Options.maxConnectionsPerHost` (for any given host; default 40)
- Improved connection throughput over SSL.
- Check CHANGELOG-FUNCTIONS.md for endpoint related changes.
- Support for high-availability failover when the database is in the offline mode.
- `GPUdb` constructor behavior -- if a single URL is used and no primary URL is
  specified via the options, the given single URL will be treated as the primary URL.
- Multi-head insertion high-availability failover issue when retryCount > 0
- Multi-head I/O high-availability failover thread-safety issues
- Multi-head I/O high-availability failover issue when a worker rank dies.
- An option to `GPUdb.Options` for bypassing SSL certificate verification for HTTPS
  connections. Obtained and set by the `Options.getBypassSslCertCheck()` and
  `Options.setBypassSslCertCheck(boolean)` methods.
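A short sketch of the certificate-check bypass above, e.g. for a development server with a self-signed certificate; the URL is a placeholder, and both option methods are the ones named in the entry.

```java
import com.gpudb.GPUdb;

public class SelfSignedConnectionSketch {
    public static void main(String[] args) throws Exception {
        GPUdb.Options options = new GPUdb.Options();

        // Skip SSL certificate verification (development/self-signed servers only)
        options.setBypassSslCertCheck(true);
        System.out.println("Bypassing cert check: " + options.getBypassSslCertCheck());

        GPUdb gpudb = new GPUdb("https://kinetica.example.com:8082/gpudb-0", options);
        System.out.println("Connected to " + gpudb.getURL());
    }
}
```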
- Support for adding and removing custom headers to the `GPUdb` object. See methods:
  - `GPUdb.addHttpHeader(String, String)`
  - `GPUdb.removeHttpHeader(String)`
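A short sketch of the header methods above; the header name and value are placeholders.

```java
import com.gpudb.GPUdb;

public class CustomHeaderSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // Attach a custom header to subsequent requests made through this GPUdb object
        gpudb.addHttpHeader("X-Request-Source", "reporting-job");

        // ... issue requests that should carry the header ...

        // Remove it when no longer needed
        gpudb.removeHttpHeader("X-Request-Source");
    }
}
```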
- Support for new column property `ulong` to multi-head I/O. Compatible with Kinetica
  Server version 7.0.7.0 and later only.
- A stack overflow bug in an edge case of high availability failover for multi-head ingestion.
- Kinetica 7.0.7.0 and later
- Added support for high availability failover when the system is limited (in addition to connection problems). Compatible with Kinetica Server version 7.0.6.2 and later only.
- Kinetica 7.0.6.2 and later
- Support for passing `/get/records` options to `RecordRetriever`; can be set via the
  constructors and also be set by the setter method.
- Support for overriding the high availability synchronicity mode for endpoints; set
  the mode (enum `HASynchronicityMode`) with the setter method `setHASyncMode()`:
  - `DEFAULT`
  - `SYNCHRONOUS`
  - `ASYNCRHONOUS`
- Enumerations, `Type.Column.ColumnType` and `Type.Column.ColumnBaseType`, to indicate
  a column's type. Use getters `Type.Column.getColumnType()` and
  `Type.Column.getColumnBaseType()` to obtain the appropriate enumeration. This is
  more efficient than checking for strings in the column's property list or checking
  for Java class equivalency.
- Error message format when endpoint submission fails altogether (whether no connection can be made or if the database returns some error).
- A `putDateTime` method on `GenericRecord` that parses string values in a variety of
  date, time, and datetime formats and converts them to the appropriate Kinetica
  format for the column's type. Of the acceptable formats, the date component can
  follow any of the YMD, MDY, or DMY patterns with '-', '.', or '/' as the separator.
  The time component (optional for both date and datetime, but required for time)
  must have hours and minutes, and can optionally have seconds, a fraction of a second
  (up to six digits), and some form of a timezone identifier.
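A small sketch of the `putDateTime` parsing described above. The table and column names are placeholders, and the `(column name, string value)` parameter shape is an assumption to verify against the API reference.

```java
import com.gpudb.GPUdb;
import com.gpudb.GenericRecord;
import com.gpudb.Type;

public class DateTimeParsingSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // "event_ts" is assumed to be a datetime column of this placeholder table
        Type type = Type.fromTable(gpudb, "example.events");
        GenericRecord record = new GenericRecord(type);

        // Several accepted layouts convert to the column's Kinetica format
        record.putDateTime("event_ts", "2021-03-14 09:26:53"); // YMD with '-' separators
        record.putDateTime("event_ts", "03/14/2021 09:26");    // MDY with '/' separators

        System.out.println(record.get("event_ts"));
    }
}
```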
- Minor documentation and some options for some endpoints
- Parameters for `/visualize/isoschrone`
- Support for high availability (HA) to multi-head ingestion and retrieval
- Error messages to include the original error message when Kinetica is unavailable and other available HA ring clusters have been tried (and failed).
- Added support for selecting a primary host for the `GPUdb` class
- Added missing types for `Type.fromDynamicSchema()`:
  - `datetime`
  - `geometry` (mapped to `wkt`)
- Added method `hasProperty(String)` to `Type.Column`; provides a convenient way to
  check whether a given column property applies to the given column.
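A short sketch combining `hasProperty(String)` above with the `ColumnType` getters noted elsewhere in this changelog; the table name is a placeholder.

```java
import com.gpudb.ColumnProperty;
import com.gpudb.GPUdb;
import com.gpudb.Type;

public class ColumnInspectionSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");

        // Pull the type of an existing table (placeholder name)
        Type type = Type.fromTable(gpudb, "example.events");

        for (Type.Column column : type.getColumns()) {
            System.out.printf("%-20s %-12s primary-key=%b%n",
                    column.getName(),
                    column.getColumnType(),                        // enum instead of string checks
                    column.hasProperty(ColumnProperty.PRIMARY_KEY));
        }
    }
}
```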
- Added support for comma-separated URLs for the `GPUdb` constructor that takes a string.
- Added a new column property: `INIT_WITH_NOW`
- Added support for high availability (HA) failover logic to the `GPUdb` class
- Added support for cluster reconfiguration to the multi-head I/O operations
- An option to `GPUdb.Options` for bypassing SSL certificate verification for HTTPS
  connections. Obtained and set by the `Options.getBypassSslCertCheck()` and
  `Options.setBypassSslCertCheck(boolean)` methods.
- Support for overriding the high availability synchronicity mode for endpoints; set
  the mode (enum `HASynchronicityMode`) with the setter method `setHASyncMode()`:
  - `DEFAULT`
  - `SYNCHRONOUS`
  - `ASYNCRHONOUS`
- Enumerations, `Type.Column.ColumnType` and `Type.Column.ColumnBaseType`, to indicate
  a column's type. Use getters `Type.Column.getColumnType()` and
  `Type.Column.getColumnBaseType()` to obtain the appropriate enumeration. This is
  more efficient than checking for strings in the column's property list or checking
  for Java class equivalency.
- A `putDateTime` method on `GenericRecord` that parses string values in a variety of
  date, time, and datetime formats and converts them to the appropriate Kinetica
  format for the column's type. Of the acceptable formats, the date component can
  follow any of the YMD, MDY, or DMY patterns with '-', '.', or '/' as the separator.
  The time component (optional for both date and datetime, but required for time)
  must have hours and minutes, and can optionally have seconds, a fraction of a second
  (up to six digits), and some form of a timezone identifier.
- Added avro shading to the package
- Added missing types for `Type.fromDynamicSchema()`:
  - `datetime`
  - `geometry` (mapped to `wkt`)
- Added method `hasProperty(String)` to `Type.Column`; provides a convenient way to
  check whether a given column property applies to the given column.
- New `RecordRetriever` class to support multi-head record lookup by shard key (see
  the sketch after this list)
- `BulkInserter.WorkerList` class deprecated in favor of top-level `WorkerList` class
  used by both `BulkInserter` and `RecordRetriever`
- Added support for host manager endpoints
- Added member `dataType` to the response protocol classes that return a dynamically
  generated table. Currently, that includes:
  - `AggregateGroupByResponse`
  - `AggregateUniqueResponse`
  - `AggregateUnpivotResponse`
  - `GetRecordsByColumnResponse`
- Improved request submission logic to be faster and use less memory
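A combined sketch of the multi-head pieces above: one top-level `WorkerList` shared by `BulkInserter` and `RecordRetriever`. The table, columns, and key value are placeholders, and the constructor and `getByKey()` shapes are assumptions to verify against the API reference.

```java
import java.util.Arrays;

import com.gpudb.BulkInserter;
import com.gpudb.GPUdb;
import com.gpudb.GenericRecord;
import com.gpudb.RecordRetriever;
import com.gpudb.Type;
import com.gpudb.WorkerList;

public class MultiHeadSketch {
    public static void main(String[] args) throws Exception {
        GPUdb gpudb = new GPUdb("http://localhost:9191");
        Type type = Type.fromTable(gpudb, "example.orders");

        // One worker list, discovered from the cluster, shared by ingest and lookup
        WorkerList workers = new WorkerList(gpudb);

        // Multi-head ingest (constructor shape assumed: options map, then workers)
        BulkInserter<GenericRecord> inserter =
                new BulkInserter<>(gpudb, "example.orders", type, 1000, null, workers);

        GenericRecord record = new GenericRecord(type);
        record.put("order_id", 1);
        record.put("customer_id", 42);
        inserter.insert(record);
        inserter.flush();

        // Multi-head lookup by shard key (constructor and getByKey() shapes assumed)
        RecordRetriever<GenericRecord> retriever =
                new RecordRetriever<>(gpudb, "example.orders", type, workers);
        System.out.println(retriever.getByKey(Arrays.<Object>asList(42), null).getData());
    }
}
```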
- Version release
- Version release
- Record objects now support complex column names (expressions, multipart join names, etc.)
- Record objects now support access via a Map interface via `getDataMap()`
- Can now pass arbitrary additional HTTP headers to GPUdb
- Added nullable column support
- Updated documentation generation
- Refactored generation of the APIs
- Added an example package