Prototype for storing single-cell data #1020

Draft
wants to merge 298 commits into base: development

Commits
454128e
Make environment profiles mutually exclusive
arteymix Feb 29, 2024
c81d8ef
cli: Make batch tasks simple Runnable
arteymix Apr 8, 2024
b2085c7
cli: Fix parsing and declaration of numerical parameters
arteymix Mar 26, 2024
79c763e
cli: Allow strings to be passed as verbosity levels
arteymix Apr 9, 2024
8d7ae53
cli: When a single batch task is submitted, run it in the main thread
arteymix Apr 10, 2024
f66034c
cli: Add long options for batch format and batch output file
arteymix Apr 11, 2024
8a1ea4f
Clearly identify threads used by various thread pools
arteymix Apr 12, 2024
bd09fcc
Add various strategies to discovering single-cell data from GEO
arteymix Feb 29, 2024
d346a83
Warn for incomplete 10X MEX submissions
arteymix Apr 12, 2024
eb05eb1
Mark HibernateConfigTest as slow
arteymix Apr 12, 2024
58ccd51
Handle AnnData and Seurat Disk supplementary with the .h5 extension
arteymix Apr 12, 2024
b7a7b21
Generic metadata reader
arteymix Apr 15, 2024
379b066
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 12, 2024
684c53a
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 12, 2024
7b57de0
Fix tests and address some merging issues
arteymix Jun 12, 2024
256b1dc
Update completion scripts
arteymix Jun 12, 2024
ae8d609
Remove single-cell vectors in batch
arteymix Jun 13, 2024
73320a8
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 26, 2024
2501453
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 26, 2024
4d9cd5c
Move java.library.path outside jvmOptions
arteymix Jun 26, 2024
8ac010f
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 26, 2024
0118739
Merge remote-tracking branch 'origin/development' into feature-single…
arteymix Jun 27, 2024
7ba2976
Improve logic for detecting bulk RNA-Seq samples
arteymix Jul 3, 2024
4ae3744
Update pavlab-starter-parent to 1.2.14
arteymix Jul 4, 2024
e64cf2b
Make Python executable location configurable and check if necessary p…
arteymix Jul 4, 2024
177eef2
Merge branch 'development' into HEAD
arteymix Aug 7, 2024
3bbf76c
Allow single cell detector to discover additional supplementary mater…
arteymix Aug 17, 2024
a6fe1ae
Ignore and warn supplementary file with NONE value
arteymix Aug 17, 2024
a768f54
Add a test for NONE in supplementary materials
arteymix Aug 17, 2024
0622e22
Enumerate filenames when multiple AnnData or SeuratDisk formats are f…
arteymix Aug 17, 2024
c34e185
Use a CSVPrinter for generating the summary output file
arteymix Aug 17, 2024
f17ee7d
Don't detect additional supplementary material in non-single cell sam…
arteymix Aug 17, 2024
f82781d
Improve retry strategy and skip TAR with large and unwanted entries
arteymix Aug 18, 2024
abf5d4d
Include full URL in FTP error messages
arteymix Aug 19, 2024
ce308dc
Use awaitTermination() instead of sleep() when parsing GEO SOFT files
arteymix Aug 19, 2024
7398c1b
Add a null-check for usages of FTPClient.mlistFile()
arteymix Aug 19, 2024
a77226b
Add a value to the --retry option
arteymix Aug 19, 2024
da4c4d0
Filter _RAW.tar before looking its content up
arteymix Aug 19, 2024
23e73d4
Allow for contextualized supplementary materials
arteymix Aug 20, 2024
d518e6f
Indicate progress when downloading single-cell files
arteymix Aug 20, 2024
06ef9b7
cli: Use a shutdown hook for closing the application context
arteymix Aug 20, 2024
8226b4f
Fix progress recording when reading a single byte
arteymix Aug 22, 2024
cab07bf
Improve FTP client factory and ensure that the client is always destro…
arteymix Aug 22, 2024
7d9f85f
Inline H5A_iterate_t and H5L_iterate_opdata_t to prevent bytecode sca…
arteymix Aug 22, 2024
329ec6c
Revert pretty time to 5.0.6
arteymix Aug 7, 2024
f2cc034
Use H5AreadVL instead of H5Aread_VLStrings for compatibility with HDF…
arteymix Aug 22, 2024
56337ec
Use system-wide installation of HDF5 by default
arteymix Aug 22, 2024
151ecc6
Set the number of FTP connection to match the number of fetch threads
arteymix Aug 22, 2024
80daed1
Add long options for downloading single-cell data
arteymix Aug 22, 2024
1eb753e
Avoid deadlock when checking the size of a FTP remote file
arteymix Aug 22, 2024
e02ecf3
Include supplementary files for unsupported datasets
arteymix Aug 26, 2024
adaf8e0
Numerous improvements
arteymix Aug 28, 2024
ca8fd4f
Add a 100% progress report when hitting EOF
arteymix Aug 28, 2024
03603e0
Add useful comment about HDF5 in pom.xml
arteymix Aug 28, 2024
f6acab6
Make sure that additional files and data types are retrieved for fail…
arteymix Aug 28, 2024
3866b17
Externalize configuration and authentication from FTPClientFactory
arteymix Aug 23, 2024
967c03f
Further improvements for downloading single-cell data
arteymix Aug 29, 2024
5a377b5
Update command line completion scripts
arteymix Aug 30, 2024
b63035e
WIP
arteymix Jun 28, 2024
4010cb0
Check userinfo before the FTPClient is retrieved
arteymix Aug 30, 2024
75450cf
Fix regression in FactorType.toString()
arteymix Aug 30, 2024
6c997fd
Make SimpleRetry reusable by making 'what' a parameter of execute()
arteymix Sep 1, 2024
bc33eb0
Fix selection of subset of AnnData files
arteymix Sep 4, 2024
092324a
Add missing Loom detection in isSingleCell()
arteymix Sep 4, 2024
a44d9db
Numerous improvements for loading AnnData formats
arteymix Sep 5, 2024
fbb2d2d
Add a package for supporting AnnData formats and simplifying the load…
arteymix Sep 5, 2024
12a34b0
Fix javadoc error in GeoSingleCellDetector
arteymix Sep 9, 2024
37f68fc
Add a CLI to generate a database migration update
arteymix Sep 13, 2024
f29fdf8
Add a migration script for the single-cell data support
arteymix Sep 16, 2024
1dac9a3
Cleanup index naming scheme
arteymix Sep 16, 2024
e9a74ab
Add CLIs for loading single-cell data
arteymix Sep 9, 2024
ecee349
Merge branch 'feature-single-cell-biomaterial-hierarchy' into feature…
arteymix Sep 16, 2024
0fc9590
Update the 1.32.0 migration and improve fk constraint names
arteymix Sep 16, 2024
4d6706f
Add a flag to generate a creation script to generateDatabaseUpdate CLI
arteymix Sep 16, 2024
31b6831
Fix syntax error and unused javadoc tags
arteymix Sep 16, 2024
fb9abfc
Fix various Javadoc warnings
arteymix Sep 16, 2024
22792a3
fixup! Fix syntax error and unused javadoc tags
arteymix Sep 16, 2024
42ceb04
Fix non-unique fk and index name for scd-bad relation
arteymix Sep 17, 2024
97a4f00
Merge branch 'development' into feature-single-cell
arteymix Sep 18, 2024
d852f92
Add a CI option for force deployment on the development server
arteymix Sep 18, 2024
6534840
Use false as default for IS_SINGLE_CELL_PREFERRED
arteymix Sep 18, 2024
ef2598c
Add SparseArrayList and SparseRangeArrayList
arteymix Sep 18, 2024
a57a42e
Remove unused index on CORRECTED_P_VALUE_BIN
arteymix Sep 18, 2024
1710fe5
Add generic cell-level characteristics for #1218
arteymix Sep 25, 2024
86aa0fe
Fix SingleCellDimensionTest
arteymix Sep 25, 2024
a65a47e
Add a method for retrieving cell-level characteristics for a given ca…
arteymix Sep 25, 2024
ec6093d
Ignore cell types and characteristics validation if not initialized
arteymix Sep 25, 2024
0bd34ce
Merge branch 'development' into feature-single-cell
arteymix Sep 27, 2024
62806c7
Merge branch 'development' into feature-single-cell
arteymix Sep 27, 2024
d50a696
Introduce BaseTest to set the test profile
arteymix Sep 30, 2024
2e8f32f
Make elementClass private and logging class-dependent in AbstractDao
arteymix Sep 30, 2024
02ddeec
Use ByteArrayType to map bytes to doubles automatically
arteymix Oct 1, 2024
2778aea
Replace all usages of ByteArrayConverter with ByteArrayUtils
arteymix Oct 1, 2024
b1f5b84
Use baseCode 1.1.24-SNAPSHOT
arteymix Oct 1, 2024
f17fd9f
Merge branch 'development' into HEAD
arteymix Oct 1, 2024
28866ef
Fix binCounts type in AnalysisResultSetsWebServiceTest
arteymix Oct 1, 2024
37345df
Merge branch 'development' into HEAD
arteymix Oct 2, 2024
eb6369b
Fix duplicated BIO_ASSAYS_FKC index
arteymix Oct 3, 2024
b32f928
Add a helpful warning message when credentials environment variables …
arteymix Oct 2, 2024
6e79f58
Few more improvements for EE-manipulating CLIs
arteymix Oct 2, 2024
bd5b12d
cli: Filter stacktraces printed in the console
arteymix Oct 2, 2024
db79e0b
Make sure that the ED is created when modifying it in EE service
arteymix Oct 2, 2024
4b36d9b
Relocate EntityUtils in IdentifiableUtils and remove/inline unused me…
arteymix Oct 2, 2024
bd06431
Extract auto-seeking capabilities in a subclass of AbstractCLI
arteymix Oct 7, 2024
7ac28ba
Update @Secured annotation in ExpressionExperimentService
arteymix Oct 7, 2024
bf88c28
Make CompositeSequence.biologicalCharacteristics lazy by default
arteymix Oct 4, 2024
c6ac65b
Single-cell work
arteymix Oct 7, 2024
2d9fdf8
cli: Generate completion for Path
arteymix Oct 8, 2024
82d5fe0
cli: Allow file completions for Path options and add more keywords
arteymix Oct 8, 2024
1a13b25
Merge branch 'development' into feature-single-cell
arteymix Oct 9, 2024
c329780
Revert audit events to consolidate in a separate commit
arteymix Oct 10, 2024
ac920aa
Add audit event types for single-cell data
arteymix Oct 10, 2024
bdf4ee8
cli: Match supplementary file by last component in downloadSingleCell…
arteymix Oct 10, 2024
7c22a76
Switch single-cell experiments to the target platform when adding or …
arteymix Oct 15, 2024
b439401
Initial implementation of MexSingleCellMatrixWriter
arteymix Oct 11, 2024
b4223ad
More improvements for EE-manipulating CLIs
arteymix Oct 16, 2024
06af32b
Add missing @Secured annotation to ArrayDesignService.getTaxa
arteymix Oct 16, 2024
4db8190
Add a few logger for the CLI
arteymix Oct 16, 2024
5f21d17
Fix -force option not being honored in AbstractAutoSeekingCLI
arteymix Oct 16, 2024
8511ff6
More improvements to ExpressionDataMatrixWriter
arteymix Oct 16, 2024
d7860eb
rest: Resolve expression data files from disk and resort to streaming…
arteymix Oct 17, 2024
c604572
Ensure that files are deleted if an error occurs while writing them
arteymix Oct 17, 2024
3523087
Make sure that biological characteristics are thawed when thawing vec…
arteymix Oct 17, 2024
a106ba1
Fix output filename for data file with specific QT
arteymix Oct 17, 2024
b8670b3
fixup! rest: Resolve expression data files from disk and resort to st…
arteymix Oct 17, 2024
2d338a4
rest: Make it possible to mark a payload as already compressed (fix #…
arteymix Oct 17, 2024
5c30a2d
rest: Make sure that result sets are UTF-8 encoded
arteymix Oct 17, 2024
3b58b88
fixup! More improvements to ExpressionDataMatrixWriter
arteymix Oct 17, 2024
834d39a
Add basic RW locking for reading and writing files
arteymix Oct 18, 2024
4c35779
Make data file service non-transactional
arteymix Oct 18, 2024
23c50ce
Move logic requiring a transaction in a helper service
arteymix Oct 18, 2024
0ac4f56
Revert "Add a few logger for the CLI"
arteymix Oct 18, 2024
e8637e3
Adjust logging-levels for certain debug outputs
arteymix Oct 18, 2024
a08b085
Fix extension checking for Path objects
arteymix Oct 18, 2024
f7bb3e8
Only populate cs2gene if there are vectors and add some logging
arteymix Oct 18, 2024
a3853d5
Update completion scripts
arteymix Oct 18, 2024
7fe9add
rest: Await a few seconds for files to be generated
arteymix Oct 18, 2024
9065f30
Avoid formatting strings in getSparseArrayElement() and getSparseRang…
arteymix Oct 18, 2024
33af162
Retain original design elements when loading from MEX and AnnData
arteymix Oct 18, 2024
e5967a6
Add support for iterator and close() over columns and dataframe
arteymix Oct 17, 2024
e352679
rest: Fix media type for 503 error on getDatasetSingleCellExpression()
arteymix Oct 18, 2024
5f0a1d2
rest: Add missing data.txt.gz resource for tests and fix other tests
arteymix Oct 18, 2024
5b3dab5
Merge branch 'development' into HEAD
arteymix Oct 18, 2024
2f45823
MORE WORK
arteymix Oct 10, 2024
76617bf
Create MEX files for preferred vectors, remove old files when vectors…
arteymix Oct 18, 2024
ed82937
Use cleanForFileName on BAs names
arteymix Oct 18, 2024
0049fee
ci: Add missing FORCE_SLOW_TESTS and FORCE_DEPLOY to the possible con…
arteymix Oct 18, 2024
9446a11
Do not interrupt threads when cancelling futures to avoid closing Luc…
arteymix Oct 18, 2024
370c832
Mark GemmaRestApiClientTest as an integration test
arteymix Oct 18, 2024
eb312f6
Improve tabular output for single-cell data
arteymix Oct 19, 2024
c9b28f8
Relocate QT conversion and detection in distinct sub-packages
arteymix Oct 20, 2024
1238f48
Improve pre-processing exception hierarchy
arteymix Oct 20, 2024
0634a13
Few more improvements for metadata file management
arteymix Oct 21, 2024
4d1b5ca
Make sure that the write lock acquisition includes retrieving the data
arteymix Oct 21, 2024
a5badce
rest: Add support for downloading MEX single-cell data
arteymix Oct 22, 2024
6d0309f
Add missing AbstractGzipHeaderDecorator and OpenApiGzipHeaderDecorator
arteymix Oct 22, 2024
18fb23e
Tag datasets with 'single-cell RNA sequencing' when single-cell data …
arteymix Oct 22, 2024
87aad71
rest: Add an example for a tabular single-cell dataset
arteymix Oct 23, 2024
4ba128d
Add support for ISO 8601 date format in tabular outputs
arteymix Oct 23, 2024
b008488
Include build information in generated TSV files (fix #1042)
arteymix Oct 23, 2024
daa94af
fixup! Include build information in generated TSV files (fix #1042)
arteymix Oct 24, 2024
d105386
Add a CLI for aggregating single-cell data
arteymix Oct 24, 2024
2f61b28
Update CLI completion scripts
arteymix Oct 24, 2024
4690dda
cli: Suppress org.springframework.cglib in stacktraces
arteymix Oct 24, 2024
3c35789
cli: Add buildExperimentOptions() and processExperimentOptions()
arteymix Oct 24, 2024
3073ccb
Fix tests and make experimentUrl optional
arteymix Oct 24, 2024
e11f7ff
WIP
arteymix Oct 25, 2024
701c968
Improve EntityUrl API
arteymix Oct 25, 2024
a6b2e9d
Fix tests [WIP]
arteymix Oct 22, 2024
6a9a952
Merge branch 'development' into feature-single-cell
arteymix Oct 25, 2024
bd4761a
Unify SVDService and SVDHelperService
arteymix Oct 25, 2024
e004667
rest: Add a force parameter to force file regeneration
arteymix Oct 25, 2024
0772261
Fix EntityUrlTest
arteymix Oct 25, 2024
a7c068b
Make sure that streams are closed when writing tabular single-cell data
arteymix Oct 25, 2024
20d8912
rest: Update generated examples
arteymix Oct 25, 2024
15cf939
Fix missing mocks in tests
arteymix Oct 25, 2024
f7c18ba
Fix LuceneQueryUtilsTest
arteymix Oct 25, 2024
27ddfb9
Update commons-compress to 1.27.1
arteymix Oct 25, 2024
ed8a9f2
Relocate swagger-annotations dependency in gemma-core
arteymix Oct 25, 2024
1f25798
Exclude protobuf-java from mysql-connector-j dependency
arteymix Oct 25, 2024
384a8e9
Fix findByExpressionExperiment() not restricting on EE
arteymix Oct 25, 2024
2d49a5c
Replace AnchorTagUtil with EntityUrlBuilder
arteymix Oct 26, 2024
5fc2507
Use EntityUrlBuilder for all redirection URLs
arteymix Oct 26, 2024
1c23433
cli: Fix all CLI tests
arteymix Oct 26, 2024
7a98c94
Add WebEntityUrlBuilder and RestEntityUrlBuilder
arteymix Oct 26, 2024
7a010d5
Make URLs generated by OntologyController are relative to the context…
arteymix Oct 27, 2024
cb8680f
Allow generating URLs for FactorValue and Characteristic
arteymix Oct 27, 2024
6cb296e
Move URL builders component declarations to the XML config
arteymix Oct 27, 2024
1b84c99
rest: Fix test assuming TSV is the default, MEX is
arteymix Oct 27, 2024
2479d6d
rest: Few more improvements for locked files
arteymix Oct 28, 2024
c137776
Add a CLI for detecting QT from data
arteymix Oct 28, 2024
680ef93
Make sure that a converted lock is not released twice
arteymix Oct 28, 2024
c6fa792
rest: Add examples for design and MEX outputs
arteymix Oct 28, 2024
e2dc7f7
Add a .mex suffix to MEX-structured directories
arteymix Oct 28, 2024
7471c00
Update completion scripts
arteymix Oct 28, 2024
dd9e0f2
Add the necessary fields and logic for storing single-cell sparsity m…
arteymix Oct 29, 2024
96abedc
Improve script for updating REST API docs examples
arteymix Oct 29, 2024
46e6d3b
cli: Print a link to the experiment and add more checks when setting …
arteymix Oct 29, 2024
e7dd185
Add a dev script for updating the test database
arteymix Oct 29, 2024
4ed5901
Add a dev script to deploy the CLI
arteymix Oct 29, 2024
ca1b313
rest: Update REST docs examples
arteymix Oct 30, 2024
368f64a
Ignore .envrc and Jupyter notebook checkpoints
arteymix Oct 30, 2024
a1d98f8
Remove all .csvignore files
arteymix Oct 30, 2024
9aaa67c
rest: Allow the MexMatrixBundler to pre-calculate the size of the archive
arteymix Oct 30, 2024
0b2d93e
rest: Add missing MEX test files for DatasetsWebServiceTest
arteymix Oct 30, 2024
497cec8
Fix SingleCellExpressionExperimentAggregatorServiceTest
arteymix Oct 29, 2024
2639c38
Always regenerate the version file before launch or deploying
arteymix Oct 30, 2024
098d2b6
Fix behavior for log1p and log-scale data (fix #1274)
arteymix Oct 31, 2024
e848c1b
Add methods for obtaining the columns of a SingleCellExpressionDataMa…
arteymix Nov 1, 2024
5a73241
Update maven-site-plugin to 3.21
arteymix Nov 1, 2024
df47515
Add support for detecting log-transformed counts
arteymix Nov 4, 2024
f555487
Update jackson to 2.18.1
arteymix Nov 4, 2024
38c1b97
Update plugins
arteymix Nov 4, 2024
0a1bf4b
Implement log2cpm computation for aggregating count data
arteymix Nov 5, 2024
a493b2a
Various single-cell work
arteymix Nov 5, 2024
7a083e9
Add a task executor to limit the number of tasks that generate data files
arteymix Nov 3, 2024
9e43425
Fix aggregation for last sample and implement sparsity metrics
arteymix Nov 5, 2024
289f31e
Fix tests and two minor bugs in EE service
arteymix Nov 6, 2024
385bbd2
Few more cleanups for DataVector usage and remove unused primitive types
arteymix Nov 6, 2024
7f57646
Fix detection of single-cell data and inclusion of samples in GeoConv…
arteymix Nov 6, 2024
b67f3c4
Only cache skipped archive if we're looking for MEX data
arteymix Nov 6, 2024
06c5a33
Fix MatrixConversionTest using setDataAsDoubles() before setting a QT
arteymix Nov 6, 2024
53a82d0
More cleanups
arteymix Nov 6, 2024
29b2ea2
Re-enable incremental compilation
arteymix Nov 7, 2024
3165185
Improve elements mapping (fix #1234)
arteymix Nov 7, 2024
24277fd
Fix lazy initialization error when checking a dataset accession
arteymix Nov 7, 2024
fedf83a
Remove empty RestEntityUrl file
arteymix Nov 7, 2024
9ae8892
Don't execute deployment script in a terminal
arteymix Nov 7, 2024
c729d0e
Fix more tests
arteymix Nov 7, 2024
1b2470e
Fix exceeding packet size for MySQL by streaming cell IDs
arteymix Nov 7, 2024
8f6f50b
Replace the pipe with a pair of piped streams
arteymix Nov 8, 2024
cc9f32e
Consider the scale type when computing sparsity metrics
arteymix Nov 8, 2024
076e6e1
Improve handling of layered AnnData files
arteymix Nov 8, 2024
9d4b947
Numerous cleanups for Gemma Web [WIP]
arteymix Nov 8, 2024
e3e85c4
Replace JAWR with Webpack
arteymix Nov 8, 2024
f087c19
Merge branch 'feature-single-cell-webpack' into feature-single-cell-u…
arteymix Nov 15, 2024
1e38dee
Set the JSP trim whitespace directive in web.xml
arteymix Nov 15, 2024
89aa08b
Few improvements for the static assets server
arteymix Nov 15, 2024
de6bff5
Add IntelliJ scripts for building and serving static assets
arteymix Nov 15, 2024
7bf620e
Few more improvements for the subset page
arteymix Nov 15, 2024
5461ae6
Fix source mappings in web.xml
arteymix Nov 15, 2024
266dd4d
Fix EE set page
arteymix Nov 15, 2024
39fe57c
Fix visibility of sprintf and visualizeDiffExpressionHandler
arteymix Nov 15, 2024
f45ef66
Merge pull request #1288 from PavlidisLab/feature-single-cell-ui-work
arteymix Nov 15, 2024
016f899
Fix the lineplot and flotr2
arteymix Nov 15, 2024
490771a
Move the JavascriptLogger into the gemma-lib bundle since it requires…
arteymix Nov 15, 2024
99e6320
Fix incorrect assertion in ExperimentalDesignWriter
arteymix Nov 15, 2024
5f8e207
Generate DWR client code
arteymix Nov 18, 2024
96fdd61
More frontend cleanups
arteymix Nov 19, 2024
85485be
Allow edition of single-cell QTs
arteymix Nov 21, 2024
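
Note on ce308dc ("Use awaitTermination() instead of sleep() when parsing GEO SOFT files"): the general pattern behind that title can be sketched as below. This is a minimal illustration and not the code from the commit; the class `AwaitTerminationSketch`, the method `runAndWait` and the counter task are hypothetical. Only `ExecutorService.shutdown()`, `awaitTermination()` and `shutdownNow()` are real JDK calls.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not taken from the PR.
class AwaitTerminationSketch {

    /**
     * Submits numTasks trivial tasks and waits for all of them to finish.
     *
     * @return the number of completed tasks, or -1 on timeout or interruption
     */
    static int runAndWait( int numTasks, long timeoutSeconds ) {
        ExecutorService executor = Executors.newFixedThreadPool( 2 );
        AtomicInteger completed = new AtomicInteger();
        for ( int i = 0; i < numTasks; i++ ) {
            executor.execute( () -> {
                completed.incrementAndGet();
            } );
        }
        // stop accepting new tasks; tasks already submitted still run
        executor.shutdown();
        try {
            // blocks until every task has finished or the timeout elapses,
            // replacing a loop of the form: while ( !done ) Thread.sleep( ... )
            if ( !executor.awaitTermination( timeoutSeconds, TimeUnit.SECONDS ) ) {
                executor.shutdownNow();
                return -1;
            }
        } catch ( InterruptedException e ) {
            executor.shutdownNow();
            // restore the interrupt flag for the caller
            Thread.currentThread().interrupt();
            return -1;
        }
        return completed.get();
    }

    public static void main( String[] args ) {
        System.out.println( "Completed tasks: " + runAndWait( 10, 5 ) );
    }
}
```

Compared to polling with `Thread.sleep()`, `awaitTermination()` returns as soon as the last task finishes and makes the timeout explicit.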
@@ -30,6 +30,7 @@
import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
import ubic.gemma.model.expression.bioAssayData.ProcessedExpressionDataVector;
+import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
import ubic.gemma.model.expression.designElement.CompositeSequence;
import ubic.gemma.model.expression.experiment.ExpressionExperiment;
import ubic.gemma.persistence.util.ChannelUtils;
@@ -50,7 +51,7 @@ public class ExpressionDataMatrixBuilder {
private static final Log log = LogFactory.getLog( ExpressionDataMatrixBuilder.class.getName() );
private final Map<ArrayDesign, BioAssayDimension> dimMap = new HashMap<>();
private final Map<QuantitationType, Integer> numMissingValues = new HashMap<>();
-private Collection<DesignElementDataVector> vectors;
+private Collection<BulkExpressionDataVector> vectors;
private ExpressionExperiment expressionExperiment;
private Collection<ProcessedExpressionDataVector> processedDataVectors = new HashSet<>();
private QuantitationTypeData dat = null;
@@ -59,7 +60,7 @@ public class ExpressionDataMatrixBuilder {
/**
* @param vectors collection of vectors. They should be thawed first.
*/
-public ExpressionDataMatrixBuilder( Collection<? extends DesignElementDataVector> vectors ) {
+public ExpressionDataMatrixBuilder( Collection<? extends BulkExpressionDataVector> vectors ) {
if ( vectors == null || vectors.size() == 0 )
throw new IllegalArgumentException( "No vectors" );
this.vectors = new HashSet<>();
@@ -75,7 +76,7 @@ public ExpressionDataMatrixBuilder( Collection<? extends DesignElementDataVector
}

public ExpressionDataMatrixBuilder( Collection<ProcessedExpressionDataVector> processedVectors,
-Collection<? extends DesignElementDataVector> otherVectors ) {
+Collection<? extends BulkExpressionDataVector> otherVectors ) {
this.vectors = new HashSet<>();
this.vectors.addAll( otherVectors );
this.processedDataVectors = processedVectors;
@@ -87,7 +88,7 @@ public ExpressionDataMatrixBuilder( Collection<ProcessedExpressionDataVector> pr
* @param vectors raw vectors
* @return matrix of appropriate type.
*/
-public static ExpressionDataMatrix<?> getMatrix( Collection<? extends DesignElementDataVector> vectors ) {
+public static ExpressionDataMatrix<?> getMatrix( Collection<? extends BulkExpressionDataVector> vectors ) {
if ( vectors == null || vectors.isEmpty() )
throw new IllegalArgumentException( "No vectors" );
PrimitiveType representation = vectors.iterator().next().getQuantitationType().getRepresentation();
@@ -100,7 +101,7 @@ public static ExpressionDataMatrix<?> getMatrix( Collection<? extends DesignElem
* @return matrix of appropriate type.
*/
private static ExpressionDataMatrix<?> getMatrix( PrimitiveType representation,
-Collection<? extends DesignElementDataVector> vectors ) {
+Collection<? extends BulkExpressionDataVector> vectors ) {
ExpressionDataMatrix<?> expressionDataMatrix;
if ( representation.equals( PrimitiveType.DOUBLE ) ) {
expressionDataMatrix = new ExpressionDataDoubleMatrix( vectors );
@@ -291,7 +292,7 @@ public List<BioAssayDimension> getBioAssayDimensions() {

ExpressionDataMatrixBuilder.log.debug( "Checking all vectors to get bioAssayDimensions" );
Collection<BioAssayDimension> dimensions = new HashSet<>();
-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {
ArrayDesign adUsed = this.arrayDesignForVector( vector );
if ( !dimMap.containsKey( adUsed ) ) {
dimMap.put( adUsed, vector.getBioAssayDimension() );
@@ -421,7 +422,7 @@ public List<QuantitationType> getPreferredQTypes() {
}

for ( BioAssayDimension dimension : dimensions ) {
-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {
if ( !vector.getBioAssayDimension().equals( dimension ) )
continue;

@@ -566,7 +567,7 @@ private List<QuantitationType> getMissingValueQTypes() {
List<BioAssayDimension> dimensions = this.getBioAssayDimensions();

for ( BioAssayDimension dim : dimensions ) {
-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {

if ( !vector.getBioAssayDimension().equals( dim ) )
continue;
@@ -591,13 +592,13 @@ private List<QuantitationType> getMissingValueQTypes() {
/**
* @return The 'preferred' data vectors - NOT the processed data vectors!
*/
-private Collection<DesignElementDataVector> getPreferredDataVectors() {
-Collection<DesignElementDataVector> result = new HashSet<>();
+private Collection<BulkExpressionDataVector> getPreferredDataVectors() {
+Collection<BulkExpressionDataVector> result = new HashSet<>();

List<BioAssayDimension> dimensions = this.getBioAssayDimensions();
List<QuantitationType> qtypes = this.getPreferredQTypes();

-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {
if ( !( vector instanceof ProcessedExpressionDataVector ) && dimensions
.contains( vector.getBioAssayDimension() ) && qtypes.contains( vector.getQuantitationType() ) )
result.add( vector );
@@ -620,7 +621,7 @@ private Collection<ProcessedExpressionDataVector> getProcessedDataVectors() {
List<BioAssayDimension> dimensions = this.getBioAssayDimensions();
List<QuantitationType> qtypes = this.getPreferredQTypes();

-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {
if ( vector instanceof ProcessedExpressionDataVector && dimensions.contains( vector.getBioAssayDimension() )
&& qtypes.contains( vector.getQuantitationType() ) )
result.add( ( ProcessedExpressionDataVector ) vector );
@@ -644,7 +645,7 @@ private QuantitationTypeData getQuantitationTypesNeeded() {

Collection<QuantitationType> checkedQts = new HashSet<>();

-for ( DesignElementDataVector vector : vectors ) {
+for ( BulkExpressionDataVector vector : vectors ) {

BioAssayDimension dim = vector.getBioAssayDimension();
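
The effect of narrowing the wildcard bound from `DesignElementDataVector` to `BulkExpressionDataVector` in this file can be sketched as follows. The classes below are heavily simplified stand-ins, not the Gemma model classes: the hierarchy shown, the `CellLevelVector` type and `BoundSketch` are assumptions made for illustration. The diff only confirms that `getBioAssayDimension()` is called on `BulkExpressionDataVector`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, heavily simplified stand-ins for the model classes.
class BioAssayDimension {
}

abstract class DesignElementDataVector {
}

// Assumption: only bulk vectors carry a BioAssayDimension.
abstract class BulkExpressionDataVector extends DesignElementDataVector {
    private final BioAssayDimension dimension;

    BulkExpressionDataVector( BioAssayDimension dimension ) {
        this.dimension = dimension;
    }

    BioAssayDimension getBioAssayDimension() {
        return dimension;
    }
}

class RawExpressionDataVector extends BulkExpressionDataVector {
    RawExpressionDataVector( BioAssayDimension dimension ) {
        super( dimension );
    }
}

// Hypothetical non-bulk vector, used only to show what the bound rejects.
class CellLevelVector extends DesignElementDataVector {
}

class BoundSketch {

    // With the narrowed bound, every element is known to expose getBioAssayDimension().
    static Set<BioAssayDimension> dimensionsOf( Collection<? extends BulkExpressionDataVector> vectors ) {
        Set<BioAssayDimension> dimensions = new HashSet<>();
        for ( BulkExpressionDataVector vector : vectors ) {
            dimensions.add( vector.getBioAssayDimension() );
        }
        return dimensions;
    }

    static int demo() {
        BioAssayDimension dim = new BioAssayDimension();
        List<RawExpressionDataVector> raw = Arrays.asList(
                new RawExpressionDataVector( dim ), new RawExpressionDataVector( dim ) );
        // A List<CellLevelVector> would be rejected at compile time:
        // dimensionsOf( cells ); // error: incompatible types
        return dimensionsOf( raw ).size();
    }

    public static void main( String[] args ) {
        System.out.println( "Distinct dimensions: " + demo() );
    }
}
```

Under these assumptions, callers that hold a collection of a non-bulk vector type fail at compile time instead of at run time inside the builder.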

@@ -31,10 +31,7 @@
import ubic.gemma.core.datastructure.matrix.ExpressionDataMatrixRowElement;
import ubic.gemma.model.common.auditAndSecurity.eventType.MissingValueAnalysisEvent;
import ubic.gemma.model.common.quantitationtype.*;
-import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
-import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
-import ubic.gemma.model.expression.bioAssayData.ProcessedExpressionDataVector;
-import ubic.gemma.model.expression.bioAssayData.RawExpressionDataVector;
+import ubic.gemma.model.expression.bioAssayData.*;
import ubic.gemma.model.expression.designElement.CompositeSequence;
import ubic.gemma.model.expression.experiment.ExpressionExperiment;
import ubic.gemma.persistence.service.common.auditAndSecurity.AuditTrailService;
@@ -122,7 +119,7 @@ public Collection<RawExpressionDataVector> computeMissingValues( ExpressionExper
timer.stop();
this.logTimeInfo( timer, procVectors.size() + rawVectors.size() );

-Collection<? extends DesignElementDataVector> builderVectors = new HashSet<>(
+Collection<? extends BulkExpressionDataVector> builderVectors = new HashSet<>(
rawVectors.isEmpty() ? procVectors : rawVectors );

ExpressionDataMatrixBuilder builder = new ExpressionDataMatrixBuilder( builderVectors );
@@ -45,6 +45,7 @@
import ubic.gemma.model.common.quantitationtype.QuantitationType;
import ubic.gemma.model.expression.arrayDesign.ArrayDesign;
import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
+import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
import ubic.gemma.model.expression.designElement.CompositeSequence;
import ubic.gemma.model.expression.experiment.*;
import ubic.gemma.model.genome.Taxon;
@@ -481,7 +482,7 @@ public File writeOrLocateDataFile( QuantitationType type, boolean forceWrite ) {
ExpressionDataFileServiceImpl.log
.info( "Creating new quantitation type expression data file: " + f.getName() );

-Collection<DesignElementDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
+Collection<BulkExpressionDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
Collection<ArrayDesign> arrayDesigns = this.getArrayDesigns( vectors );
Map<CompositeSequence, String[]> geneAnnotations = this.getGeneAnnotationsAsStringsByProbe( arrayDesigns );

@@ -564,7 +565,7 @@ public File writeOrLocateJSONDataFile( QuantitationType type, boolean forceWrite

ExpressionDataFileServiceImpl.log.info( "Creating new quantitation type JSON data file: " + f.getName() );

-Collection<DesignElementDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );
+Collection<BulkExpressionDataVector> vectors = rawAndProcessedExpressionDataVectorService.findAndThaw( type );

if ( vectors.size() == 0 ) {
ExpressionDataFileServiceImpl.log.warn( "No vectors for " + type );
@@ -1173,7 +1174,7 @@ private File writeDesignMatrix( File file, ExpressionExperiment expressionExperi
return file;
}

-private void writeJson( File file, Collection<DesignElementDataVector> vectors ) throws IOException {
+private void writeJson( File file, Collection<BulkExpressionDataVector> vectors ) throws IOException {
ExpressionDataMatrix<?> expressionDataMatrix = ExpressionDataMatrixBuilder.getMatrix( vectors );
try ( Writer writer = new OutputStreamWriter( new GZIPOutputStream( new FileOutputStream( file ) ) ) ) {
MatrixWriter matrixWriter = new MatrixWriter();
@@ -1211,7 +1212,7 @@ private void writeMatrix( File file, Map<CompositeSequence, String[]> geneAnnota

}

-private void writeVectors( File file, Collection<DesignElementDataVector> vectors,
+private void writeVectors( File file, Collection<BulkExpressionDataVector> vectors,
Map<CompositeSequence, String[]> geneAnnotations ) throws IOException {
ExpressionDataMatrix<?> expressionDataMatrix = ExpressionDataMatrixBuilder.getMatrix( vectors );
this.writeMatrix( file, geneAnnotations, expressionDataMatrix );
@@ -25,8 +25,8 @@
import ubic.gemma.model.expression.arrayDesign.ArrayDesign;
import ubic.gemma.model.expression.bioAssay.BioAssay;
import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
-import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
import ubic.gemma.model.expression.bioAssayData.RawExpressionDataVector;
+import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
import ubic.gemma.model.expression.biomaterial.BioMaterial;
import ubic.gemma.model.expression.designElement.CompositeSequence;
import ubic.gemma.model.expression.experiment.ExpressionExperiment;
Expand Down Expand Up @@ -204,7 +204,7 @@ public ExpressionDataMatrixRowElement getRowElement( int index ) {
}

@SuppressWarnings("unused") // useful interface
- protected abstract void vectorsToMatrix( Collection<? extends DesignElementDataVector> vectors );
+ protected abstract void vectorsToMatrix( Collection<? extends BulkExpressionDataVector> vectors );

int getColumnIndex( BioAssay bioAssay ) {
return columnAssayMap.get( bioAssay );
@@ -368,11 +368,11 @@ int setUpColumnElements() {
/**
* Selects all the vectors passed in (uses them to initialize the data)
*/
- void selectVectors( Collection<? extends DesignElementDataVector> vectors ) {
+ void selectVectors( Collection<? extends BulkExpressionDataVector> vectors ) {
QuantitationType quantitationType = null;
int i = 0;
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
- for ( DesignElementDataVector vector : sorted ) {
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ for ( BulkExpressionDataVector vector : sorted ) {
if ( this.expressionExperiment == null )
this.expressionExperiment = vector.getExpressionExperiment();
QuantitationType vectorQuantitationType = vector.getQuantitationType();
@@ -397,14 +397,14 @@ void selectVectors( Collection<? extends DesignElementDataVector> vectors ) {

}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
Collection<QuantitationType> qTypes ) {
this.quantitationTypes.addAll( qTypes );

- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int i = 0;

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( qTypes.contains( vectorQuantitationType ) ) {
if ( this.expressionExperiment == null )
@@ -421,14 +421,14 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
List<QuantitationType> qTypes ) {
this.quantitationTypes.addAll( qTypes );
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int rowIndex = 0;
for ( QuantitationType soughtType : qTypes ) {
- for ( DesignElementDataVector vector : sorted ) {
+ for ( BulkExpressionDataVector vector : sorted ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( vectorQuantitationType.equals( soughtType ) ) {
if ( this.expressionExperiment == null )
@@ -446,14 +446,14 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignElementDataVector> vectors,
+ Collection<BulkExpressionDataVector> selectVectors( Collection<? extends BulkExpressionDataVector> vectors,
QuantitationType quantitationType ) {
this.quantitationTypes.add( quantitationType );

- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
int i = 0;

- for ( DesignElementDataVector vector : vectors ) {
+ for ( BulkExpressionDataVector vector : vectors ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( vectorQuantitationType.equals( quantitationType ) ) {
if ( this.expressionExperiment == null )
@@ -470,18 +470,18 @@ Collection<DesignElementDataVector> selectVectors( Collection<? extends DesignEl
return vectorsOfInterest;
}

- Collection<DesignElementDataVector> selectVectors( ExpressionExperiment ee, QuantitationType quantitationType ) {
+ Collection<BulkExpressionDataVector> selectVectors( ExpressionExperiment ee, QuantitationType quantitationType ) {
Collection<RawExpressionDataVector> vectors = ee.getRawExpressionDataVectors();
return this.selectVectors( quantitationType, vectors );
}
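
Note on the hunk above: the call passes a `Collection<RawExpressionDataVector>` to an overload that now takes `Collection<? extends BulkExpressionDataVector>`. That only type-checks if `RawExpressionDataVector` is a subtype of `BulkExpressionDataVector`, which this diff implies but does not show. A minimal sketch of the pattern follows; `BulkVector`, `RawVector` and `WildcardSketch` are hypothetical stand-ins, not the Gemma model classes, and the quantitation type is reduced to a plain string for illustration.

```java
import java.util.Collection;
import java.util.LinkedHashSet;

// Hypothetical stand-in for a bulk vector base class (assumption, not the Gemma type).
abstract class BulkVector {
    final String quantitationType;

    BulkVector( String quantitationType ) {
        this.quantitationType = quantitationType;
    }
}

// Hypothetical stand-in for a raw vector that extends the bulk base class.
final class RawVector extends BulkVector {
    RawVector( String quantitationType ) {
        super( quantitationType );
    }
}

final class WildcardSketch {
    // The upper-bounded wildcard lets callers pass a collection of any BulkVector subtype.
    // A LinkedHashSet keeps the matches in encounter order.
    static Collection<BulkVector> select( String soughtType, Collection<? extends BulkVector> vectors ) {
        Collection<BulkVector> selected = new LinkedHashSet<>();
        for ( BulkVector vector : vectors ) {
            if ( vector.quantitationType.equals( soughtType ) ) {
                selected.add( vector );
            }
        }
        return selected;
    }
}
```

Passing a `List<RawVector>` to `WildcardSketch.select` compiles because of the wildcard; with a parameter typed `Collection<BulkVector>` the same call would be rejected.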

- private Collection<DesignElementDataVector> selectVectors( QuantitationType quantitationType,
- Collection<? extends DesignElementDataVector> vectors ) {
- Collection<DesignElementDataVector> vectorsOfInterest = new LinkedHashSet<>();
+ private Collection<BulkExpressionDataVector> selectVectors( QuantitationType quantitationType,
+ Collection<? extends BulkExpressionDataVector> vectors ) {
+ Collection<BulkExpressionDataVector> vectorsOfInterest = new LinkedHashSet<>();
this.quantitationTypes.add( quantitationType );
- List<DesignElementDataVector> sorted = this.sortVectorsByDesignElement( vectors );
+ List<BulkExpressionDataVector> sorted = this.sortVectorsByDesignElement( vectors );
int i = 0;
- for ( DesignElementDataVector vector : sorted ) {
+ for ( BulkExpressionDataVector vector : sorted ) {
QuantitationType vectorQuantitationType = vector.getQuantitationType();
if ( this.expressionExperiment == null )
this.expressionExperiment = vector.getExpressionExperiment();
@@ -512,12 +512,12 @@ private void getBioMaterialGroupsForAssays( Map<BioMaterial, Collection<BioAssay
}
}

- private List<DesignElementDataVector> sortVectorsByDesignElement(
- Collection<? extends DesignElementDataVector> vectors ) {
- List<DesignElementDataVector> vectorSort = new ArrayList<>( vectors );
- Comparator<DesignElementDataVector> cmp = Comparator
- .comparing( ( DesignElementDataVector vector ) -> vector.getDesignElement().getName(), Comparator.nullsLast( Comparator.naturalOrder() ) )
- .thenComparing( ( DesignElementDataVector vector ) -> vector.getDesignElement().getId() );
+ private List<BulkExpressionDataVector> sortVectorsByDesignElement(
+ Collection<? extends BulkExpressionDataVector> vectors ) {
+ List<BulkExpressionDataVector> vectorSort = new ArrayList<>( vectors );
+ Comparator<BulkExpressionDataVector> cmp = Comparator
+ .comparing( ( BulkExpressionDataVector vector ) -> vector.getDesignElement().getName(), Comparator.nullsLast( Comparator.naturalOrder() ) )
+ .thenComparing( ( BulkExpressionDataVector vector ) -> vector.getDesignElement().getId() );
vectorSort.sort( cmp );
return vectorSort;
}
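
Note on `sortVectorsByDesignElement`: the comparator orders by design element name, with null names last, and breaks ties by ID. The sketch below reproduces that ordering on a hypothetical `Element` class (a stand-in, not `CompositeSequence`), assuming IDs are never null.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a design element; field names are assumptions.
final class Element {
    private final String name; // may be null
    private final Long id;     // assumed non-null in this sketch

    Element( String name, Long id ) {
        this.name = name;
        this.id = id;
    }

    String getName() {
        return name;
    }

    Long getId() {
        return id;
    }
}

final class SortSketch {
    // Copies the input, then sorts by name (nulls last) and by ID within equal names.
    static List<Element> sortByNameThenId( Collection<? extends Element> elements ) {
        List<Element> sorted = new ArrayList<>( elements );
        Comparator<Element> cmp = Comparator
                .comparing( ( Element e ) -> e.getName(), Comparator.nullsLast( Comparator.naturalOrder() ) )
                .thenComparing( ( Element e ) -> e.getId() );
        sorted.sort( cmp );
        return sorted;
    }
}
```

Only the name key is null-safe here. If two elements tie on name and either ID is null, `thenComparing` would throw a `NullPointerException`; whether that can happen for the vectors handled in this PR is not something the diff shows.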
@@ -20,7 +20,7 @@

import ubic.gemma.model.expression.bioAssay.BioAssay;
import ubic.gemma.model.expression.bioAssayData.BioAssayDimension;
- import ubic.gemma.model.expression.bioAssayData.DesignElementDataVector;
+ import ubic.gemma.model.expression.bioAssayData.BulkExpressionDataVector;
import ubic.gemma.model.expression.designElement.CompositeSequence;

import java.util.Collection;
@@ -130,7 +130,7 @@ public void set( int row, int column, Object value ) {
}

@Override
- protected void vectorsToMatrix( Collection<? extends DesignElementDataVector> vectors ) {
+ protected void vectorsToMatrix( Collection<? extends BulkExpressionDataVector> vectors ) {
throw new UnsupportedOperationException();
}
