Changelog

All notable changes to this project will be documented in this file. The format is based on Keep a Changelog.

[0.5.0] - 2023-MM-DD

Added

  • Added PyTorch 2.4 support (#338)
  • Added PyTorch 2.3 support (#322)
  • Added Windows support (#315)
  • Added macOS Apple Silicon support (#310)

Changed

Removed

[0.4.0] - 2024-02-07

Added

  • Added PyTorch 2.2 support (#294)
  • Added softmax_csr implementation (#264, #282)
  • Added support for edge-level sampling (#280)
  • Added support for bfloat16 data type in segment_matmul and grouped_matmul (CPU only) (#272)

Changed

  • Dropped the MKL code path when sampling neighbors with replace=False since it does not correctly prevent duplicates (#275)
  • Added --biased parameter to run benchmarks for biased sampling (#267)
  • Improved speed of biased sampling (#270)
  • Fixed grouped_matmul when tensors are not contiguous (#290)

Removed

[0.3.0] - 2023-10-11

Added

  • Added PyTorch 2.1 support (#256)
  • Added low-level support for distributed neighborhood sampling (#246, #252, #253, #254)
  • Added support for homogeneous and heterogeneous biased neighborhood sampling (#247, #251)
  • Added dispatch for XPU device in index_sort (#243)
  • Added metis partitioning (#229)
  • Enable hetero_neighbor_sample to work in parallel (#211)

Changed

  • Fixed vector-based mapping issue in Mapping (#244)
  • Fixed performance issues reported by Coverity Tool (#240)
  • Updated cutlass version for speed boosts in segment_matmul and grouped_matmul (#235)
  • Drop nested tensor wrapper for grouped_matmul implementation (#226)
  • Added generate_range_of_ints function (which uses the MKL library to generate integers) to the RandintEngine class (#222)
  • Fixed TorchScript support in grouped_matmul (#220)

Removed

[0.2.0] - 2023-03-22

Added

  • Added PyTorch 2.0 support (#214)
  • neighbor_sample routines now also return information about the number of sampled nodes/edges per layer (#197)
  • Added index_sort implementation (#181, #192)
  • Added triton>=2.0 support (#171)
  • Added bias term to grouped_matmul and segment_matmul (#161)
  • Added sampled_op implementation (#156, #159, #160)

Changed

  • Improved [segment|grouped]_matmul GPU implementation by reducing launch overheads (#213)
  • Sample the nodes with the same timestamp as seed nodes (#187)
  • Added write-csv (saves benchmark results as a CSV file) and libraries (determines which libraries are used in the benchmark) parameters (#167)
  • Enable benchmarking of neighbor sampler on temporal graphs (#165)
  • Improved [segment|grouped]_matmul CPU implementation via at::matmul_out and MKL BLAS gemm_batch (#146, #172)

Removed

[0.1.0] - 2022-11-28

Added

  • Added PyTorch 1.13 support (#145)
  • Added native PyTorch support for grouped_matmul (#137)
  • Added fused_scatter_reduce operation for multiple reductions (#141, #142)
  • Added triton dependency (#133, #134)
  • Enable pytest testing (#132)
  • Added C++-based autograd and TorchScript support for segment_matmul (#120, #122)
  • Allow overriding time for seed nodes via seed_time in neighbor_sample (#118)
  • Added [segment|grouped]_matmul CPU implementation (#111)
  • Added temporal_strategy option to neighbor_sample (#114)
  • Added benchmarking tool (Google Benchmark) along with pyg::sampler::Mapper benchmark example (#101)
  • Added CSC mode to pyg::sampler::neighbor_sample and pyg::sampler::hetero_neighbor_sample (#95, #96)
  • Speed up pyg::sampler::neighbor_sample via IndexTracker implementation (#84)
  • Added pyg::sampler::hetero_neighbor_sample implementation (#90, #92, #94, #97, #98, #99, #102, #110)
  • Added pyg::utils::to_vector implementation (#88)
  • Added support for PyTorch 1.12 (#57, #58)
  • Added grouped_matmul and segment_matmul CUDA implementations via cutlass (#51, #56, #61, #64, #69, #73, #123)
  • Added pyg::sampler::neighbor_sample implementation (#54, #76, #77, #78, #80, #81, #85, #86, #87, #89)
  • Added pyg::sampler::Mapper utility for mapping global to local node indices (#45, #83)
  • Added benchmark script (#45, #79, #82, #91, #93, #106)
  • Added download script for benchmark data (#44)
  • Added biased sampling utils (#38)
  • Added CHANGELOG.md (#39)
  • Added pyg.subgraph() (#31)
  • Added nightly builds (#28, #36)
  • Added rand CPU engine (#26, #29, #32, #33)
  • Added pyg.random_walk() (#21, #24, #25)
  • Added documentation via readthedocs (#19, #20)
  • Added code coverage report (#15, #16, #17, #18)
  • Added CMakeExtension support (#14)
  • Added test suite via gtest (#13)
  • Added clang-format linting via pre-commit (#12)
  • Added CMake support (#5)
  • Added pyg.cuda_version() (#4)

Changed

  • Allow different types for graph and timestamp data (#143)
  • Fixed dispatcher in hetero_neighbor_sample (#125)
  • Require sorted neighborhoods according to time in temporal sampling (#108)
  • Only sample neighbors with a strictly earlier timestamp than the seed node (#104)
  • Prevent absolute paths in wheel (#75)
  • Improved installation instructions (#68)
  • Replaced std::unordered_map with a faster phmap::flat_hash_map (#65)
  • Fixed versions of checkout and setup-python in CI (#52)
  • Make use of the pyg_sphinx_theme documentation template (#47)
  • Auto-compute number of threads and blocks in CUDA kernels (#41)
  • Optional return types in pyg.subgraph() (#40)
  • Absolute headers (#30)
  • Use at::equal rather than at::all in tests (#37)
  • Build *.so extension on Mac instead of *.dylib (#107)