Release v1.0.2

@yxdyc released this 20 Dec 12:15
a26dcc7

Major Updates

  • Added more mapper/grouper/aggregator OPs for post-tuning scenarios.
  • Improved the performance and usability of distributed mode, with more automatic features.

DJ-Operators

  • extract_support_text_mapper, relation_identity_mapper, python_file_mapper, #500
  • naive_grouper, key_value_grouper, #500
  • nested_aggregator, entity_attribute_aggregator, most_relavant_entities_aggregator, #500
  • video_extract_frames_mapper, #507

Performance

  • Optimize Ray mode performance, #442
  • Patch for Performance Benchmark in CI/CD workflows, #506
  • DJ Ray mode supports streaming loading of jsonl files, #515

Usability and Analysis

  • Support dj-install at the recipe level, #508
  • Support dj-analyze with --auto mode, #512
  • Support automatic op-wise insight mining, #516

Acknowledgment

Thanks to Data-Juicer users and contributors for their helpful feedback, issues and PRs!