v0.1.0
New Features
Lilac now supports labeling! For a detailed guide, see Labeling a dataset.
Labels can be added for individual rows:
dataset.add_labels(
  'good',
  row_ids=['0003076800f1471f8f4c8a1b2deda742'])
Or for slices of the data:
dataset.add_labels(
  'short',
  filters=[
    (('text', 'text_statistics', 'num_characters'), 'less', 1000)
  ]
)
They can then be exported:
short_rows = list(
  dataset.select_rows(
    ['*', 'short'],
    filters=[
      (('short', 'label'), 'exists')
    ]
  )
)
# Print the first row.
print(short_rows[0])
Output:
{
  '__rowid__': '0003076800f1471f8f4c8a1b2deda742',
  'text': 'If you want to truly experience the magic (?) of Don Dohler, then check out "Alien Factor" or maybe "Fiend", but not this. Alien Factor is actually rather imaginative considering the low budget and it\'s fairly creepy, but "Nightbeast", which I guess is sort of an updating of Alien Factor, is just plain dumb. Actors sleepwalk through their roles, especially Mr. Monotone sheriff, and the monster is some dumb Halloween-mask kind of thing instead of the wildly imaginative (but kind of stupid) looking critters from Alien Factor. A spaceship crashes on Earth and there\'s a critter inside, of course, who runs around vaporizing people. And ripping off arms, etc. And he has a cool ray gun that he uses to vaporize people too, until it gets shot out of his hand. And that\'s really about it. "Alien Factor" beats this mess hands down, if you really want to see a good Don Dohler movie, check that out instead. And RIP Don Dohler, 12/2/06.',
  'label': 'neg',
  '__hfsplit__': 'test',
  'good': {
    'label': 'true',
    'created': datetime.datetime(2023, 9, 20, 10, 16, 15, 545277)
  }
}
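Selected rows are plain Python dictionaries, so they can be post-processed without Lilac. The sketch below is one possible way to collect the IDs of rows that carry a label and to write a row out as JSON. The helper names `to_json_line` and `labeled_row_ids` are hypothetical and not part of Lilac, and the sample row is a shortened copy of the output above. Note that the label's `created` field is a `datetime`, which `json.dumps` cannot serialize unless it is converted first.

```python
import datetime
import json


def to_json_line(row: dict) -> str:
  """Serialize one selected row to a JSON string.

  datetime values (such as a label's 'created' field) are not JSON
  serializable by default, so they are converted to ISO 8601 strings.
  """
  def encode(value):
    if isinstance(value, datetime.datetime):
      return value.isoformat()
    raise TypeError(f'Cannot serialize {type(value).__name__}')

  return json.dumps(row, default=encode)


def labeled_row_ids(rows: list, label_name: str) -> list:
  """Return the '__rowid__' of every row that carries the given label."""
  return [
    row['__rowid__'] for row in rows
    if isinstance(row.get(label_name), dict) and 'label' in row[label_name]
  ]


# Hypothetical sample shaped like the output above (text shortened).
sample_rows = [
  {
    '__rowid__': '0003076800f1471f8f4c8a1b2deda742',
    'text': 'If you want to truly experience the magic (?) of Don Dohler...',
    'label': 'neg',
    '__hfsplit__': 'test',
    'good': {
      'label': 'true',
      'created': datetime.datetime(2023, 9, 20, 10, 16, 15, 545277),
    },
  },
]

print(labeled_row_ids(sample_rows, 'good'))
print(to_json_line(sample_rows[0]))
```

Writing one JSON object per line keeps the export easy to stream; the same helpers should work on any list of rows with this shape.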
Labels can also be added via the UI.
What's changed
Bug fixes
- Allow add_labels and remove_labels without selection by @dsmilkov in #698
- Fix UI regression and empty lilac.yml (no datasets) by @dsmilkov in #700
Full Changelog: v0.0.20...v0.1.0