SCOWL project maintenance #394

Open
Jamim opened this issue Jan 28, 2024 · 7 comments

@Jamim

Jamim commented Jan 28, 2024

👋🏻 Hello @kevina,

First of all, thank you for all your work on this database of English words! 🙇🏼

🏚️

Sadly, this repository now looks completely abandoned.
There are over a hundred issues awaiting resolution, and 3 PRs have been awaiting feedback for years.

🏑

I understand you probably have no spare time to spend on SCOWL, but would you mind considering delegating maintenance to some community members that you can trust to keep the project going?

Best regards!

@kevina
Member

kevina commented Jan 29, 2024

Updating the wordlist has been a low priority but the repo is not completely abandoned. I tend to add words in large batches and it has been a long time since I did an update.

I am open to delegating; the main thing is finding people who share the same views on what type of words should be included and have the necessary technical skills to add them.

As for the apostrophe handling (see #122), I am not sure what the correct path forward is.

@kevina kevina pinned this issue Jan 29, 2024
@kevina kevina changed the title SCOWL project maintenance 🛠️ SCOWL project maintenance Jan 29, 2024
@Meekohi

Meekohi commented Jan 29, 2024

Perhaps splitting the "technical maintenance" from the "which words are allowed" decisions might be a way to have the best of both worlds. For example, @kevina / @biljir "approve" words to be added, and a small group of contributors makes sure it is done properly from a technical perspective.

tbh I don't think it's a big issue to continue as-is (how often does the wordlist really need to change), but maybe it would be nice to see faster resolution times when people propose words to be added just so there isn't the impression things are being ignored.

Clearing out some of the years-old issues would also improve the perception that things are being maintained.

@marcoagpinto

This comment was marked as off-topic.

@kevina

This comment was marked as off-topic.

@kevina
Member

kevina commented Apr 8, 2024

Currently SCOWL is not in a state that I am comfortable passing on to anyone. SCOWL was originally about combining high-quality word lists, and the mechanism for making corrections is very hackish. A while ago I posted a database version of SCOWL (#306). While this is an improvement, it doesn't address the core issue of maintainability and may even make things worse. The SQL used to convert from the source lists to the database form is beyond complex and not something I want to pass on to anyone, let alone release to the public.

Instead, my current plan is to create a text file containing most of the information in the database. This file will combine all aspects of SCOWL (the lists themselves), VarCon (the variant conversions) and AGID (the POS and inflection information) into one place. For example, the entries for color might be:

35: A Cv DV: color <n>: colors  color's 
35: B C D: colour <n>: colours  colour's 

35: A Cv DV: color <v>: colored  coloring  colors 
35: B C D: colour <v>: coloured  colouring  colours

This new file will then become the source for SCOWL, in that all word lists will be created from this master file. To add new words or to make corrections, all one has to do is edit this file. Naturally this will lose the ability to create SCOWL from the source lists, but I no longer think that is worth it.
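To make the idea of generating word lists from a single master file concrete, here is a minimal sketch in Python. It is not the actual SCOWL tooling. The field meanings are guesses based only on the example entries above: the first field is taken to be a size level, the second a set of spelling tags, the third a headword with its part of speech, and the fourth the derived forms. The names `ENTRY_RE`, `Entry`, `parse_entry` and `word_list` are hypothetical, as is the rule that a bare tag such as `A` selects an entry while a suffixed tag such as `Cv` does not.

```python
import re
from dataclasses import dataclass

# Assumed line shape, inferred only from the example entries:
#   <level>: <spelling tags>: <headword> <pos>: <derived forms>
ENTRY_RE = re.compile(
    r"^(?P<level>\d+):\s*(?P<tags>[^:]*):\s*(?P<word>\S+)\s*"
    r"<(?P<pos>[^>]+)>:\s*(?P<forms>.*)$"
)


@dataclass
class Entry:
    level: int         # assumed to be a size level
    tags: list[str]    # assumed spelling tags, e.g. ["A", "Cv", "DV"]
    word: str          # headword
    pos: str           # part of speech, e.g. "n" or "v"
    forms: list[str]   # derived forms listed after the headword


def parse_entry(line: str) -> Entry:
    """Parse one entry line; raise ValueError if it does not fit the assumed shape."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    return Entry(int(m["level"]), m["tags"].split(), m["word"],
                 m["pos"], m["forms"].split())


def word_list(lines, tag, max_level):
    """Collect headwords and forms for entries carrying exactly `tag`
    at or below `max_level` (a hypothetical selection rule)."""
    words = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines
        entry = parse_entry(line)
        if entry.level <= max_level and tag in entry.tags:
            words.add(entry.word)
            words.update(entry.forms)
    return sorted(words)


SAMPLE = [
    "35: A Cv DV: color <n>: colors  color's ",
    "35: B C D: colour <n>: colours  colour's ",
    "",
    "35: A Cv DV: color <v>: colored  coloring  colors ",
    "35: B C D: colour <v>: coloured  colouring  colours",
]

print(word_list(SAMPLE, "A", 35))
```

Under these assumptions, asking for tag `A` yields the "color" spellings and asking for tag `B` yields the "colour" spellings, which is the sense in which every list could be derived from the one file.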

I will also write some Python code to aid in maintaining this file and to catch errors.
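As an illustration of what such error checking might look like, here is a small sketch. It is not the planned script. The three checks (a line that does not fit the assumed entry shape, a repeated headword and part-of-speech pair, and a form listed twice in one entry) are hypothetical rules chosen for illustration, and `ENTRY_RE` and `check_entries` are made-up names.

```python
import re
from collections import Counter

# Assumed line shape: <level>: <tags>: <headword> <pos>: <forms>
ENTRY_RE = re.compile(r"^(\d+):\s*([^:]*):\s*(\S+)\s*<([^>]+)>:\s*(.*)$")


def check_entries(lines):
    """Return a list of (line_number, message) findings; an empty list means none."""
    problems = []
    seen = {}  # (headword, pos) -> line number of first occurrence
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are treated as separators
        m = ENTRY_RE.match(line)
        if m is None:
            problems.append((lineno, "line does not match the assumed entry shape"))
            continue
        _level, _tags, word, pos, forms = m.groups()
        key = (word, pos)
        if key in seen:
            problems.append(
                (lineno, f"duplicate entry for {word} <{pos}> (first on line {seen[key]})")
            )
        else:
            seen[key] = lineno
        # Flag any form that appears more than once within the same entry.
        for form, count in Counter(forms.split()).items():
            if count > 1:
                problems.append((lineno, f"form {form!r} listed more than once"))
    return problems


for lineno, message in check_entries(["35: A: color <n>: colors colors", "oops"]):
    print(f"line {lineno}: {message}")
```

Returning findings as data rather than printing them inside the checker keeps the function easy to call from a test suite or a pre-commit hook.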

Once this is done I will likely start adding new words again as that will serve as a good test of the new format and scripts to maintain it.

Once I am happy with the new format, I will be willing to hand off the maintenance of SCOWL to other people I trust.

@marcoagpinto

Heya, Kevin,

It is good to know that you intend to hand off the maintenance of the dictionaries.

I hope new words can be added more frequently.

I myself have added 140 000+ words to British English in 11 years.

It is a lifetime task.

@kevina
Member

kevina commented May 24, 2024

A preview version of the new format is available in the v2 branch of this repo.

The documentation is still incomplete, but the format should be mostly stable by now.

Early feedback is welcome, but please use #398 to leave feedback or create a new issue.


4 participants