Over the years I have been working to convert SCOWL from a collection of word lists combined with sort, uniq, comm, some Perl scripts, and an extremely complicated Makefile into a true SQL database.
In the new database, words are broken down into different senses. For example, there are separate entries for the noun and verb forms of a word. There may also be separate entries for different meanings of a word within the same part of speech (POS). In addition, inflected forms of a word are linked with their root word; for example, "running" will be linked with the root "run". Variants of a word are linked with a particular sense of a word rather than the word itself, to handle the many corner cases where the correct spelling depends on the meaning or POS.
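To make the structure described above more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. This is not the actual database schema: the table names (`sense`, `inflection`, `variant`), the column names, the helper functions, and the sample rows are all made up for illustration. It only shows the general idea of linking inflected forms and spelling variants to a specific sense rather than to a bare word.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the real SCOWL database layout.
SCHEMA = """
CREATE TABLE sense (
    id    INTEGER PRIMARY KEY,
    lemma TEXT NOT NULL,   -- root word, e.g. "run"
    pos   TEXT NOT NULL,   -- part of speech
    gloss TEXT             -- short note telling senses apart
);
CREATE TABLE inflection (
    sense_id INTEGER NOT NULL REFERENCES sense(id),
    form     TEXT NOT NULL -- inflected form, e.g. "running"
);
CREATE TABLE variant (
    sense_id INTEGER NOT NULL REFERENCES sense(id),
    spelling TEXT NOT NULL,
    region   TEXT NOT NULL -- illustrative label, not a real SCOWL category
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sense (id, lemma, pos, gloss) VALUES (?, ?, ?, ?)",
        [
            (1, "run", "verb", "move quickly on foot"),
            (2, "run", "noun", "an act of running"),
            (3, "practice", "noun", "repeated exercise"),
            (4, "practice", "verb", "to do repeatedly"),
        ],
    )
    conn.executemany(
        "INSERT INTO inflection (sense_id, form) VALUES (?, ?)",
        [(1, "running"), (1, "ran"), (2, "runs")],
    )
    # The variant hangs off the verb sense only, so the noun is unaffected.
    conn.execute(
        "INSERT INTO variant (sense_id, spelling, region) VALUES (?, ?, ?)",
        (4, "practise", "British"),
    )
    return conn


def roots_of(conn, form):
    """Return (lemma, pos) for every sense that lists `form` as an inflection."""
    return conn.execute(
        "SELECT s.lemma, s.pos FROM inflection AS i "
        "JOIN sense AS s ON s.id = i.sense_id "
        "WHERE i.form = ? ORDER BY s.id",
        (form,),
    ).fetchall()


def variants_of(conn, lemma, pos):
    """Return (spelling, region) variants attached to one lemma/POS pair."""
    return conn.execute(
        "SELECT v.spelling, v.region FROM variant AS v "
        "JOIN sense AS s ON s.id = v.sense_id "
        "WHERE s.lemma = ? AND s.pos = ? ORDER BY v.spelling",
        (lemma, pos),
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(roots_of(db, "running"))
    print(variants_of(db, "practice", "verb"))
    print(variants_of(db, "practice", "noun"))
```

Because the variant is attached to a sense, a query for the verb "practice" can return the spelling "practise" while the same query for the noun returns nothing, which is the kind of corner case that a word-level link cannot express.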
The processing of the source word lists and other information (such as VarCon) into individual entries is vastly more complicated than what is currently used to create SCOWL. However, I hope it will also be more maintainable in the long term, as not even I fully understand the complicated Makefile that drives SCOWL.
The new database contains the same information as SCOWL, but the resulting word lists are slightly different because the processing is so different.
It is almost ready, but I am unsure what direction I am going to take it. In particular, due to the tremendous amount of work that went into it, I am not sure if I am going to release everything at once unless I get some sort of funding.
I am very interested in early feedback on whether what I have will be useful beyond being a better way to generate high-quality word lists.
If you are interested in a preview version of the database please email me directly (you can find my email on my GitHub profile) with your GitHub user id. I will verify that your account is linked with a real person who has at least some presence on the web and will grant you access to the repo that contains the preview version when it is ready. If your GitHub profile is empty or there is something suspicious about your account I may ignore you.
This is an announcement. I am locking this issue to discourage people from requesting access by replying here instead of emailing me. Please leave any general feedback you wish to make public on issue #307.
[Draft, see comment modification history for revision date]
One of the advantages of the SQL database version of SCOWL is the ability to create word lists with special symbols in them, including compound words with a space or hyphen. As a preview of what is possible, check out the enhanced version of the custom word list creator at https://devel.kevina.org/cgi-bin/create; the password is scowldb2.
Both of these tools are experimental and subject to change or removal. They are also hosted on a temporary server and may go down from time to time. If you wish to use them and they are not available, please email me directly.