Normalization of microtext #140
Comments
@salonipriyani What is the update on this?
@vishakha-lall I would like to work on this issue. Can you please assign this to me as part of GSSoC'20?
@Rukmini-Meda Please share your approach for this issue. What kind of normalisation would you be doing, and where will you get the data to train it on?
Is it like gathering data for some common abbreviations?
https://github.com/npuliyang/Microtext_Normalization
@shreyanshi2228 Please elaborate on how you plan to use the shared resource with respect to the requirements of this project. Abbreviations for common cities and locations would be more relevant to this project.
Is your feature request related to a problem? Please describe.
The text from the user may contain errors (typos or spelling mistakes) or abbreviated words (for example, BLR for Bangalore Airport).
Describe the solution you'd like
First, abbreviations and acronyms will be handled using a lexical approach. Then spelling mistakes will be corrected using Soundex, a phonetic algorithm.
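To make the proposal concrete, here is a rough sketch of the two steps in Python. It is only an illustration, not the project's implementation: the `ABBREVIATIONS` table, the `VOCABULARY` list, and the `normalize` function are hypothetical names, and apart from the BLR example above the entries are placeholders. The `soundex` function follows the commonly described American Soundex rules.

```python
# Step 1 data: hypothetical abbreviation lexicon (only BLR comes from this issue).
ABBREVIATIONS = {"blr": "Bangalore Airport"}

# Step 2 data: hypothetical vocabulary of correctly spelled words.
VOCABULARY = ["Bangalore", "Airport", "Delhi", "Mumbai"]

# Soundex digit for each consonant group.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code is kept
    return result.ljust(4, "0")


# Index the vocabulary by Soundex code; the first word wins on a collision.
_BY_CODE = {}
for _word in VOCABULARY:
    _BY_CODE.setdefault(soundex(_word), _word)
_KNOWN = {w.lower() for w in VOCABULARY}


def normalize(text):
    """Expand known abbreviations, then fix misspellings by Soundex match."""
    out = []
    for token in text.split():
        key = token.lower()
        if key in ABBREVIATIONS:        # lexical step
            out.append(ABBREVIATIONS[key])
        elif key in _KNOWN:             # already spelled correctly
            out.append(token)
        else:                           # phonetic step
            out.append(_BY_CODE.get(soundex(token), token))
    return " ".join(out)


print(normalize("flight from BLR to Mumbay"))
```

Soundex is coarse, so unrelated words can share a code. A real version would probably need a tie-breaker (edit distance, for instance) and a vocabulary limited to cities and locations.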
Describe alternatives you've considered
Additional context