[pt] Added words to added.txt and spelling.txt #11018
Walkthrough
The pull request introduces new entries to the Portuguese language module's part-of-speech dictionary and spelling dictionary in LanguageTool. It specifically adds variations of the term "palavra-passe" (password) to the part-of-speech dictionary and expands the spelling dictionary with a list of questionable foreign words and additional Portuguese terms. The new entries maintain the existing file structure and formatting, with no changes to previously existing entries.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/spelling.txt (1)
374-376: Consider reorganizing the new terms
The password-related terms are currently added at the end of the file, but they would fit better in the section "Added by Marco A.G.Pinto" with other Portuguese words, maintaining alphabetical order.
Consider moving these entries to maintain better organization:
# Current location
palavra-passe
palavras-passe
palavras-passes
# Should be moved to the Portuguese words section, maintaining alphabetical order
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/added.txt (1 hunks)
languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/spelling.txt (1 hunks)
🔇 Additional comments (2)
languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/added.txt (2)
235-237: LGTM! The entries follow the correct format and structure.
The new entries:
- Are properly placed in the manual entries section
- Follow the correct tab-separated format: fullform, baseform, postags
- Use consistent postags (NCFS000 for singular, NCFP000 for plural)
- Maintain "palavra-passe" as the baseform for all variations
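The format check described above could be sketched as a small shell helper. This is only an illustration, not part of the PR or of LanguageTool's tooling: the function name `check_entries` is hypothetical, the sample rows are reconstructed from the description above, and skipping `#` lines assumes that added.txt uses `#` for comments.

```shell
#!/bin/bash
# Hypothetical helper: report lines that do not have exactly three
# tab-separated fields (fullform, baseform, postag).
# Assumption: blank lines and lines starting with '#' are comments to skip.
check_entries() {
  awk -F'\t' '
    BEGIN { bad = 0 }
    /^[[:space:]]*$/ || /^#/ { next }
    NF != 3 { printf "line %d: expected 3 fields, found %d\n", NR, NF; bad = 1 }
    END { exit bad }
  ' "$1"
}

# Sample data reconstructed from the review's description of the new entries.
sample=$(mktemp)
printf 'palavra-passe\tpalavra-passe\tNCFS000\n' > "$sample"
printf 'palavras-passe\tpalavra-passe\tNCFP000\n' >> "$sample"
printf 'palavras-passes\tpalavra-passe\tNCFP000\n' >> "$sample"

check_entries "$sample" && echo "all entries well-formed"
```

The helper exits non-zero when any entry is malformed, so it could be chained in a pre-commit check if that is useful.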
235-237: Verify the validity of both plural forms in Portuguese.
The entries include two plural forms for "palavra-passe":
- "palavras-passe" (standard plural form)
- "palavras-passes" (alternative plural form)
While both forms appear in common usage, let's verify their validity according to Portuguese language standards.
✅ Verification successful
Both plural forms are valid and already recognized in the codebase
The verification shows that both plural forms ("palavras-passe" and "palavras-passes") are:
- Already present in the Portuguese spelling dictionary (spelling.txt)
- Used as examples in grammar rules (grammar.xml)
- Consistently handled across the Portuguese language module
The addition of these forms to added.txt with proper morphological tags aligns with the existing codebase implementation and Portuguese language standards.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for usage patterns of both plural forms in the codebase and documentation
# Check if these forms are used elsewhere in the codebase
rg -i "palavras-passe[s]?" --type txt
# Check if there are any existing discussions or documentation about these plural forms
rg -i "palavra.*passe" -g "*.md" -g "*.txt"
Length of output: 1272
Script:
#!/bin/bash
# Let's check if there are any grammar or style rules that handle these plural forms
# or any comments/documentation about their usage
# Check for rules or tests related to these forms
rg -i "palavra.*passe" --type xml -g "*pt*.xml" -A 5 -B 5
# Check for any comments or documentation in Portuguese rule files
rg -i "palavra.*passe" -g "*.properties" -g "*.xml" -g "*.java" -A 2 -B 2
# Look for any explicit grammar rules about hyphenated words
rg -i "hífen|hyphen" -g "*pt*.xml" -g "*pt*.properties" -A 2 -B 2
Length of output: 3930
palavra-passe
palavras-passe
palavras-passes
💡 Codebase verification
Remove the non-standard plural form "palavras-passes"
The form "palavras-passes" should be removed as it's a non-standard plural formation for compound words in Portuguese. The evidence shows:
- It's only present in these dictionary files and not in other authoritative sources
- The added.txt file shows it's marked with the same part-of-speech tag (NCFP000) as the standard plural "palavras-passe", indicating redundancy
- In Portuguese compound words with noun+noun structure (like "palavra-passe"), the correct plural form only modifies the first noun, making "palavras-passe" the proper plural form
🔗 Analysis chain
Verify the validity of "palavras-passes"
While "palavra-passe" (singular) and "palavras-passe" (standard plural) are correct, the alternative plural form "palavras-passes" needs verification. In compound words with noun+noun structure in Portuguese, typically only the first noun is pluralized.
Let's verify this form in authoritative sources:
Let me gather more context about the usage in added.txt since it seems to contain additional metadata.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for occurrences of "palavras-passes" in other Portuguese language files
# to determine if this form is used elsewhere in the codebase
echo "Searching for existing usage of 'palavras-passes'..."
rg -l "palavras-passes" --type txt
echo "Checking if this form exists in other Portuguese dictionaries..."
fd -e txt -e dic "dictionary|dict|added|spelling" | xargs rg "palavras-passes"
Length of output: 797
Script:
#!/bin/bash
# Check the context of palavras-passes in added.txt to understand the metadata
echo "Checking added.txt content and format..."
head -n 5 languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/added.txt
echo -e "\nChecking the full entry for palavras-passes..."
rg "palavras-passes" -A 2 -B 2 languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/added.txt
echo -e "\nChecking if there are other similar compound plural forms..."
rg "\w+s-\w+s\t" languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt/added.txt
Length of output: 1163
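The last search in the script above looks for compounds in which both parts end in "s". A self-contained variant of that idea, run against a plain word list instead of added.txt, might look like the sketch below. The function name `find_double_plurals` is hypothetical, and the pattern is a spelling heuristic only: it flags candidates for manual review and says nothing about whether a form is valid Portuguese.

```shell
#!/bin/bash
# Hypothetical helper: print hyphenated two-part words in which both
# parts end in "s" (for example, a compound pluralized on both nouns).
# Heuristic only: words whose singular already ends in "s" would also match.
find_double_plurals() {
  grep -E '^[^[:space:]-]+s-[^[:space:]-]+s$' "$1"
}

# The three forms added to spelling.txt in this PR, one per line.
words=$(mktemp)
printf '%s\n' palavra-passe palavras-passe palavras-passes > "$words"

find_double_plurals "$words"   # prints: palavras-passes
```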
Added very common words:
https://dicionario.priberam.org/palavra-passe
Summary by CodeRabbit
These updates aim to enhance the user experience by improving language support and accuracy in Portuguese text processing.