Describe the bug
Unable to add diacritics using CAMeL Tools.
To Reproduce
Run addDiacritics from Diacritics.py (it is called from main.py).
Diacritics.py
from camel_tools.tagger.default import DefaultTagger
from camel_tools.disambig.bert import BERTUnfactoredDisambiguator

def addDiacritics(message):
    # Load the pretrained MSA BERT disambiguator and wrap it in a tagger
    # that returns the diacritized form ('diac') of each word.
    bertd = BERTUnfactoredDisambiguator.pretrained('msa')
    tagger = DefaultTagger(bertd, 'diac')

    diacritized_paragraphs = []
    paragraphs = message.split("\n")
    for paragraph in paragraphs:
        # Tag one sentence at a time, splitting on ". ".
        sentences = paragraph.split(". ")
        diacritized_sentences = []
        for sentence in sentences:
            words = sentence.split()
            diacritized_words = tagger.tag(words)
            diacritized_sentence = ' '.join(diacritized_words)
            diacritized_sentences.append(diacritized_sentence)
        diacritized_paragraph = '. '.join(diacritized_sentences)
        diacritized_paragraphs.append(diacritized_paragraph)
    diacritized_message = '\n'.join(diacritized_paragraphs)

    with open('output.txt', 'w', encoding='utf-8') as f:
        f.write(diacritized_message)
    # print the number of characters in the output
    print(f"Number of characters in the output: {len(diacritized_message)}")
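To help narrow down whether the problem lies in the text handling or in loading the model, the paragraph/sentence/word splitting can be exercised without CAMeL Tools at all. The following is only a minimal sketch: `diacritize_text` and `fake_tagger` are hypothetical names introduced here, and the stand-in tagger merely upper-cases words so the join logic is visible. It also skips the tagger call for empty sentences, which is an added guard and not something the original code does.

```python
from typing import Callable, List


def diacritize_text(message: str,
                    tag_words: Callable[[List[str]], List[str]]) -> str:
    """Split into paragraphs, sentences and words, tag each sentence, rejoin."""
    out_paragraphs = []
    for paragraph in message.split("\n"):
        out_sentences = []
        for sentence in paragraph.split(". "):
            words = sentence.split()
            # Added guard: do not call the tagger on an empty word list
            # (blank lines or trailing separators produce these).
            tagged = tag_words(words) if words else []
            out_sentences.append(" ".join(tagged))
        out_paragraphs.append(". ".join(out_sentences))
    return "\n".join(out_paragraphs)


def fake_tagger(words: List[str]) -> List[str]:
    """Hypothetical stand-in for a real tagger: upper-cases each word."""
    return [w.upper() for w in words]


print(repr(diacritize_text("ab cd. ef\n\ngh", fake_tagger)))
```

If this part behaves as expected, passing `tagger.tag` from the code above in place of `fake_tagger` should isolate any remaining failure to the model loading step.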
#main.py
from NonDiacritics import removeDiacritics
from Diacritics import addDiacritics
sentence = """ بسم الله الرحمن الرحيم"""
#removeDiacritics(sentence)
addDiacritics(sentence)
Expected behavior
Output: the sentence should be written to output.txt with diacritics added.
Screenshots
'C:\Users\UserName\AppData\Roaming\camel_tools\data\disambig_bert_unfactored\msa\default_config.json'
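The screenshot text that survives here is only a file path, so the exact error is not known. Assuming the failure is related to that file, one quick check is whether it exists on disk. The sketch below uses only the standard library; `missing_files` and `EXPECTED` are hypothetical names, and the directory layout is inferred from the path above, not from the CAMeL Tools documentation.

```python
import os
from pathlib import Path
from typing import Iterable, List


def missing_files(data_dir: Path, relative_paths: Iterable[str]) -> List[Path]:
    """Return the expected files that are not present under data_dir."""
    return [data_dir / rel for rel in relative_paths
            if not (data_dir / rel).is_file()]


# Assumption: layout inferred from the path reported above.
EXPECTED = ["disambig_bert_unfactored/msa/default_config.json"]

if __name__ == "__main__":
    # Assumption: on Windows the data lives under %APPDATA%\camel_tools\data.
    data_dir = Path(os.environ.get("APPDATA", "")) / "camel_tools" / "data"
    for path in missing_files(data_dir, EXPECTED):
        print(f"missing: {path}")
```

If the file is reported as missing, the pretrained model data is probably not installed for the current CAMeL Tools version and may need to be downloaded again.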
Desktop
OS and version: Windows 11 (latest update)
Python version: 3.9.18
CAMeL Tools version and installation source: camel-tools 1.5.2 ("Successfully installed camel-tools-1.5.2")
Additional context
I last used camel-tools to add diacritics around August, and this code worked for my needs at that time. It no longer works.