Quick and dirty tutorials and templates
Here are some examples (go into directories for more):
Example (there are more): sentiment_analysis_emotion.ipynb
Example (there are more): classifier.ipynb
# Plotting and input/output in `plotting_and_io/`
brew install rename
rename 's/cleaned/16khz/' *.wav
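If the `rename` utility is not available, the same substitution can be sketched in Python. This is a minimal sketch, not part of the repository: the function name `rename_files` and its parameters are hypothetical, and it replaces only the first regex match in each file name, as `s/cleaned/16khz/` does without the `g` flag.

```python
import re
import tempfile
from pathlib import Path


def rename_files(directory, pattern, replacement, glob="*.wav"):
    """Rename files matching `glob`, replacing the first regex match in each name.

    Returns a list of (old_name, new_name) pairs for the files that changed.
    """
    renamed = []
    for path in sorted(Path(directory).glob(glob)):
        new_name = re.sub(pattern, replacement, path.name, count=1)
        if new_name == path.name:
            continue  # the pattern did not match this file name
        target = path.with_name(new_name)
        if target.exists():
            # Refuse to overwrite an existing file.
            raise FileExistsError(f"{target} already exists")
        path.rename(target)
        renamed.append((path.name, new_name))
    return renamed


if __name__ == "__main__":
    # Demonstration in a throwaway directory, so no real files are touched.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "sample_cleaned.wav").touch()
        print(rename_files(tmp, r"cleaned", "16khz"))
```

To apply it to real recordings, call `rename_files(".", r"cleaned", "16khz")` from the directory that holds the `.wav` files.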
# Emails from Python in `emails_from_python/`
- First, create a script called `email_config.py` with these variables, filled in with your own information:
port = 465  # For SSL, e.g., 465
smtp_server = "outgoing.mit.edu"  # e.g., "outgoing.mit.edu", including the "outgoing" part
from_email_actual = "username"  # Account the email is actually sent from: just the username of that address (e.g., the username in "[email protected]")
from_email_appears = "[email protected]"  # Address the email will appear to be sent from, e.g., "[email protected]". It can be the same as from_email_actual. Sending from another address requires authorization, which you can configure in your email settings.
also_send_to = ['[email protected]', '[email protected]']  # Addresses that receive a copy, or an empty list. Note that some .edu accounts cannot send to themselves. The recipient will not see these addresses.
cc = None  # Visible to the recipient. Note that some .edu accounts cannot send to themselves.
testing = True  # If True, send the emails to the address in testing_to_email to check that everything runs and the HTML formatting looks right.
testing_to_email = '[email protected]'
testing_append_subject_line = '[Test] '
- `email_content.py`: Here you define the subject line and body for each type of email. Use HTML (e.g., `<br>` for line breaks).
- Define a CSV file with emails and `email_type`. This lets `email_send.py` extract the matching body and subject line from `email_content.py`. You pass this CSV file as an argument (see below). My example includes the columns `to_email`, `name`, `email_type`, and `prizes`, but you can include whatever you need and change `email_content.py` accordingly.
- Run the script. This will prompt you for the password of `from_email_actual`:
python3 email_send.py --path_to_dataframe=path/to/dataframe.csv
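The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `email_send.py`: the `EMAIL_CONTENT` dictionary, the functions `build_message` and `send_messages`, and the `example.com` addresses are hypothetical, and it assumes the test marker is placed in front of the subject line. Only the config variable names and CSV columns come from the description above.

```python
import csv
import io
import smtplib
import ssl
from email.message import EmailMessage

# Hypothetical stand-in for email_content.py: one subject and HTML body per email_type.
EMAIL_CONTENT = {
    "winner": {
        "subject": "Your prize",
        "body": "Hi {name},<br>You won: {prizes}.",
    },
}


def build_message(row, from_email_appears, cc=None, testing=False,
                  testing_to_email=None, testing_append_subject_line="[Test] "):
    """Build one HTML email from a CSV row (a dict with to_email, name, email_type, prizes)."""
    content = EMAIL_CONTENT[row["email_type"]]
    subject = content["subject"]
    to_email = row["to_email"]
    if testing:
        # Assumption: in testing mode the marker goes first and the real recipient is replaced.
        subject = testing_append_subject_line + subject
        to_email = testing_to_email
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = from_email_appears
    msg["To"] = to_email
    if cc and not testing:
        msg["Cc"] = cc
    # Fill the placeholders from the CSV columns and mark the body as HTML.
    msg.set_content(content["body"].format(**row), subtype="html")
    return msg


def send_messages(messages, smtp_server, port, username, password, also_send_to=()):
    """Send messages over SSL; also_send_to addresses get a copy the recipient cannot see."""
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
        server.login(username, password)
        for msg in messages:
            recipients = [msg["To"], *also_send_to]
            if msg["Cc"]:
                recipients.append(msg["Cc"])
            server.send_message(msg, to_addrs=recipients)


if __name__ == "__main__":
    # An in-memory CSV stands in for path/to/dataframe.csv.
    csv_text = "to_email,name,email_type,prizes\nada@example.com,Ada,winner,a mug\n"
    for row in csv.DictReader(io.StringIO(csv_text)):
        message = build_message(row, "sender@example.com", testing=True,
                                testing_to_email="tester@example.com")
        print(message["To"], "|", message["Subject"])
```

In a real run, the password could be read with `getpass.getpass()` and passed to `send_messages` together with `smtp_server`, `port`, and `from_email_actual` from `email_config.py`. Because the `also_send_to` addresses are added only to the SMTP envelope and not to a header, the recipient does not see them.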
`audio_annotation.ipynb` is a Colab approach.
`audio_annotation.py` takes data from the `./data/input/vfp_audios_16khz/` directory (just the first third of the speech task) and outputs a DataFrame in `./data/outputs/annotations/`.
Obtain `vfp_audios_16khz/`: it cannot be shared, so ask for it.
After verifying the configuration (paths, instructions, how many seconds to play) in the first few lines, run:
Example:
pip3 install playsound PyObjC pandas pyaudio
python3 annotation.py --input_dir=data/input/vfp_audios_16khz/ --output_dir=data/outputs/annotations/
pip3 install pydub
python3 convert_mp3.py --input_dir=data/input/ --output_dir= --output_format=wav --output_bitrate=32k
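For reference, a conversion like this can be sketched with pydub as shown below. This is an assumption about what such a script might do, not the contents of `convert_mp3.py`: the function names `output_path_for` and `convert_directory` are hypothetical. pydub also needs ffmpeg installed to decode MP3 files.

```python
from pathlib import Path


def output_path_for(input_path, output_dir, output_format):
    """Map an input file to its converted path, e.g. a.mp3 -> <output_dir>/a.wav."""
    return Path(output_dir) / (Path(input_path).stem + "." + output_format)


def convert_directory(input_dir, output_dir, output_format="wav", output_bitrate="32k"):
    """Convert every .mp3 file in input_dir and return the list of written paths."""
    # Imported here so the path helper above works even without pydub installed.
    from pydub import AudioSegment

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3_path in sorted(Path(input_dir).glob("*.mp3")):
        target = output_path_for(mp3_path, output_dir, output_format)
        audio = AudioSegment.from_mp3(str(mp3_path))
        audio.export(str(target), format=output_format, bitrate=output_bitrate)
        converted.append(target)
    return converted


if __name__ == "__main__":
    # Only shows the path mapping; no audio is read or written here.
    print(output_path_for("data/input/example.mp3", "data/output", "wav"))
```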
Most speech energy lies below 8 kHz. A sampling rate of 16 kHz captures frequencies up to 8 kHz (the Nyquist frequency is half the sampling rate), so downsampling to 16 kHz is enough to retain most speech-related frequencies and information. Many algorithms require samples at 16 kHz (for faster processing or normalization across samples), while many recordings are made at 22.05 or 44.1 kHz.
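The arithmetic behind this can be checked with a short sketch. The function names are hypothetical, and the 8 kHz limit is the rule of thumb from the paragraph above rather than a hard boundary.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sampling rate can represent: half the rate."""
    return sample_rate_hz / 2


def covers_band(sample_rate_hz, max_frequency_hz):
    """True if the sampling rate can represent content up to max_frequency_hz."""
    return nyquist_frequency(sample_rate_hz) >= max_frequency_hz


if __name__ == "__main__":
    speech_limit_hz = 8_000  # rule-of-thumb upper limit for most speech content
    for rate in (8_000, 16_000, 22_050, 44_100):
        print(rate, "Hz ->", nyquist_frequency(rate), "Hz,",
              "covers speech band:", covers_band(rate, speech_limit_hz))
```

A 16 kHz rate is the lowest of these common rates that still covers the 8 kHz band, which is why it is a frequent target for downsampling.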
sh downsample_16khz.sh
`speech_activity_detection_pyannote.ipynb` detects and plots speech and silences using the pyannote package. You can use the recordings in `data/input/audio_samples` to test it.