This is developed as one of Algoritma Academy Data Analytics Specialization using capstone Projects. The deliverables of this project is a python script to send an automated generated email using SMTP using Outlook email host. We will also utilize Google's fire
package for easy interfacing with bash command.
The maximum score you will obtain from this project is 16 points:
- Environment preparation (2pts)
- Understanding the contact list creation (2pts)
- Compose email: create templates, compose messages, extract summary (6pts)
- Export a plot using matplotlib (6pts)
There are few prerequisites needed for this project, first you will need to prepare a new conda environment installed with all package dependencies. Please follow through the following command to create a new conda environment and install the dependencies:
- Create new conda environment under python 3.7:
conda create -n <env_name> python=3.7
conda activate <env_name>
- Since it is beneficial for us to create a draft using Jupyter Notebook, let's also create an iPython Kernel using the following command:
pip install ipykernel
python -m ipykernel install --user --name=<env_name>
- Direct your Terminal / Anaconda Prompt to the directory of the downloaded repository:
cd <path_to_folder>
Example:
cd "C:\\Users\Desktop\Algoritma Academy\fire-capstone-master"
- Install the requirements:
pip install -r requirements.txt
Since the program will later need to access your personal email and password, to avoid the password being hardcoded to the program, you are required to create two environment variables:
EMAIL_ADDRESS
: stores the email address you'll use as the senderEMAIL_PASSWORD
: stores the email password
To save passwords and secret keys in environment variables on Windows, you will need to open Advance System Setting
:
- In your Windows taskbar, search for
Edit the system environment variables.
- Now in
Advance System Setting
click onEnvironment Variables
. - Add new user variables by clicking
New
underUser Variables
. - In the new window you can add Variable name and Variable value. Add two new variables;
EMAIL_ADDRESS
and store the address as the value andEMAIL_PASSWORD
for the password and click ok. - Now, click
Ok
on Environment Variables window to save changes.
To check if your variables have stored correctly, run echo %EMAIL_ADDRESS%
and echo %EMAIL_PASSWORD%
in your Command Prompt. They should return your email address and password respectively
- Open up your Terminal and run
nano ~/.bashrc
- Scroll to the bottom of the file and add:
export EMAIL_ADDRESS=<your_email>
export EMAIL_PASSWORD=<your_password>
Note: There should not be any whitespace on either side of =
sign. Example:
export [email protected]
export EMAIL_PASSWORD=algorima123
- Save the nano file by pressing
ctrl + x
andY
. Then, runsource ~/.bashrc
to effect the changes.
To check if your variables have stored correctly, run echo $EMAIL_ADDRESS
and echo $EMAIL_PASSWORD
in your Terminal. They should return your email address and password respectively
In this capstone project, you will try to complete 4 challenges:
- Extract data frame information to compose an email
- Export a plot using matplotlib to be attached to the email
- Testing application modularity using
fire
- Login to your Outlook (recommended) or Gmail email using Python's SMTP library
Finally, you will need to complete the final mission by sending us an auto generated email to a specified email.
In this section you will need to extract all the variables listed in the second point from the templates object: START_DATE
, END_DATE
, TOTAL_SPENT
, TOTAL_CONVERSION
, and CPC
. This process is completed under extract_summary
function. The data we'll be using is stored under data_input
folder named data.csv
. It was downloaded from Kaggle dataset repository provided by Madis_Lemsalu. The data contains daily ads report run on Facebook, showing different marketing campaign from 18th to 30th of August 2017. Please refer to the following glossary of the variables:
ad_id
: Unique identifier of the daily adsreporting_start
: The start date of the generated report. Since it is a daily report, this variable andreporting_end
will have the same valuereporting_end
: The end date of the generated report. Since it is a daily report, this variable andreporting_start
will have the same valuecampaign_id
: unique identifier of a campaign, one campaign could have several creative adsfb_campaign_id
: Facebook's identifier of each running adsage
: The age group on which the ad is promotedgender
: The gender on which the ad is promotedinterest1
,interest2
, andinterest3
: The interest group id on which the ad is promotedimpressions
: Number of people viewing the adclicks
: Number of people clicking the adspent
: Amount of marketing cost spenttotal_conversion
: Number of conversions (commonly a buying action) happenedapproved_conversion
: Number of approved conversions after cross checked with the actual business. In some cases, a conversion tracked by the ad doesn't really record with a complete buying action from the customers.
For this step, you might want to use Jupyter Notebook to perform a more complex data wrangling process. I have prepared helper.ipynb
to help you get the summary needed for the email.
Once you're done with the wrangling process, you will be able to complete both extract_summary
and create_plot
module on send_email.py
and move forward to the next challenge.
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object. We will use fire
to helps us run our send_email.py
on a modular basis to helps us debug the remaining skeleton.
This module is designed to support email blast to multiple recipients at a time. The contacts list is stored on external file under templates/contacts.txt
. Each line will represent 1 contact list with the following format:
person name email
The string stated in before email string will be used to address the person in the email. For example the following contact list:
Team Algoritma [email protected]
will be separated into two variables:
- Name: Team Algoritma
- Email: [email protected]
- The code that support this is stored under
extract_contacts
function defined in the script. Let's usefire
to create a simple debugging script. Take a look at the following script, you can copy and paste it and save it underdebug.py
:
from send_email import extract_contacts
import fire
if __name__ == '__main__':
# Export to Fire
fire.Fire(extract_contacts)
-
This chunk of code is simply exporting
extract_contacts
function accessible in our CLI. Then, you should be able to perform the following command in a CLI:python debug.py --contact_file=templates/contacts.txt
-
If you have set the function properly, it should extracted 2 elements list, one containing all names and one containing all emails. Try to edit the
contacts.txt
stored within templates folder, and change the list into [email protected] and your own email. See if it produce a similar output with the following example:
P.S. make sure that the recipents include:
- Team Algoritma: [email protected]
Now, let's try to check whether you have completed Challenge 1 properly by running the extract_summary
function in your CLI. Try to modify the debug.py
you created on the previous section to make sure it is returning a dictionary like expected:
Extra Tip:
- You only need to change which module you import and export it to
Fire
If the extract_summary
can run properly, you will now be able to compose the email. Try to use the debug_temp.py
script I have included in the repository and run it in your CLI with: python debug_temp.py --name=debugger
It should return NameError: name '___' is not defined
, since you have not complete the compose_email
function in send_email.py
. Try to complete the function, then it should be able to print out the template's variables filled in with the extracted information from the previous steps:
Extra Tips:
- Fill up
START_DATE
,END_DATE
,TOTAL_SPENT
,TOTAL_CONVERSION
oncompose_email
function - Change the
GITHUB_LINK
variable to your own repository!
The SMTP setup is managed within authenticate_account
function. The default email host used is an Outlook server (If you want to use Gmail as the sender address, please read the additional section below).
The smtplib
library will manage SSL authetication for your email address. For security purposes, you will need to set up an environment variables on your local machine called EMAIL_ADDRESS
and EMAIL_PASSWORD
(See Setup: Environment Variable section). This is done to avoid having to hard code your email and password on the script and risking it to be accidentaly shared accross the internet.
To verify your environment variables, run the following python code and see if the proper value is printed out:
import os
print(os.environ['EMAIL_ADDRESS'])
print(os.environ['EMAIL_PASSWORD'])
If each line successfuly printed out on the respective environment, you should be good to go.
- Modify the
send_email.py
: Please change thehost
andSERVER
onauthenticate_account
as below:
def authenticate_account(EMAIL, PASSWORD, SERVER='gmail'):
"""
Authenticate SMTP account for gmail
Other host is not supported
"""
if(SERVER == 'gmail'):
host = 'smtp.gmail.com'
port = 587
else:
raise("Email host is not supported")
s = smtplib.SMTP(host=host, port=port)
s.starttls()
s.login(EMAIL, PASSWORD)
return s
- Allow less secure apps access: By default, Gmail prevent third party credentials by set the less secure apps to OFF. You will need to turn it ON by accessing this link and make sure the setting is turned ON:
Note:
- This setting may not be available for accounts with 2-Step Verification enabled.
The fire
package has been set up to fire main
function when called. If the script has been set up properly, you should be able to call the function from CLI. The parameter passed into the function can be specify using the syntax: --param=value
. You can pass in multiple parameters within one line execution, the available parameters are:
subject
: Provide your email subjectcontact_file
: If you have been paying attention, you will need to change the default value of this parametertemplate_file
: Default set totemplates/body.txt
data_file
: Default set todata_input/data.csv
The bash command should be able to fire the script using the following syntax:
python send_email.py --subject="YOUR SUBJECT"
If all components has been properly set up, you should be able to sent us an auto generated report email to [email protected]
. Your final mission in this capstone project is to send us an email, providing your GitHub link on the email with the exported plot attached. Please send the email with the following details:
Subject:
[BATCH_NAME DA CAPSTONE] Facebook Email Report
For example:
[IRIS DA CAPSTONE] Facebook Email Report
Body Template:
Hi Team Algoritma,
This is an auto-generated email, reporting Facebook ads campaign performance for all listed campaign from ${START_DATE} to ${END_DATE}. The total marketing budget spent on the campaign is ${TOTAL_SPENT} with a total conversion of ${TOTAL_CONVERSION}. The cost per conversion gained for the campaigns is ${CPC}.
Please find the complete script on my Github: ${GITHUB_LINK} Best regards,
We are looking forward for your email!
Good luck and happy coding!