From b1f8e4ede9b3ae4c7e34556d19a6d70dbec97344 Mon Sep 17 00:00:00 2001
From: iulusoy

Due to well documented biases in the detection of minorities with computer vision tools, and to the ethical implications of such detection, these parts of the tool are not directly made available to users. To access these capabilities, users must first agree with an ethical disclosure statement that reads:

“DeepFace and RetinaFace provide wrappers to trained models in face recognition and emotion detection. Age, gender and race/ethnicity models were trained on the backbone of VGG-Face with transfer learning.

ETHICAL DISCLOSURE STATEMENT: The Emotion Detector uses DeepFace and RetinaFace to probabilistically assess the gender, age and race of the detected faces. Such assessments may not reflect how the individuals identify. Additionally, the classification is carried out in simplistic categories and contains only the most basic classes (for example, “male” and “female” for gender, and seven non-overlapping categories for ethnicity). To access these probabilistic assessments, you must therefore agree with the following statement: “I understand the ethical and privacy implications such assessments have for the interpretation of the results and that this analysis may result in personal and possibly sensitive data, and I wish to proceed.””

This disclosure statement is included as a separate line of code early in the flow of the Emotion Detector. Once the user has agreed with the statement, further data analyses will also include these assessments.
diff --git a/build/html/create_API_key_link.html b/build/html/create_API_key_link.html
index e782159f..6bef995b 100644
--- a/build/html/create_API_key_link.html
+++ b/build/html/create_API_key_link.html
@@ -6,14 +6,14 @@
-
diff --git a/build/html/faq_link.html b/build/html/faq_link.html
index ed37d838..267809c0 100644
--- a/build/html/faq_link.html
+++ b/build/html/faq_link.html
@@ -6,14 +6,14 @@
-
@@ -200,9 +200,9 @@ What happens if I don’t have internet access - can I still use ammico?
Why don’t I get probabilistic assessments of age, gender and race when running the Emotion Detector?
diff --git a/build/html/index.html b/build/html/index.html
index f7c8eba4..047b0981 100644
--- a/build/html/index.html
+++ b/build/html/index.html
@@ -6,14 +6,14 @@
-
diff --git a/build/html/license_link.html b/build/html/license_link.html
index b0a73e2b..8842259c 100644
--- a/build/html/license_link.html
+++ b/build/html/license_link.html
@@ -6,14 +6,14 @@
-
diff --git a/build/html/modules.html b/build/html/modules.html
index fbb1c141..1bb3125e 100644
--- a/build/html/modules.html
+++ b/build/html/modules.html
@@ -6,14 +6,14 @@
-
diff --git a/build/html/notebooks/DemoNotebook_ammico.html b/build/html/notebooks/DemoNotebook_ammico.html
index d0782cd4..8fe40b6a 100644
--- a/build/html/notebooks/DemoNotebook_ammico.html
+++ b/build/html/notebooks/DemoNotebook_ammico.html
@@ -6,7 +6,7 @@
-
@@ -523,7 +523,7 @@ Read in a csv file containing text and translating/analysing the text
# The detector modules

The different detector modules with their options are explained in more detail in this section.

## Text detector

Text on the images can be extracted using the `TextDetector` class (`text` module). The text is initially extracted using the Google Cloud Vision API and then translated into English with googletrans. The translated text is cleaned of whitespace, linebreaks, and numbers using Python syntax and spaCy.

The user can set if the text should be further summarized, and analyzed for sentiment and named entity recognition, by setting the keyword `analyse_text` to `True` (the default is `False`). If set, the transformers pipeline is used for each of these tasks, with the default models as of 03/2023. Other models can be selected by setting the optional keyword `model_names` to a list of selected models, one for each task: `model_names=["sshleifer/distilbart-cnn-12-6", "distilbert-base-uncased-finetuned-sst-2-english", "dbmdz/bert-large-cased-finetuned-conll03-english"]` for summary, sentiment, and NER. To be even more specific, revision numbers can also be selected by specifying the optional keyword `revision_numbers` as a list of revision numbers for each model, for example `revision_numbers=["a4f8f3e", "af0f99b", "f2482bf"]`.

Please note that for the Google Cloud Vision API (the `TextDetector` class) you need to set a key in order to process the images. This key is ideally set as an environment variable, using for example `GOOGLE_APPLICATION_CREDENTIALS` (see the export command at the end of this section).
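As an illustration, these keywords might be combined as in the sketch below; the per-image dictionary input and the `analyse_image` method name are assumptions based on the surrounding description, so treat this as a sketch rather than the exact interface. The model names and revision numbers are the examples quoted above.

```python
# Hedged sketch -- the input dict layout and method name are assumptions.
import ammico

mydict = {"img1": {"filename": "example.jpg"}}  # assumed per-image entries

for key in mydict:
    mydict[key] = ammico.text.TextDetector(
        mydict[key],
        analyse_text=True,  # also run summary, sentiment and NER (default: False)
        # optional: override the default models, one per task
        model_names=[
            "sshleifer/distilbart-cnn-12-6",                    # summary
            "distilbert-base-uncased-finetuned-sst-2-english",  # sentiment
            "dbmdz/bert-large-cased-finetuned-conll03-english", # NER
        ],
        # optional: pin a revision for each model
        revision_numbers=["a4f8f3e", "af0f99b", "f2482bf"],
    ).analyse_image()
```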
## Image summary and query

The `SummaryDetector` can be used to generate image captions (`summary`) as well as visual question answering (`VQA`). This module is based on the LAVIS library. Since the models can be quite large, an initial object is created which will load the necessary models into RAM/VRAM and then use them in the analysis.

The user can specify the type of analysis to be performed using the `analysis_type` keyword. Setting it to `summary` will generate a caption (summary), `questions` will prepare answers (VQA) to a list of questions as set by the user, and `summary_and_questions` will do both. Note that the desired analysis type needs to be set here in the initialization of the detector object, and not when running the analysis for each image; the same holds true for the selected model.

### BLIP2 models

The implemented models are listed below.
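Whichever model is chosen, a call could look roughly like the following sketch; the module path, the keyword for passing the questions, and the per-image call pattern are assumptions for illustration, not the verified ammico API.

```python
# Hedged sketch -- module path, question keyword and call pattern are assumptions.
import ammico

mydict = {"img1": {"filename": "example.jpg"}}  # assumed per-image entries

# The analysis type (and the model) must be chosen when the detector object is
# created, since the models are loaded into RAM/VRAM at this point.
summary_detector = ammico.summary.SummaryDetector(
    mydict,
    analysis_type="summary_and_questions",  # "summary", "questions", or both
    list_of_questions=["How many people are in the picture?"],  # assumed keyword
)

for key in mydict:
    mydict[key] = summary_detector.analyse_image(mydict[key])  # assumed signature
```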
## Detection of faces and facial expression analysis

Faces and facial expressions are detected and analyzed using the `EmotionDetector` class from the `faces` module. Initially, it is detected if faces are present on the image using RetinaFace, followed by an analysis of whether face masks are worn (Face-Mask-Detection). The probabilistic detection of age, gender, race, and emotions is carried out with deepface, but only if the disclosure statement has been accepted (see above).

Depending on the features found on the image, the face detection module returns a different analysis content: If no faces are found on the image, all further steps are skipped and the result `"face": "No", "multiple_faces": "No", "no_faces": 0, "wears_mask": ["No"], "age": [None], "gender": [None], "race": [None], "emotion": [None], "emotion (category)": [None]` is returned. If one or several faces are found, up to three faces are analyzed, and for each it is checked whether it is partially concealed by a face mask. If yes, only age and gender are detected; if no, also race, emotion, and dominant emotion are detected. In the latter case, the output could look like this: `"face": "Yes", "multiple_faces": "Yes", "no_faces": 2, "wears_mask": ["No", "No"], "age": [27, 28], "gender": ["Man", "Man"], "race": ["asian", None], "emotion": ["angry", "neutral"], "emotion (category)": ["Negative", "Neutral"]`, where for the two faces that are detected (given by `no_faces`), some of the values are returned as a list with the first item for the first (largest) face and the second item for the second (smaller) face (for example, `"emotion"` returns a list `["angry", "neutral"]` signifying the first face expressing anger, and the second face having a neutral expression).
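To make the list-per-face convention concrete, the short snippet below pairs up the per-face fields of the example output quoted above; it uses only standard Python, and the `result` dict is copied from that example.

```python
# Pairing up the per-face lists from the documented example output.
result = {
    "face": "Yes", "multiple_faces": "Yes", "no_faces": 2,
    "wears_mask": ["No", "No"],
    "age": [27, 28],
    "gender": ["Man", "Man"],
    "race": ["asian", None],
    "emotion": ["angry", "neutral"],
    "emotion (category)": ["Negative", "Neutral"],
}

# Faces are ordered by size: index 0 is the largest detected face.
for i in range(result["no_faces"]):
    print(
        f"face {i + 1}: age {result['age'][i]}, gender {result['gender'][i]}, "
        f"emotion {result['emotion'][i]} ({result['emotion (category)'][i]})"
    )
```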
diff --git a/build/html/py-modindex.html b/build/html/py-modindex.html
index 6445c5d1..772f90c8 100644
--- a/build/html/py-modindex.html
+++ b/build/html/py-modindex.html
@@ -5,14 +5,14 @@
-
@@ -147,25 +147,25 @@ Installation
pip install ammico
This will install the package and its dependencies locally. If after installation you get some errors when running some modules, please follow the instructions in the FAQ.

Usage

The main demonstration notebook can be found in the `notebooks` folder and also on google colab.

There are further sample notebooks in the `notebooks` folder for the more experimental features:

- Topic analysis: Use the notebook `get-text-from-image.ipynb` to analyse the topics of the extracted text. You can run this notebook on google colab. Place the data files and google cloud vision API key in your google drive to access the data.
- To crop social media posts use the `cropposts.ipynb` notebook. You can run this notebook on google colab.
The text is extracted from the images using google-cloud-vision. For this, you need an API key. Set up your google account following the instructions on the google Vision AI website or as described here. You then need to export the location of the API key as an environment variable:
export GOOGLE_APPLICATION_CREDENTIALS="location of your .json"
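If you prefer to set the key from within Python (for example at the top of a notebook) rather than in the shell, the same variable can be exported with the standard library; the path below is a placeholder.

```python
# Equivalent to the shell export above; replace the placeholder path with the
# location of your own .json key file.
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/your/key.json"
```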