Add laser clustering example notebook
Paulooh007 committed Dec 1, 2023
1 parent 995c2f7 commit a934ac7
Showing 2 changed files with 6,120 additions and 0 deletions.
6,107 changes: 6,107 additions & 0 deletions tasks/clustering/LaserClusteringExample.ipynb

Large diffs are not rendered by default.

13 changes: 13 additions & 0 deletions tasks/clustering/README.md
@@ -0,0 +1,13 @@
# Laser Encoder: Clustering

## Overview

In this tutorial, we'll use Language-Agnostic SEntence Representations ([LASER](https://github.com/facebookresearch/LASER)) to generate multilingual sentence embeddings, then cluster those embeddings on the [MASSIVE](https://github.com/alexa/massive) dataset. Our goal is to show that LASER embeddings can group texts by their thematic content regardless of the language they are written in. LASER encodes sentences from multiple languages into a shared embedding space, which allows cross-lingual understanding and comparison, and we'll see how this capability is useful for tasks like clustering multilingual text.
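
To make the idea concrete, here is a minimal sketch of clustering vectors that live in a shared embedding space. It is not the notebook's code: the three-dimensional vectors are hand-made stand-ins for LASER embeddings, and `normalize` and `kmeans` are hypothetical helpers written for this illustration (a plain k-means using cosine similarity). In practice the vectors would come from a LASER encoder, and a library clustering routine would likely replace the hand-rolled loop.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def kmeans(vectors, k, iters=20, seed=0):
    """Toy k-means on unit vectors; returns one cluster label per input vector."""
    rng = random.Random(seed)
    points = [normalize(v) for v in vectors]
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to the centroid with the highest cosine similarity.
        labels = [
            max(range(k), key=lambda c: sum(a * b for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the normalized mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


# Assumption: made-up 3-d vectors standing in for real sentence embeddings.
# Each (language, topic) pair imitates the same intent expressed in two languages.
toy_embeddings = {
    ("en", "alarm"): [0.9, 0.1, 0.0],
    ("de", "alarm"): [0.8, 0.2, 0.1],
    ("en", "weather"): [0.1, 0.9, 0.1],
    ("de", "weather"): [0.0, 0.8, 0.2],
}

labels = kmeans(list(toy_embeddings.values()), k=2)
for (lang, topic), label in zip(toy_embeddings, labels):
    print(f"{lang:>2} {topic:<8} -> cluster {label}")
```

If the embedding space really is language-agnostic, sentences about the same topic should share a cluster whatever their language, which is the behavior the toy vectors are constructed to imitate.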

## Getting Started

To run the notebook in Google Colab, simply click the "Open in Colab" button below:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Paulooh007/LASER-fork/blob/laser-clustering/tasks/clustering/LaserClusteringExample.ipynb)

