
Consider using DictVectorizer and semi-supervised learning to see if any generalizations arise from using a neural network. Review contrastive loss and ideas here #44

Open
Shuyib opened this issue Feb 16, 2023 · 1 comment

Shuyib (Owner) commented Feb 16, 2023

The DictVectorizer will not work so well here: our sequences have variable lengths. Embeddings, by contrast, have a padding argument that makes the sequences the same length.
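As a minimal sketch of the padding step described above (the function name, pad value, and example sequences are illustrative, not from this repo):

```python
import numpy as np

def pad_sequences(seqs, pad_value=0):
    """Right-pad variable-length integer sequences to a common length
    so they can be stacked into one array and fed to an embedding layer."""
    max_len = max(len(s) for s in seqs)
    return np.array([list(s) + [pad_value] * (max_len - len(s)) for s in seqs])

# Three sequences of lengths 3, 2, and 4 become one (3, 4) array.
padded = pad_sequences([[3, 1, 4], [1, 5], [9, 2, 6, 5]])
```

Libraries such as Keras provide an equivalent `pad_sequences` utility; the reserved `pad_value` (conventionally 0) can then be masked out downstream.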

This workflow lets us build a representation of the dictionary-structured data, that is, an embedding, which we can then use with semi-supervised or unsupervised methods, applying loss functions such as contrastive loss to examine similarities and differences.
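A sketch of the contrastive loss mentioned above, in the classic pairwise form (Hadsell et al., 2006); the function name, margin, and embeddings are illustrative assumptions, not code from this repo:

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Pairwise contrastive loss: pull similar pairs together,
    push dissimilar pairs apart until they are at least `margin` away.

    emb_a, emb_b: (n_pairs, dim) embedding arrays
    similar:      (n_pairs,) labels, 1.0 = similar pair, 0.0 = dissimilar
    """
    d = np.linalg.norm(emb_a - emb_b, axis=1)  # Euclidean distance per pair
    loss = similar * d**2 + (1.0 - similar) * np.maximum(0.0, margin - d)**2
    return loss.mean()

# A similar pair at distance 0 and a dissimilar pair beyond the margin
# both contribute zero loss.
emb_a = np.array([[0.0, 0.0], [1.0, 0.0]])
emb_b = np.array([[0.0, 0.0], [3.0, 0.0]])
zero_loss = contrastive_loss(emb_a, emb_b, similar=np.array([1.0, 0.0]))
```

In a real training loop this would be written against the autodiff framework in use (e.g. as a Keras or PyTorch loss) so gradients flow back into the embedding.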

Shuyib converted this from a draft issue Feb 16, 2023
Shuyib (Owner, Author) commented Feb 16, 2023

Seems this was repeated. I'll try to make a naive version so that we can build upon it.
