install steps not clear #6

Open
Phizicks opened this issue Apr 25, 2024 · 16 comments

Comments

@Phizicks

ollama run fotiecodes/jarvis runs fine and I can chat with Jarvis, but calling ollama run jarvis:latest doesn't work.

$ ollama run jarvis:latest
pulling manifest 
Error: pull model manifest: file does not exist

When running this project's code, it fails to find jarvis:latest:

...
You said: what time is it
Time requested
11:13 AM
Saying:  {'error': 'Error 404: {"error":"model \'jarvis:latest\' not found, try pulling it first"}'}

which is strange because this worked fine

$ ollama list
NAME                    	ID          	SIZE  	MODIFIED       
fotiecodes/jarvis:latest	e784ce3e1255	3.8 GB	57 minutes ago	

So it's clearly there, it's just not being found.
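
The output above suggests the lookup is by exact name: the model is installed as fotiecodes/jarvis:latest, while the code asks for jarvis:latest. A minimal Python sketch of that mismatch check follows. Note that suggest_installed_name is a hypothetical helper, not part of this project or of Ollama, and the installed list is assumed to be copied by hand from the NAME column of ollama list.

```python
def suggest_installed_name(requested, installed):
    """Return installed model names that could satisfy `requested`.

    Hypothetical helper: an exact match wins; otherwise return names whose
    last path segment (the part after the namespace) equals `requested`.
    """
    if requested in installed:
        return [requested]
    return [name for name in installed if name.split("/")[-1] == requested]


# Names as they appear in the NAME column of `ollama list` in this issue.
installed_models = ["fotiecodes/jarvis:latest"]
print(suggest_installed_name("jarvis:latest", installed_models))
```

If the helper returns a namespaced name, passing that full name to the code instead of the short one is the likely fix.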

@FotieMConstant
Member

Hey there @Phizicks, thank you for your feedback on this. I can see where the problem is: I am currently working with a locally created model whose name is hardcoded. Moving that name into an env file would be ideal. Thanks for pointing this out, by the way.

I should have a fix for this by this evening. If I don't, could you please ping me again on here? I might get caught up and forget to push the fix.
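
As a rough illustration of the env-file idea, the model name could be read from an environment variable with a fallback default. This is only a sketch under assumptions: JARVIS_MODEL and resolve_model_name are hypothetical names, not the ones used in the actual fix, and the default is simply the name shown by ollama list earlier in this issue.

```python
import os

# Assumed default: the name under which the model appears in `ollama list`.
DEFAULT_MODEL = "fotiecodes/jarvis:latest"


def resolve_model_name(env=None):
    """Return the model name from JARVIS_MODEL (hypothetical variable),
    falling back to DEFAULT_MODEL when it is unset or blank."""
    env = os.environ if env is None else env
    name = env.get("JARVIS_MODEL", "").strip()
    return name or DEFAULT_MODEL


if __name__ == "__main__":
    print("Using model:", resolve_model_name())
```

Passing a plain dict as env keeps the function easy to test without touching the real environment.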

@Phizicks
Author

I'm actually trying to find out how I can use Jarvis's voice directly. I have my own assistant, which uses lists of regex matches so you can ask something in multiple ways and it still recognises the intent. I also have a working TTS for Linux, but it's not using the Jarvis voice yet, so any tips on the main parts needed for it to work would be great.

@FotieMConstant
Member

FotieMConstant commented Apr 25, 2024

You can use the system's default TTS. At the moment I use ElevenLabs' TTS with a clone of the Jarvis voice I made, but if you want it working with the local TTS you can always update the text_to_speech.py file in the modules folder. Of course, this won't be the case in the future, as the goal is to have a homogeneous TTS engine that runs fully locally and sounds and feels like Jarvis from the MCU.

In the file, you can comment out the second try: block along with its exception handler and uncomment the first part.
Please also note that this hasn't been tested on any system other than macOS 14.x.
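
An alternative to commenting blocks in and out by hand is to try the cloud voice first and fall back to a local engine when it fails. The sketch below is not the project's text_to_speech.py: speak_with_fallback is a hypothetical helper, and both engines are passed in as plain callables so that no particular ElevenLabs or local TTS API is assumed.

```python
def speak_with_fallback(text, primary, fallback):
    """Speak `text` with `primary`; if it raises, report the error and use `fallback`.

    Both arguments are caller-supplied callables taking the text to speak.
    Returns which engine handled the request, for logging.
    """
    try:
        primary(text)
        return "primary"
    except Exception as exc:  # e.g. an API error such as a missing voice
        print(f"Error during text-to-speech: {exc}")
        fallback(text)
        return "fallback"


if __name__ == "__main__":
    def cloud_tts(text):
        # Stand-in for a cloud call that fails.
        raise RuntimeError("voice_not_found")

    def local_tts(text):
        # Stand-in for a local engine.
        print("[local voice]", text)

    speak_with_fallback("How may I assist you?", cloud_tts, local_tts)
```

Catching a broad Exception is a deliberate simplification here; a real module would likely catch only the errors its TTS client is known to raise.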

@FotieMConstant
Member

@Phizicks I am done with the tweak I mentioned earlier, pushing the fix shortly:)

@FotieMConstant
Member

Hey @Phizicks, could you please pull and try again? This should work fine now.

Please let me know:)

@Phizicks
Author

I'm still having to modify this code to get it working on my Ubuntu Linux. The OpenCV module was missing, so I added 'opencv-python==4.9.0.80' to requirements.txt. I also think my elevenlabs module, being the latest version, has changes that aren't reflected on the ElevenLabs website, which is a bit weird.

Now I seem to run into an issue with the call to generate(), which fails with this error:

elevenlabs.core.api_error.ApiError: status_code: 400, body: {'detail': {'status': 'voice_not_found', 'message': 'A voice for the voice_id xFjhlCVIoEAjDeZpAmFe was not found.'}}

This is triggered on play(audio) in text_to_speech.py.
I think this is because of your custom Jarvis voice xFjhlCVIoEAjDeZpAmFe, so I probably have to change it since I don't have that voice.

@FotieMConstant
Member

FotieMConstant commented Apr 25, 2024

Yeah, I just checked and unfortunately I need to connect my Stripe account to be able to share the clone I made in the ElevenLabs Voice Library. For that I need a business address etc. in a Stripe-supported country, which I do not have at the moment. Apologies.

You can always use other voices:) I will start working in depth on the TTS engine once I am done with the significant features.

@FotieMConstant
Member

@Phizicks could you manage to get the code running nonetheless?

@brainiakk

Where did you get the Jarvis audio you cloned yours from, @FotieMConstant? OpenVoice V2 is out. I already got something that sounds like Jarvis, but I think better reference audio would produce better results.

@FotieMConstant
Member

FotieMConstant commented Apr 25, 2024

That's amazing @brainiakk, perhaps you could help by contributing to the TTS module; the objective is to have it work 100% offline. I am currently caught up with model training, dataset collection, etc., which leaves less time to focus on other things. I'm essentially working on making the model more Jarvis-like and reducing hallucinations.

Regarding the voice dataset: after doing some in-depth scraping of the web, I collated the voice dataset and added it to Kaggle.

here: https://www.kaggle.com/datasets/fotiemconstant/jarvis-dataset

@Phizicks
Author

The code runs, but the TTS is failing.

This code cannot run offline until the voice recognition is changed, as it uses Google:
https://github.com/clevaway/J.A.R.V.I.S/blob/main/modules/speech_to_text.py#L20
I tried most if not all OSS voice recognition engines and they don't transcribe well enough to be 100% offline.

anyway, I give up on this for now.

@FotieMConstant
Member

@Phizicks Does it work fine with internet connectivity?

Also, if you noticed, I am currently doing my best to integrate Whisper so that all speech-to-text works offline, but it might take some time as I have a lot on my plate at the moment. Thanks for the feedback, by the way!

@FotieMConstant
Member

@Phizicks regarding the TTS, could you provide me with the error log you are getting?

@brainiakk

Thanks @FotieMConstant

@Phizicks
Author

Listening...
You said: what time is it
Time requested
06:09 PM
Saying:   It is 06:09 PM in my time zone. How may I assist you?
Error during text-to-speech: status_code: 400, body: {'detail': {'status': 'voice_not_found', 'message': 'A voice for the voice_id xFjhlCVIoEAjDeZpAmFe was not found.'}}

I also had to add an exception handler for the timeouts while waiting for instructions:

--- a/modules/speech_to_text.py
+++ b/modules/speech_to_text.py
@@ -20,7 +20,9 @@ class SpeechToText:
                 text = self.recognizer.recognize_google(audio)
                 print("You said:", text.lower())
                 return text.lower()
-
+            except sr.WaitTimeoutError:
+                print("listening timeout")
+                return ''
             except sr.UnknownValueError:
                 print("Sorry, could not understand audio.")
                 return "Hey there"

@FotieMConstant
Member

From the above I can see you are having issues with the ElevenLabs voice ID. Could you replace "xFjhlCVIoEAjDeZpAmFe" with a voice from the ElevenLabs voice library: https://elevenlabs.io/app/voice-library

You might need to create an account.
