AI-infused note-taking application written in Ruby on Rails. This hacks the Trix editor to add AI features such as generating text, text to image, and transcribing audio. It makes heavy use of asynchronous processing, where changes (AI-generated slop) are pushed to the client over WebSockets.
This side project was started in order to learn the latest features of RoR, Hotwire and AI.
- Rich text editor
- Text to image (stable diffusion)
- Text to text (llm)
- Conversational LLM interface
- Speech to text (transcribe audio files)
- Transcription summaries
- Upscale images
- Image to video
- Video playback
- PDF rendering
- Export
- Sharing
- Streaming LLM responses
For development, use docker compose. This will use Dockerfile.dev.
This orchestrates the following services:
- database: Postgres v16
- redis: Redis v7
- ws: Anycable WebSocket server
- anycable: Anycable gRPC server
- web: Ruby on Rails application
- sidekiq: Sidekiq background job process
- chrome: Browserless chrome for running feature/system tests.
docker compose build
You can run all services with
docker compose up
Or just what you need (ie, without sidekiq, chrome, etc)
docker compose up web
- docker compose exec -it web bash
- docker compose exec -it database psql -U postgres
Secrets are supplied to the application using Custom Credentials and ENV vars (eg, .env). Anycable needs RAILS_MASTER_KEY, so the anycable service also takes a .anycable.env env_file with only that variable set. In development, setting the RAILS_MASTER_KEY in the .env will break the test env unless they're the same key. See also #secrets.
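The likely reason for the breakage is that Rails prefers the RAILS_MASTER_KEY environment variable over the per-environment key file, so a development key in .env would be used to decrypt the test credentials. A minimal sketch of a guard for this, assuming the key file lives at config/credentials/test.key; the helper name master_key_conflict? is hypothetical and not part of this application:

```ruby
# Hypothetical helper: returns true when a master key supplied through the
# environment differs from the key stored in the given key file.
def master_key_conflict?(env_key, key_file_path)
  # No key in the environment means the key file is used as-is.
  return false if env_key.nil? || env_key.strip.empty?
  # No key file means there is nothing to compare against.
  return false unless File.exist?(key_file_path)

  File.read(key_file_path).strip != env_key.strip
end
```

It could be called from a spec helper, for example as master_key_conflict?(ENV["RAILS_MASTER_KEY"], "config/credentials/test.key"), to fail fast with a readable message.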
This depends on AWS services. A user will need to be created in the AWS portal with the following permissions:
- AmazonS3FullAccess
- AmazonTranscribeFullAccess
- AmazonBedrockFullAccess
The user's access key and secret need to be added to the application's credentials.
See also [ActiveStorage Configuration](#activestorage-configuration)
This uses AWS Transcription Jobs for speech to text.
LLM used for summaries. See also https://aws.amazon.com/bedrock/titan/
LLM used for generative text features. See also https://www.anthropic.com/claude
Text to image. See also https://stability.ai/
At the time of writing this, Devise and turbo streams have some compatibility issues, which can be resolved (or rather, swept under the rug) by disabling turbo in the forms using an HTML data attribute: data: { turbo: false }.
There is an alternative that makes turbo work with the devise forms, but it involves some customization of devise that requires a more advanced understanding of devise configuration, and is probably not worth it.
Authentication with session cookies is the Devise default and is used for same origin web requests.
Endpoints that respond to json format are authenticated with JWT tokens (unless the request is same origin, in which case Devise will authenticate using the session cookie).
See also devise-jwt docs:
If the authentication succeeds, a JWT token is dispatched to the client in the Authorization response header, with format Bearer #{token} (tokens are also dispatched on a successful sign up).
Tokens are revoked using the jti (JWT ID) claim. See https://github.com/waiting-for-dev/devise-jwt#revocation-strategies
Sending a DELETE to users/sign_out.json will revoke the token via the jti.
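To illustrate how revocation through the jti can work, here is a standard-library sketch. It assumes a strategy along the lines of devise-jwt's JTIMatcher, where the user record stores the current jti and a token is accepted only while its jti claim matches. The User struct, SECRET, and the helper names are hypothetical, and this is not the gem's implementation:

```ruby
require "base64"
require "json"
require "openssl"
require "securerandom"

# Hypothetical in-memory user; in the application this would be a model
# with a jti column.
User = Struct.new(:id, :jti)

# Assumption: a real application would read the signing secret from its credentials.
SECRET = "demo-secret"

def b64(data)
  Base64.urlsafe_encode64(data, padding: false)
end

def sign(input)
  b64(OpenSSL::HMAC.digest("SHA256", SECRET, input))
end

# Builds an HS256-signed token carrying the user's current jti.
def issue_token(user)
  header  = b64(JSON.generate(alg: "HS256", typ: "JWT"))
  payload = b64(JSON.generate(sub: user.id, jti: user.jti))
  "#{header}.#{payload}.#{sign("#{header}.#{payload}")}"
end

# A token is valid when its signature checks out and its jti matches the
# jti currently stored on the user.
def valid_token?(token, user)
  header, payload, signature = token.split(".")
  return false unless header && payload && signature
  return false unless OpenSSL.secure_compare(sign("#{header}.#{payload}"), signature)

  JSON.parse(Base64.urlsafe_decode64(payload))["jti"] == user.jti
end

# Revoking rotates the stored jti, so every previously issued token stops matching.
def revoke!(user)
  user.jti = SecureRandom.uuid
end
```

In this sketch, signing out corresponds to calling revoke!, after which the client needs to log in again to obtain a token with the new jti.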
Multi provider authentication is provided by omniauth. See also https://github.com/heartcombo/devise/wiki/OmniAuth:-Overview for instructions on how to add an auth provider.
This only works in development for now. It is set up in GitHub as tmp-dev and calls back to localhost:3000.
For future auth providers that do not support callbacks to localhost, use a tunnel service like ngrok that supports having a fixed domain.
ngrok http --domain=titmouse-charming-correctly.ngrok-free.app 3000
This application stores encrypted credentials per the Custom Credentials Rails convention.
- Generate a secret
bundle exec rake secret
- Add it to the environment's secrets:
bin/rails credentials:edit --environment development
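For reference, the value produced in the first step is a long random hex string. To my knowledge the Rails secret task uses SecureRandom.hex(64); treat that as an assumption rather than a guarantee. A value of the same shape can be produced with the standard library alone:

```ruby
require "securerandom"

# 64 random bytes rendered as 128 hexadecimal characters.
secret = SecureRandom.hex(64)
puts secret
```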
Adding and updating keys requires having a key (not in source control) for a particular environment. The keys are:
- config/credentials/development.key
- config/credentials/production.key
- config/credentials/test.key
- config/master.key
Front end built with turbo and stimulus.
See also https://notes.alex-miller.co/20231125150622-turbo_streams/
- Live reloading in development is handled by hotwire-livereload
- Convenience functions for behaviors added to Stimulus with stimulus-use
This uses the view_component library. Why? It produces views using POROs, thereby making that which was implicit, explicit. It also makes them easier to test.
In development, visit http://localhost:3000/rails/view_components/ for the index of available previews. Note that previews are used for the component system tests.
See also https://viewcomponent.org/guide/previews.html
- Avoid Deeply Nested Component Trees
- Stick to the Single-Responsibility Principle
- Avoid Making Database Queries Inside Components
- Use Context to Pass Global State
- Test the public interface of the component, the template
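The guidelines above can be illustrated without the library itself. The following is a plain-Ruby stand-in using ERB, not the view_component API: NoteCardComponent, its keyword arguments, and the markup are all hypothetical. It shows a component that has a single responsibility, receives its data from the caller instead of querying the database, and is tested through its rendered output.

```ruby
require "erb"

# Hypothetical stand-in for a component: a plain Ruby object plus a template.
class NoteCardComponent
  TEMPLATE = ERB.new(<<~HTML)
    <article class="note-card">
      <h2><%= ERB::Util.html_escape(@title) %></h2>
      <p><%= ERB::Util.html_escape(@excerpt) %></p>
    </article>
  HTML

  # Data is passed in explicitly; the component performs no queries.
  def initialize(title:, excerpt:)
    @title = title
    @excerpt = excerpt
  end

  # The rendered markup is the public interface that tests assert against.
  def call
    TEMPLATE.result(binding)
  end
end

puts NoteCardComponent.new(title: "Rails & AI", excerpt: "First note").call
```

A real component would subclass the library's base class and live alongside its template; the point here is only the shape of the object.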
- Custom colors override the Bootstrap defaults defined in the $color-theme map, as well as define new ones. These colors are used to automatically define utility classes (eg, .bg-accent1). See maps and loops
- See https://huemint.com/bootstrap-plus/ for quickly testing out color swatches.
Uses Amazon S3 buckets per environment. Buckets have CORS configuration to support direct uploads. See also ActiveStorage Guide
Uses Amazon S3 bucket for development: apm-tmp-development
TBD
Websockets are handled by ActionCable and Anycable.
- ActionCable provides the framework for defining application business logic for handling how connections are authenticated, how messages are responded to and what events should trigger messages to be sent to which clients (eg, channels and subscribers).
- Anycable provides the implementation of WebSocket connection management, which entails a WebSocket server separate from the web application and an RPC server for executing application code. This depends on Redis' pub/sub mechanism. There are a lot of moving parts here and things can go wrong. See troubleshooting
- Live reloading in development is handled by hotwire-livereload which depends on a WS connection.
- Background jobs are handled by the free version of sidekiq
- Unique job constraints, normally a sidekiq ENT feature, are provided by the sidekiq-unique-jobs gem
- Sidekiq UI is mounted at /sidekiq. User must be authenticated and a developer in order to view this.
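The idea behind a unique job constraint can be sketched in plain Ruby. This is an in-memory illustration under stated assumptions, not how Sidekiq or the gem implement it (they keep locks in Redis and offer several lock policies). UniqueQueue and the job name used below are hypothetical; the lock here is released as soon as a job is taken off the queue, which is only one possible policy.

```ruby
require "digest"
require "json"
require "set"

# Hypothetical in-memory queue that refuses duplicate pending jobs.
class UniqueQueue
  attr_reader :jobs

  def initialize
    @jobs = []
    @locks = Set.new
  end

  # Returns true if the job was enqueued, false if an identical job
  # (same class name and arguments) is already pending.
  def push(job_class, *args)
    digest = lock_digest(job_class, args)
    return false unless @locks.add?(digest)

    @jobs << { class: job_class, args: args, digest: digest }
    true
  end

  # Takes the next job and releases its lock so the same job can be
  # enqueued again later.
  def pop
    job = @jobs.shift
    @locks.delete(job[:digest]) if job
    job
  end

  private

  # The lock key is derived from the job class name and its arguments.
  def lock_digest(job_class, args)
    Digest::SHA256.hexdigest(JSON.generate([job_class, args]))
  end
end

queue = UniqueQueue.new
queue.push("TranscribeAudioJob", 42) # enqueued
queue.push("TranscribeAudioJob", 42) # rejected as a duplicate
```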
See also https://rspec.info/documentation/
For specs that require authentication, there are a few options:
- Use Devise's IntegrationHelpers (eg in request specs)
- For feature tests, use LoginHelper#login to log in the user. This will log in the user with username and password on the sign in page.
- For JSON format request specs, use the auth_headers helper to perform a login and retrieve the Authorization header:
get "/users/#{user.id}.json", headers: auth_headers(user)
Use the custom RSpec matcher have_turbo_stream in request specs. It is a wrapper for assert_turbo_stream and accepts the same arguments.
it { is_expected.to have_turbo_stream(action: 'prepend', target: 'messages') }
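To make clear what such an expectation checks, here is a simplified standard-library approximation. It is not the matcher's implementation: turbo_stream? is a hypothetical helper, and it inspects the response body with regular expressions, whereas the real assertion works on parsed HTML. It only shows that the response is expected to contain a turbo-stream element with the given attributes.

```ruby
# Hypothetical helper: returns true when the body contains a <turbo-stream>
# element whose attributes include every expected name/value pair.
def turbo_stream?(body, **expected)
  body.scan(/<turbo-stream\b([^>]*)>/).any? do |(raw_attributes)|
    # Collect the double-quoted attributes of this element into a hash.
    attributes = raw_attributes.scan(/([\w-]+)="([^"]*)"/).to_h
    expected.all? { |name, value| attributes[name.to_s] == value.to_s }
  end
end

body = '<turbo-stream action="prepend" target="messages"><template><p>Hi</p></template></turbo-stream>'
puts turbo_stream?(body, action: "prepend", target: "messages")
```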
System tests are driven by Cuprite, a Capybara driver. See also:
- https://evilmartians.com/chronicles/system-of-a-test-setting-up-end-to-end-rails-testing
- https://vtc.hatenablog.com/entry/2022/02/26/175431 (giving cuprite a try using a basic Rack app)
This uses browserless' Chrome image.
Running system tests
- To start chrome, run
docker compose up -d chrome
- Run
rspec spec/system
on the web container
- Visit http://localhost:3333/
CI runs on GitHub Actions. The following actions comprise the CI pipeline:
- rspec tests
TBD