
More sophisticated functionality for generating text from images and apply prompts to images #27

Merged: 7 commits, Mar 6, 2024

C-Loftus
Owner

@C-Loftus C-Loftus commented Mar 3, 2024

Describe images using custom prompts, with more sophisticated TTS integration, to help users who are blind or who do a lot of UI design

  • Have to figure out how to convert files on the clipboard to squares, since the OpenAI API only works on square images
  • Have to figure out file uploading and deleting, since edits have to be applied to stored images. Want to make sure that storing images isn't going to cause unexpected billing issues
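For the first item, one possible approach is to pad the clipboard image onto a square canvas rather than crop it. The sketch below only computes the layout (canvas size and paste offset); the names `SquareLayout` and `square_layout` are hypothetical and not part of this PR.

```python
from typing import NamedTuple


class SquareLayout(NamedTuple):
    side: int      # edge length of the square canvas, in pixels
    offset_x: int  # x position where the original image's left edge lands
    offset_y: int  # y position where the original image's top edge lands


def square_layout(width: int, height: int) -> SquareLayout:
    """Compute a square canvas that centers a width x height image.

    Hypothetical helper: the canvas side is the longer edge, and the
    shorter edge is padded equally on both sides (rounding down).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    side = max(width, height)
    return SquareLayout(side, (side - width) // 2, (side - height) // 2)


if __name__ == "__main__":
    # A 1920x1080 screenshot would be centered vertically on a 1920x1920 canvas.
    print(square_layout(1920, 1080))
```

With an imaging library such as Pillow, the result could presumably be applied by creating a new `side` x `side` image and pasting the original at `(offset_x, offset_y)`; that step is left out here to keep the sketch dependency-free.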

@C-Loftus C-Loftus changed the title Sophisticated functionality for generating text from images and apply prompts to images More sophisticated functionality for generating text from images and apply prompts to images Mar 3, 2024

-

structure: Output nothing but your best approximation for the raw semantic HTML without any styling that could be used to create the user interface in the following image. Do not include any boilerplate HTML.
Collaborator

I love this 😀

Collaborator

I'd also like to see a prompt that generates some reasonable component names (i.e., if you were to build this in React, how might you break it up?)

@C-Loftus
Owner Author

C-Loftus commented Mar 5, 2024

@jaresty When you get a chance, could you test this out? I want to improve the CSS styling of the image description output. If you have any bandwidth, the CSS for the HTML builder is the main thing I would like improved. I have moved it to a new file.

Essentially, image description works pretty well, and we have two new settings: whether to open the description in a new webpage, and how much content to describe back.
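To make the "open the description in a new webpage" idea concrete, here is a minimal sketch of how a description string could be turned into an unstyled HTML page and opened in the default browser. This is not the HTML builder from this PR; `build_description_page` and `open_in_browser` are hypothetical names, and only the Python standard library is assumed.

```python
import html
import tempfile
import webbrowser
from pathlib import Path


def build_description_page(description: str, title: str = "Image description") -> str:
    """Wrap a plain-text description in minimal semantic HTML.

    Blank lines separate paragraphs, and all text is escaped so that
    markup inside the model's output is shown literally, not rendered.
    """
    paragraphs = [p.strip() for p in description.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{safe_title}</title></head>\n'
        f"<body>\n<main>\n<h1>{safe_title}</h1>\n{body}\n</main>\n</body>\n"
        "</html>\n"
    )


def open_in_browser(page: str) -> Path:
    """Write the page to a temporary .html file and open it in the browser."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as f:
        f.write(page)
    path = Path(f.name)
    webbrowser.open(path.as_uri())
    return path
```

Escaping matters here because a description of a UI screenshot can itself contain tag-like text such as `<button>`. CSS could then be added in a `<style>` element in the head without touching the escaping logic.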

I think most of the functionality is here; I'm now just thinking about improving the UX.

I think image generation (beyond simple prompting) or any sort of image editing is not really worthwhile at the moment: it only returns square images, and we would also have to manage file uploads to OpenAI, which is beyond the scope of this PR. Image generation really needs a proper GUI to be done well, I think.

@C-Loftus
Owner Author

C-Loftus commented Mar 6, 2024

I think the CSS should be fixed now. If this looks good, it should be ready to merge. The UX can continue to be improved, but it is generally satisfactory and a good base to iterate on.

@C-Loftus C-Loftus merged commit a0498fc into main Mar 6, 2024
2 checks passed
@C-Loftus C-Loftus deleted the imageChanges branch March 6, 2024 19:16

2 participants