[APP][Pro] OpenAI ChatGPT

This is a thread for the new app to communicate with the OpenAI Chatbot

The source code is here: GitHub - frodeheg/OpenAI

3 Likes

The app has been sent for approval and will probably be public in a few days.

3 Likes

Hi,
very interesting. Could you please write a few words about the usage?
Is it meant for asking single questions, or for a complete conversation?
I saw a ‘forget conversation’ in your code, so an overview of your implemented use cases would be very helpful. Many thanks.

1 Like

What I had in mind when I made the app was a system where my children could talk to some input device and have the speech converted to text and sent to Homey. Homey would then forward it to ChatGPT and play the result on the Homey speaker using text-to-speech.

Since I wanted to keep the dialogue going, the chatbot needs to remember something. I made it so that it remembers everything until 10 minutes have passed without any chatting; then it goes back to the default.
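
A minimal sketch of that forgetting rule, written in Python for illustration only (the app itself is not written this way, and the class and parameter names here are made up): history is kept until more than 10 minutes pass between two messages, then it is cleared.

```python
import time

FORGET_AFTER_SECONDS = 10 * 60  # the 10-minute window described above


class ConversationMemory:
    """Keeps chat history and forgets it after a period of silence."""

    def __init__(self, timeout=FORGET_AFTER_SECONDS, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the timeout can be tested
        self.history = []
        self.last_activity = None

    def add(self, message):
        """Store a message and return the history that would be sent along."""
        now = self.clock()
        # Drop the old dialogue if nobody has chatted within the timeout.
        if self.last_activity is not None and now - self.last_activity > self.timeout:
            self.history = []
        self.history.append(message)
        self.last_activity = now
        return list(self.history)
```

Making the timeout a constructor argument is one way the "configurable" idea could look.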

Though, I’ll probably make this configurable… and there might be another way to remember things. I see they have options to train the bot, so I might use that instead when I find out how it works…

Nevertheless, I still need a speech-to-text interface that can send something to Homey before my intended use case is met. Do you have any good suggestions?

3 Likes

I don’t know of a way to get speech to text. I would like to use Alexa for such things. Instead of controlling devices, a flow trigger for speech/text input would be cool. Or let Alexa say something, ask for an answer and get this answer back as text into Homey. But I don’t think that’s possible with the current Alexa API.

The closest I have gotten to a solution is that Google Assistant is able to send an email to a mail address I created for Homey. Then I let Homey check that email and send the result to the chatbot.

The downside is that it introduces a bit of delay.

I could possibly add a minimal mail server to this app that does nothing but forward a message to ChatGPT. That would remove the need to poll another mail server and so remove the delay. (However, I am a bit skeptical about this solution, because it will probably require you to configure your firewall to forward SMTP requests to Homey, since mail sent by Google Assistant probably goes through the Gmail servers.)

Another option I see is that Google Assistant has the ability to broadcast a message to other smart speakers. If I just had a Homey app that pretends to be a smart speaker and can receive these broadcast messages…

1 Like

I don’t know if you are using Apple devices, but I think you should be able to make this work with the help of “Shortcuts” automations.

For example: Creating Shortcuts That Accept Voice Input - YouTube

1 Like

Thanks. I investigated whether Google could do the same, and ChatGPT led me to the following:

So seems like I have something to investigate this weekend :slight_smile:
EDIT: It was a bit complicated, so I’m not sure I want to spend much time on it.

1 Like

For Google I was able to solve it the following way:

  • Create a Routine (a partial equivalent of a “shortcut automation”) that lets me say “Hey Google, Chatbot”, which starts a flow that sends an email to my mail address. I can then dictate the message and say send.
  • Then on my email server I have an option to direct incoming mails to a program. I send them to curl, which forwards the message in a POST request to a webhook I created on Homey.
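
The second step could be sketched roughly as follows. This is a Python stand-in for the curl call, not the actual setup: the webhook URL and the `text` parameter name are hypothetical, and the request is only built here, not sent.

```python
import email
import urllib.parse
import urllib.request
from email import policy

# Hypothetical webhook address; the real URL depends on your Homey setup.
WEBHOOK_URL = "https://example.invalid/homey-webhook"


def extract_text(raw_email: str) -> str:
    """Return the plain-text body of a raw email message."""
    msg = email.message_from_string(raw_email, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return body.get_content().strip() if body else ""


def build_request(raw_email: str, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the dictated text."""
    # "text" is an assumed form field name, not one defined by the app.
    data = urllib.parse.urlencode({"text": extract_text(raw_email)}).encode()
    return urllib.request.Request(url, data=data, method="POST")
```

A mail server that pipes each incoming message to a program could feed the raw message to `build_request` and pass the result to `urllib.request.urlopen`.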

Thus once Athom has approved this application I will send out a follow-up version that will allow you to talk to ChatGPT through a webhook in addition to using flows. I suppose someone else might find it helpful too.

1 Like

And it’s live :slight_smile:

If there are any issues please let me know.

5 Likes

Hey all,

Apparently there is a bug: if you change the API key in the settings, it will not be properly applied unless you restart the app after saving the key…

I didn’t notice because I have had the key stored there all along, so every debugging session has effectively been a restart for me…

So, I’ll fix this later this evening, until then, just restart the app.

Sorry about the bad start :woozy_face:

2 Likes

Fixed in version 1.1.8.

1 Like

As it’s part of the same API, I just had to expose a flow to generate and return Images from text using DALL·E. The new flow can generate images with size 256x256, 512x512 or 1024x1024.
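
As a rough illustration of the size restriction, here is a small Python sketch that validates the size and builds a JSON request body. The field names `prompt`, `n` and `size` are my assumption based on the OpenAI image generation API as documented at the time; how the app builds its request is not shown in this thread.

```python
import json

ALLOWED_SIZES = ("256x256", "512x512", "1024x1024")  # sizes named in the post


def image_request_body(prompt: str, size: str = "512x512") -> str:
    """Return a JSON body for an image-generation request (assumed field names)."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {size!r}")
    return json.dumps({"prompt": prompt, "n": 1, "size": size})
```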

(once version 1.2 has been approved… can’t wait? try the test version)

The API also has an Image Edit option that lets you upload an image plus a mask, and then a text that describes what DALL·E should fill in where the mask was. I am sure it could be fun to put party hats and funny glasses on the person ringing the doorbell and display it on a screen when they enter the house… so maybe I’ll add support for that as well later.

A third option they have is to input an image and output a variation of it. If anyone wants a flow for it I’ll add it.

More information on how it works can be found here: OpenAI API

1 Like

I just updated the test version so that the webhook now triggers a flow trigger instead of going directly to ChatGPT. This allows you to use the webhook for more general purposes. At the same time, I made it possible to add a flag to the webhook so you can classify the content. In my situation, when talking with Google Assistant I have two different automations, one that sends text to ChatGPT and one that sends text to DALL·E to create an image. I just append &flag=image to the webhook and handle it like this: (sorry it’s in Norwegian, but you’ll get the picture… )

So I tried the flow above with my kids. They say “Draw an image of a monkey on a bike”, Google Assistant emails it to my mail server, which pushes it to the webhook, and Homey then sends it to DALL·E to create an image that is eventually cast to the television screen. The kids loved it :grin:
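
The routing on the flag can be sketched like this in Python. It is only an illustration of the idea: in the app the decision is made in a Homey flow, and the `text` parameter name is an assumption (only `flag=image` is mentioned above).

```python
from urllib.parse import parse_qs


def route(query_string: str) -> tuple[str, str]:
    """Pick a target service from the webhook's query string."""
    params = parse_qs(query_string)
    text = params.get("text", [""])[0]  # "text" is an assumed parameter name
    flag = params.get("flag", [""])[0]  # "flag" as in "&flag=image"
    target = "dalle" if flag == "image" else "chatgpt"
    return target, text
```

For example, `route("text=a+monkey+on+a+bike&flag=image")` picks the image path, while a request without the flag falls through to the chatbot.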

7 Likes

Someone sent me a diagnostics report that didn’t really contain any indications of anything that was wrong. I would gladly help or fix any issues if there was a bug but I couldn’t really see any issues in that report. Please reach out in this forum if you need help and I will assist as best as I can.

Thanks for the app @frodeheg, this is great stuff. I just wanted to share how I integrated it into my smart home, which I find very exciting.

I really want to control my smart home using natural language, and LLMs are the way to go nowadays. I have also struggled to find a good way to integrate Google Assistant voice commands, but I made use of the Telegram Bot app for Homey, which allows chats to the bot to be picked up by a Homey flow, so I have a chat interface with Homey. I can then hook that up to your OpenAI app and start prompt engineering :slight_smile:

The flow I have for this as a prototype is basically
[Bot received message]->[Homey script fetches all devices with capability values]->[OpenAI prompt with devices as context and a question about the devices]->[Answer sent back to telegram as response]

This allows me to ask questions about my devices and get a natural language answer back :slight_smile:
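
To make the third step of that flow concrete, here is a hedged Python sketch of how device data might be turned into a prompt. The device structure, the wording of the prompt and the capability names are illustrative assumptions, not the actual script.

```python
def build_prompt(devices: list[dict], question: str) -> str:
    """Combine device states and a user question into one prompt string."""
    lines = ["These are the devices in my home and their current values:"]
    for device in devices:
        caps = ", ".join(
            f"{name}={value}" for name, value in device["capabilities"].items()
        )
        lines.append(f"- {device['name']} ({device['zone']}): {caps}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```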

It does require quite a big prompt, so hopefully that will be simpler when I get access to GPT-4. Are you able to add that as a model option, maybe?

The next step for this project would be to classify questions into “queries” vs. “commands”, and produce a prompt that could emit which capabilities should be updated with new values, so that it can handle chat input like “Turn off the lights in the living room.”

For this to be effective though, I might want to have specialized chats and control the chat history more granularly. Is it possible to look into supporting separate devices for OpenAI that would act as separate chat “bots” with different settings to optimize this type of pattern?

6 Likes

Cool setup @tokreutz

Interesting, I have not tried that yet. Do you recommend it to be included as a suggested flow setup in the app settings?

Please be aware that the context cannot be greater than 4500 characters or something like that (unless they changed it), so if your input is longer than that, the beginning of it will be lost (the app simply drops the oldest part of the conversation).
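
In Python terms, the trimming amounts to something like the sketch below. The 4500 figure is the approximate number mentioned above, not a verified limit, and the function name is made up.

```python
MAX_CONTEXT_CHARS = 4500  # approximate figure from the post, not a verified limit


def cap_context(conversation: str, limit: int = MAX_CONTEXT_CHARS) -> str:
    """Keep only the newest `limit` characters, dropping the oldest text."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return conversation[-limit:] if len(conversation) > limit else conversation
```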

Yes, I created this ticket to follow up: Add support for GPT-4 · Issue #1 · frodeheg/OpenAI · GitHub

Interesting thought! This would probably be nice in order to save money too, as the cheaper bots may be good enough for classification, while the better bots are used to handle the speech part. I created this ticket to follow up: Add support for different contexts · Issue #2 · frodeheg/OpenAI · GitHub

I am a bit overloaded, so I can’t promise the second ticket will be done quickly, but I like the idea and will certainly look into it when I get time.

1 Like

If you mean in the examples under Help, that sounds useful.

I’m not familiar with how the OpenAI API works, but I know that in the OpenAI playground the limit is 4001 tokens, not characters. One token is roughly 4 characters.
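
With that rule of thumb, 4001 tokens is roughly 16,000 characters. A quick Python estimate, assuming the 4-characters-per-token approximation (real tokenizers vary with language and content):

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb, not an exact tokenizer
TOKEN_LIMIT = 4001   # playground limit quoted above


def estimate_tokens(text: str) -> int:
    """Estimate the token count by ceiling division of the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if the estimated token count is within the limit."""
    return estimate_tokens(text) <= limit
```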

The current limit makes it hard to cram all my devices into the prompt as it is now. I try to be clever and ask GPT to analyze my question to identify which device capabilities might be relevant to it, and then filter out the devices that are not needed to make the prompt smaller. But that makes it harder to ask more general questions that are not about a particular set of capabilities…

Exactly. And they could have separate “system prompts” to tailor them to their specific task :slight_smile:
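
The filtering step could look something like this Python sketch. The device structure and capability names are hypothetical, and the set of relevant capabilities is assumed to come from the earlier GPT call.

```python
def filter_devices(devices: list[dict], relevant: set[str]) -> list[dict]:
    """Keep devices with at least one relevant capability, trimmed to those."""
    kept = []
    for device in devices:
        caps = {k: v for k, v in device["capabilities"].items() if k in relevant}
        if caps:
            # Copy the device but carry only the relevant capabilities along.
            kept.append({**device, "capabilities": caps})
    return kept
```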

No worries. Maybe I can help out if I can find some time too :slight_smile:

1 Like

Would it be possible to start from Homey with a question to ChatGPT? Maybe like this: “Make an advanced flow that starts every Monday and then creates a timeline notification with the text ‘successful’.”

I am from the old generation and cannot get used to talking to a ‘thing’ :shushing_face:

Great, please let me know if you want to take on some task so we don’t do any double work. :slight_smile:

No, that would not be possible. As far as I know, it is impossible for an app to create an advanced flow.
It is possible for an app to perform actions on its own and communicate with devices, though. But if you wanted this app to do that based on spoken commands, I would have to implement the actions directly.