A2Z-F24 Week 6

link to website: https://trumpini-bot.replit.app/

I want to make a bot that responds without text, using images or sound instead. Inspired by the dialogue in video games like "Don’t Starve Together" and "Animal Crossing", I decided to create a trumpet bot that responds to messages with trumpet sounds. For a quick prototype, I used AI-generated sounds (though I’m unsure if this is the best choice, as explained later in this post).

https://www.youtube.com/watch?v=Yz9LPwsO9x0

https://www.youtube.com/watch?v=S7-njGsKmTI

Most bots build their character through content (like Twitter bots). Since I have less control over the AI bot’s output, I focused on creating a strong image by giving it an illustrated avatar that expresses a chill jazz vibe. I plan to showcase it on a website, where I have more control over the visuals, instead of on an existing platform.

For the code, I used an example by Max Bittier. I added a hidden prompt that is combined with the user's input to generate the trumpet sound using MusicGen on Replicate. link to code
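To make the hidden-prompt idea concrete, here is a minimal sketch in Python of how a fixed style prompt could be combined with the user's message before it is sent to the model. It is not the bot's actual code: the hidden prompt wording, the function names, and the `prompt` and `duration` field names are all assumptions for illustration, and the call to Replicate itself is left out.

```python
# Hypothetical hidden prompt; the wording the bot really uses is not shown in this post.
HIDDEN_PROMPT = "solo trumpet, chill jazz, short melodic phrase"


def build_trumpet_prompt(user_message: str, hidden_prompt: str = HIDDEN_PROMPT) -> str:
    """Combine the hidden style prompt with the user's message."""
    # Collapse newlines and repeated spaces so the prompt stays on one line.
    cleaned = " ".join(user_message.split())
    if not cleaned:
        # Nothing typed: fall back to the style prompt alone.
        return hidden_prompt
    return f"{hidden_prompt}, responding to: {cleaned}"


def build_model_input(user_message: str, duration_seconds: int = 5) -> dict:
    """Assemble the payload for the music model (field names are assumptions)."""
    return {
        "prompt": build_trumpet_prompt(user_message),
        "duration": duration_seconds,
    }


if __name__ == "__main__":
    print(build_model_input("how are you today?"))
```

Keeping the style prompt in one constant means the user never sees it, and the bot's "voice" can be changed in a single place.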

The main problem I encountered is the harsh cut at the end of the sound. A possible solution might be a better prompt or training my own model.

I’ve realized that I’m trying to avoid the “perfect AI quality” and instead aim for a more rustic, handmade style in this bot. Knowing the sound is AI-generated seems to take away some of its magic and charm. Using a music generation model is the easiest solution, but it may not be the best.

Other possible solutions could include generating a music score and playing it manually.
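As a rough sketch of the score idea, a message could be turned into a short list of notes that a person (or a sampler) then plays. The mapping below is invented for illustration only: each letter picks a pitch from a C major pentatonic scale, the last letter of each word is held longer, and a rest separates words. The scale, the durations, and the function name are all assumptions.

```python
# C major pentatonic: any sequence of these notes tends to sound pleasant together.
SCALE = ["C4", "D4", "E4", "G4", "A4"]


def message_to_score(message: str) -> list[tuple[str, float]]:
    """Turn a text message into (pitch, beats) pairs; an illustrative mapping only."""
    score: list[tuple[str, float]] = []
    for word in message.split():
        letters = [ch for ch in word.lower() if ch.isalpha()]
        for i, ch in enumerate(letters):
            # Each letter selects a pitch by its position in the alphabet.
            pitch = SCALE[(ord(ch) - ord("a")) % len(SCALE)]
            # Hold the last note of each word twice as long, like the end of a phrase.
            beats = 1.0 if i == len(letters) - 1 else 0.5
            score.append((pitch, beats))
        if letters:
            # A short rest between words.
            score.append(("rest", 0.5))
    return score


if __name__ == "__main__":
    for pitch, beats in message_to_score("hi there"):
        print(pitch, beats)
```

Because the mapping is deterministic, the same message always produces the same phrase, which might suit the handmade feel better than a freshly generated clip each time.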

Further ideas

bot is thinking…
make bots with different instruments that talk to each other or create a cat bot.
desktop bot: https://www.youtube.com/watch?v=ML743nrkMHw, or a browser extension like shiffbot
prompt engineering: pages of system prompt, "here are the 10 examples…"

fine-tuning MusicGen

Transformers.js

audio analysis to transform voice into tone

speech cloning (like ElevenLabs) to train on my own voice

human voice training: “assembly voice”