I compared Sesame with ChatGPT's voice mode, and I'm unsettled

The first time I tried the new voice assistant from AI startup Sesame was also the first time I forgot I was talking to a bot.

Compared to ChatGPT's voice mode, Sesame's "Conversational Voice" feels natural, casual, and engaged, which completely freaked me out.

On February 27, Sesame launched a demo of its Conversational Speech Model (CSM), which aims to create more meaningful interactions with AI chatbots. "We are creating conversational partners who do not just process requests; they engage in genuine dialogue that builds confidence and trust over time," says the announcement. "We hope to realize the untapped potential of voice as the ultimate interface for instruction and understanding."

Sesame's voice assistant is available as a free demo on its website and comes in two voices: Maya and Miles.

Since Sesame released its voice assistant demo, users have reported awestruck reactions. "I've been into AI since I was a kid, but this is the first time I've experienced something that genuinely feels like we've arrived," wrote user SocSchamp on Reddit.

"Sesame is about as close to indistinguishable from a person as I've ever experienced in a conversation," wrote user Siciliano777 on Reddit.

After speaking with Sesame's bot, I was similarly enthusiastic. I talked with the Maya voice for about 10 minutes about the ethics of using AI as a companion, and it felt like a real conversation with a thoughtful, informed person. Maya's speech had a natural cadence, sprinkled with interjections like "you know" and "hmm," and even tongue clicks and audible inhales.


The strongest impression from my interaction with Maya was that she immediately asked questions and drew me into the conversation. The bot opened by asking how my Wednesday morning was going (note: it was a Wednesday morning). In contrast, ChatGPT's voice mode waited for me to speak first. That's neither good nor bad in itself, but it inherently shaped the conversation: I used ChatGPT as a tool for something I needed, rather than engaging with it as a conversation partner.

Maya asked about the risks of AI companions becoming "too good at being human." When I told her I was worried about the rise of more sophisticated scams, and about people losing touch with reality by replacing humans with bots, she answered thoughtfully and pragmatically. "Scammers will scam, that's a given. And as for human connection, maybe we have to learn how to make better companions, not replacements, you know, the kind of AI friends who actually get you out there doing things with real people," said Maya.

When I had a similar conversation with ChatGPT, I got a response that felt more like boilerplate from a school counselor: "That's a valid concern. It's really important to balance technology with genuine human interaction. AI can be a helpful tool, but it shouldn't replace real human connections."

While OpenAI did pioneering work on voice mode's ability to be interrupted and carry on a fluid conversation, ChatGPT still tends to respond in complete sentences and paragraphs, which sounds robotic. When I use ChatGPT's voice mode, I never forget that I'm speaking with a bot, and that shows in conversations that can feel stilted and forced.

By comparison, Gavin Purcell, co-host of the AI for Humans podcast, posted a Sesame conversation on Reddit in which it's practically impossible to tell which voice is the bot. Purcell prompted the Miles voice to act like an angry boss.

What followed was a gloriously absurd conversation about money laundering, bribery, and a mysterious incident in Malta. Miles didn't miss a beat. There was no perceptible latency, the bot remembered the context of the conversation, and it creatively escalated the improvised argument, calling Purcell "delusional" and firing him.

Of course, there are some limitations. Maya's voice glitched a few times during our conversation, and it didn't always get the syntax right, as when she said, "It's a difficult conversation that comes."

According to its technical paper, Sesame trained its CSM (based on Meta's Llama model) by collapsing the traditionally two-stage process of training text-to-speech models, first on semantic tokens and then on acoustic tokens, into a single stage in order to reduce latency. OpenAI used a similarly multimodal approach to train voice mode. However, it has never published a dedicated technical paper on voice mode's inner workings; voice mode is only discussed as part of its GPT-4o research.

Knowing this, it's surprising how much better the Sesame model is at conversational dialogue. That said, Sesame's launch is only a demo, so it deserves further examination once the full model comes out. According to the demo announcement, Sesame plans to open source its model "in the coming months" and expand to over 20 languages.
