Search gets conversational, but SEOs still don’t need a voice strategy

Microsoft has integrated generative AI into its Bing search engine with ‘Bing Chat’, a conversational search interface that can also accomplish tasks such as composing or editing text, and has baked it into its Edge browser for good measure.

Meanwhile, Google is building a similar experience that it calls ‘Search Generative Experience’ (or SGE), through which the company says it is “reimagining what a search engine can do”, and it is currently open to testers from the United States.

Both Bing Chat and Google’s SGE make conversation a key part of the search experience. Bing proclaims that with Bing Chat, you can “Ask questions however you like” and “Get answers instead of being overwhelmed by options”, while Google has said that SGE will “unlock entirely new types of questions you never thought Search could answer”.

With search seemingly moving in a more conversational direction, voice search is experiencing something of a revival of interest, with some speculating that searchers will be more likely to speak queries aloud now that their search engine can also give a conversational reply. So, should brands optimise for voice to get ahead in 2023?

Although the voice experience is definitely improving with the introduction of generative AI, optimising for voice search suffers from the same problem it has always had: a lack of reporting.

Tracking voice queries for SEO

Voice search has been rumoured to be the ‘next big thing’ for some time (longtime readers will know I’m sceptical about this) and SEOs have long been keen for reporting to separate out voice-based queries from text queries in order to gauge volume, as well as determine which pages are ranking for voice searches.

Years ago, Google made various promises to bring more insight into voice queries to Google Search Console, with John Mueller even surveying SEOs on Twitter in 2017 as to what they would find useful from such data. However, two years later, Mueller suggested that the method of search input was not particularly important, opining,

“I don’t think that’s very useful information, if it’s essentially just a matter of entering the keywords in a different way.”

Mueller also pointed out in the same exchange that Google’s voice input “transcribes your query and searches normally”, meaning that the rest of the search journey is not fundamentally different due to having carried out a voice search (I’ll come back to that a bit later). Mueller described voice search as “essentially just a different kind of keyboard”, comparing it to swiping to type.

Many SEOs would beg to differ, as the received wisdom around voice queries has always been that they are more long-tail and conversational, since people are more likely to speak aloud in full sentences than to use the disjointed keyword searches they would type on a keyboard. (Contrast “Where can I buy dresses in London?” with “London dress shops”.) At the very least, getting insight into voice queries would be a way of testing whether that assumption is in fact true.

But given that Google has yet to implement voice query tracking (and neither has its competitor, Bing), the search giant clearly doesn’t see this as a worthwhile exercise – perhaps due to lack of volume, or because there really is less of a distinction between voice and text queries than we’ve always believed.

Why the ‘long tail’ workaround won’t work

This may be particularly true now that SGE is entering the picture, as searchers are encouraged to use full sentences and construct complex, multi-part queries such as “Will the Ikea Klippan loveseat fit into my 2019 Honda Odyssey if I fold down the seats?” (to use an example from Microsoft’s demo of Bing Chat).

And while dedicated SGE reporting may be introduced to Google Search Console, and Bing Chat reporting to Bing Webmaster Tools (the latter has been promised multiple times and has yet to be rolled out, raising questions about what could be behind the delay), this won’t tell businesses whether a query was voice or text-based.

“Look for long-tail and conversational search queries” has long been recommended as a proxy for dedicated voice query tracking, but this workaround has always been a bit risky, and has only become more so as both Google and Bing nudge searchers towards carrying out longer-tail queries.

Last year, Google rolled out related keyword suggestions in order to “mak[e] it … easier to dive deeper” into a search, suggesting additions to a searcher’s query in order to help them construct a long-tail search. Searchers can select as many additions as they like to progressively narrow down their query, and Google will assemble the search query for them, putting words into a grammatically correct order.

SEOs will only see the end result, making it impossible to know whether the searcher carried out this query unassisted or with Google’s help, and whether it was a text-based or voice query. However, it seems safe to assume that “long-tail” no longer automatically equals “voice” – if indeed it ever did.
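
To make the weakness of the proxy concrete, here is a minimal sketch of the kind of “long-tail and conversational” filter an SEO might run over exported search queries. The function name, the word list and the five-word threshold are all hypothetical choices made for illustration; they do not come from Google, Bing or any reporting tool.

```python
# Hypothetical heuristic: the word list and the threshold below are
# illustrative assumptions, not values used by any search engine or tool.
QUESTION_WORDS = {
    "who", "what", "where", "when", "why", "how",
    "can", "will", "is", "are", "do", "does", "should",
}


def looks_conversational(query: str, min_words: int = 5) -> bool:
    """Flag a query that is long and opens like a spoken question."""
    words = query.lower().strip("?!. ").split()
    if not words:
        return False
    return len(words) >= min_words and words[0] in QUESTION_WORDS


# Example queries taken from this article.
queries = [
    "Where can I buy dresses in London?",
    "London dress shops",
    "Will the Ikea Klippan loveseat fit into my 2019 Honda Odyssey "
    "if I fold down the seats?",
]

# Queries the heuristic would label as "probably voice".
flagged = [q for q in queries if looks_conversational(q)]
```

A filter like this flags the first and third queries, but it only ever sees the final query text. A typed query, a query assembled from Google’s suggestions and a spoken query with the same wording are indistinguishable to it, which is exactly why the proxy is unreliable.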

Is voice search worth optimising for?

Without any insight into whether searchers are arriving at your website via voice or whether optimisation efforts have had any effect, optimising for voice search is largely theoretical. But John Mueller’s point about the search experience not being fundamentally different still stands – even with SGE on the horizon.

Regardless of whether searchers use voice or text to reach a website, they’ll be carrying out their search on a device with a screen – a mobile, tablet, or laptop/desktop computer – and will use that screen to complete their onward journey. (Smart speakers can also be used to carry out voice searches, of course, but there is no subsequent browsing experience unless you’re using a smart display.) So, how much does the method of ‘arrival’ matter? It only matters insofar as it might produce different queries to rank for, and as we’ve just established, the difference in queries may not even be that great.

There’s also the fact that voice search is still not ideal in many situations, as it requires speaking the query aloud (which may be awkward in public, or difficult amid background noise) and listening to the response from an assistant. Generative AI does improve the smoothness of the conversational exchange – I’ve tried it with Bing and found the results to be quite impressive – but even then, I don’t think the demand for voice is huge.

Reports have surfaced of Google and Amazon potentially beefing up their assistants with generative AI – Google is reportedly restructuring internally to accomplish this, while Amazon is said to have tapped its former Head Scientist for Alexa to lead a team focusing on Artificial General Intelligence – and Apple may also be developing its own generative AI technology, dubbed AppleGPT. All three companies have smart speaker products, and improvements in their voice assistants could mean more utility for smart speakers.

Whether this would benefit brands at all depends on the level of integration with third-party applications, with voice app discovery having historically been limited and uptake low. Ultimately it will be a space to watch – but as for voice search, it still doesn’t seem like there is much for SEOs to do if they’re already on top of other developments.

SGE represents a more meaningful change to the way that search is carried out, and may impact publishers’ traffic (possibly negatively, if searchers receive their answers directly on the SERP and don’t go on to the source), but it’s still very early there too, with Google’s version still in testing and Bing yet to reveal any analytics.

How meaningfully will it catch on? Bing and Google are betting on it being bigger than voice search, but we’ll have to wait and see.

Econsultancy offers a Foundation Series of eLearning that covers the fundamentals of search. Find out more about our training.