If you don't need it to be clever and dynamic, and you're not too bothered about styling, then HTML's <input> tag has a typeahead option using <datalist>.
Unfortunately its scrolling performance gets laggy at around 200 items or so in Chrome. Here's someone's example with 5,000 options, which is absurd, but you might expect it to perform better given that it's a native control. https://jsfiddle.net/klesun/mfgteptf/
Does this mean that every time the user types something, they have to wait a roundtrip from the server for completion? Wouldn't this make this useless for average-to-fast typers?
Htmx sounds like a lovely idea, but doesn't it introduce a ton of latency compared to client-side operations?
Well, they "wait" for a fraction of a second maybe. And you put it behind a typing delay so it responds only when the user is ready.
Also, many autocompletes do a round trip anyway. The htmx version just returns the html already rendered, rather than, say, JSON that requires further processing and injecting.
Here are a couple of Go templates from a sample project that show it:
In response to `/autoc/<id>/search`, the server just responds with a list of strings matching the user's input, formatted with the `autoc_data` template. Htmx injects that into the `<datalist>` directly.
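The response shape is simple to picture. Here's a hedged sketch of the server-side rendering (function names are hypothetical, not from the sample project): the endpoint returns ready-made `<option>` elements rather than JSON, so htmx can swap them straight into the `<datalist>` with no client-side processing.

```javascript
// Hypothetical sketch: render matching strings as a block of <option>
// elements, the kind of HTML fragment an htmx autocomplete endpoint
// could return for injection into a <datalist>.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderOptions(matches) {
  // One <option> per match; the value attribute is HTML-escaped.
  return matches.map(m => `<option value="${escapeHtml(m)}"></option>`).join("\n");
}
```

The point is that the "rendering" step is trivial string assembly on the server, so the client does nothing but swap the fragment in.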
It's how any search engine does autocompletes... you can't possibly store all the queries clientside. Whether you're sending a tiny JSON or a tiny HTML snippet back doesn't really make much of a difference. It's usually fast enough, especially if you debounce/throttle the user input so you're not sending the query on every keystroke.
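The debounce mentioned above is a few lines in any framework. A minimal trailing-edge version (a generic sketch, not tied to any particular library) looks like:

```javascript
// Trailing-edge debounce: the wrapped function only fires after `ms` of
// quiet, so a fast typist triggers one request instead of one per keystroke.
function debounce(fn, ms) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), ms);
  };
}
```

Wired to an input's `input` event with a delay of, say, 200 ms, this is exactly the "typing delay" described above.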
In my experience, typeahead is most useful on datasets that are too large to load in full on the client. There are certainly lists that are too long to fit in the available space yet unlikely to grow unbounded, but in my experience such large-but-bounded lists are rare.
"Not too bothered about styling" is quite important. Datalists are relatively uncommon and I think many users may not know what they are seeing and how they can interact with it.
Also datalists appear quite different across different browsers, which is fine of course for a native form control, but annoying if you're aiming for a more consistent look.
I just wish that you could natively do something that works more like select elements, so users see the contents of the option elements but the value of the option element is actually what is populated in the form.
The option and the form will work that way, but the input itself will then show the value instead of the option label... the input is not smart enough to present the option as a select does, it shows whatever you have in the "value" field.
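A common workaround (a hypothetical sketch, not a native feature) is to keep a label-to-value map alongside the `<datalist>`, let the options display labels, and translate back to the real value before submitting:

```javascript
// Hypothetical workaround: the <datalist> options show human-readable
// labels; before submit, map the typed label back to the underlying value.
const labelToValue = new Map([
  ["Vanilla (classic)", "vanilla"],
  ["Chocolate (dark)", "choc_dark"],
]);

function resolveValue(typed) {
  // Fall back to the raw input when it isn't one of the known labels.
  return labelToValue.get(typed) ?? typed;
}
```

It works, but it's exactly the kind of glue code a smarter native control would make unnecessary.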
At least on Edge, it got continuous substring matching right (typing "illa" shows "vanilla"), but failed at the most important type of matching: arbitrary subsequence. Not a surprise; I only know a couple of editors that support it. But it's truly the ideal form of filtering, especially when the entries are long. Think CSS: how much nicer would it be to type "mato" in devtools and get "margin-top"? Or, the ideal case that is somehow still missing, browser history entries: type the first char or two of the hostname, one or two from the path, and a couple from the query, and you've filtered everything down to just one or two options.
> Think css, how much nicer would it be to type "mato" in devtools and get "margin-top"?
I can't say that this type of search has ever crossed my mind, and it seems greatly ambiguous to me. For example, if you have "tomato" in the corpus, then that obviously supersedes "margin-top" for the search string "mato". If you know the entire corpus, then I suppose you can take advantage of shortcuts like this when you know there is only one match, but then again it would seem to me that simply writing "margin-top" would require less mental gymnastics.
If you've ever used any filtering in VS Code, this is the algorithm. Re tomato/margin-top, weight is given to matching the prefix exactly, so margin-top does indeed rank higher.
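A toy version of that algorithm shows why the ambiguity mostly resolves itself. This is an illustrative sketch with made-up weights, not VS Code's actual scorer: it accepts any subsequence match but rewards characters that line up with the start of the candidate.

```javascript
// Arbitrary-subsequence matching with a crude prefix bonus.
// Returns -1 when `query` is not a subsequence of `candidate`,
// otherwise a score where early aligned matches count double.
function fuzzyScore(query, candidate) {
  let qi = 0, score = 0;
  for (let ci = 0; ci < candidate.length && qi < query.length; ci++) {
    if (candidate[ci] === query[qi]) {
      score += ci === qi ? 2 : 1; // bonus for matching at the prefix
      qi++;
    }
  }
  return qi === query.length ? score : -1;
}
```

With these (made-up) weights, "mato" scores "margin-top" higher than "tomato", because the "ma" prefix match outweighs the contiguous-but-later "mato" run.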
I like small libraries which are just a single file. That's why I am maintaining a fork of the original typeahead library from Bootstrap. It is less than 400 lines of JavaScript:
I just went through looking for, and failing to find, a good typeahead library. I could not find one I liked. Typeahead is probably the most sophisticated UI element, and the most demanding in terms of what users expect.
I eventually rolled my own using Tailwind's scaffold: it ended up being about 300 lines of JS to handle things like keyboard actions, pressing enter, focus/non-focus, and debounced search.
Looks really good - pleasant UX and seems nicely semantic/accessible! Also seems to allow slight misspellings, which I really like.
I tried building my own autocomplete thing for ablf.io and it was *tough*. The dataset was massive, over 400k canonical titles. And I wanted to split out different variants/localizations of the titles/authors, so it's much more than that really. I ended up using the 'trie-search' npm package, as that was really the only way to get decent performance. I also did some heuristic Levenshtein distance stuff to order the results most optimally for the user. I found it helpful to add variants to the trie with vowels and common articles like 'The' and 'A' removed, since people tend to forget those when searching for book titles (then when searching across the trie I can attempt different variants).
I guess the main lessons are: it's important to pay attention to the type of data you've got and how users will try to access it; don't assume they know titles of things nor the spelling. Use trie structures! Also: with large datasets it's beneficial to do upfront data preparation to build variants. RAM is pretty cheap. Thinking... I imagine "AI semantic autocomplete" will be happening pretty soon, which is exciting.
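To make the variant idea concrete, here's a minimal prefix-trie sketch (this is not the 'trie-search' package's API, just a hedged illustration): each title is inserted both verbatim and with the leading article stripped, and every trie node remembers the items below it, trading RAM for lookup speed.

```javascript
// Minimal prefix trie: insert titles plus stripped variants, then walk
// the trie character by character per query prefix.
class Trie {
  constructor() {
    this.children = new Map();
    this.items = new Set();
  }
  insert(key, item) {
    let node = this;
    for (const ch of key.toLowerCase()) {
      if (!node.children.has(ch)) node.children.set(ch, new Trie());
      node = node.children.get(ch);
      node.items.add(item); // every node along the path remembers the item
    }
  }
  search(prefix) {
    let node = this;
    for (const ch of prefix.toLowerCase()) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    return [...node.items];
  }
}

const trie = new Trie();
const title = "The Great Gatsby";
trie.insert(title, title);
trie.insert("Great Gatsby", title); // variant with the leading article removed
```

Searching "great" now finds the title even though the canonical form starts with "The" - the upfront variant preparation pays for the extra memory.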
We use Algolia, and we have one index that is 10 million records, one that is 5 million, and one that is 4 million. Performance and reliability have never been an issue. In fact, it's the only thing in our stack that we never have issues with. (knock on wood)
I feel I must mention that there exists a completely different approach to solving this problem, with HTMX or other "hypermedia" (let's keep all state on server) tools:
I appreciate where this comment is coming from, and I love efforts to simplify and remove over-engineering, but htmx doesn’t really keep all state on the server, since once the data is displayed, you now have another holder of state (HTML).
As soon as I started reading the docs on hx-sync and started imagining scenarios where that sort of functionality has been needed, I started thinking “man once the state gets this complex I’d really rather manage it in a proper programming language instead.”
I think it’s often reasonable to consider the DOM the “database” of the frontend, but it can also be nice to have an intermediary representation to create the appropriate transition. Inversion of control, you might say.
This library does NOT send any keystrokes to Algolia and is NOT coupled to Algolia's servers, unless you explicitly configure it to use their services.
And one nice thing is: you can add multiple sources to it, meaning you can search various local and/or remote sources/servers at the same time and integrate all results into one result list with rich layout possibilities.
While this is true, some plugins, like the query suggestions one, have a very 'Algolian' format and don't work out of the box without an additional wrapper around the response, as the format differs from the output of the main plugin.
Perhaps this is tangential, but I'm always amazed when input forms have a "Go" or "Send" button even though they're supposed to do some on-the-fly or local computation or treatment of the input. So, why wait? Why force the user to perform yet another action? Programming allows for the process to start from the first character or number being typed into the form.
The fact that developers still put a "Send" button on most forms these days kind of amazes me. That makes me wonder. Am I wrong? When is it useful to have a "Go" or "Send" button, and demand that extra action from the user?
In a lot of applications, there is no need to signal readiness. You can simply start processing the input at once, and display it continuously, from the first character or number that is entered. This is true for most applications that run client side. We're talking everything from small calculators to text processing. In most cases the "OK" button can be completely omitted. It's of course a different matter if you need to digest it remotely.
Are you saying that forms which are not submitted shouldn't have a separate submit button?
If so, I might agree, up to a point.
However any form that has to be submitted to a server absolutely MUST have a submit button, for the case when user input is complete and the user is still reviewing all the fields.
As a user, I don't know if it only does local processing, and I prefer to click the button explicitly. Eg. if I accidentally paste a password and it starts loading stuff I would be alarmed.
Do you know what approach is better for autocomplete search: fuzzy search or tf-idf? Recently I read this post [0] about indexing Wikipedia using tf-idf. However, I've seen that fuzzy search is often used for autocomplete [1].
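For context on the tf-idf half of the question, here's a back-of-the-envelope scoring sketch over a toy corpus (illustrative only; the weighting and corpus are made up, and real autocomplete indexes are far more involved):

```javascript
// Toy tf-idf: term frequency in the document, scaled down by how many
// documents contain the term at all.
const docs = [
  "the quick brown fox",
  "the lazy dog",
  "quick quick fox",
];

function tfidf(term, doc, allDocs) {
  const words = doc.split(" ");
  const tf = words.filter(w => w === term).length / words.length;
  if (tf === 0) return 0; // term absent: score 0, avoids log(N/0)
  const df = allDocs.filter(d => d.split(" ").includes(term)).length;
  const idf = Math.log(allDocs.length / df);
  return tf * idf;
}
```

Fuzzy search and tf-idf solve different problems: tf-idf ranks documents by how characteristic a (correctly spelled) term is, while fuzzy matching tolerates typos and partial input, which is why the two often get combined for autocomplete.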
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/da...