If you don't need it to be clever and dynamic, and you're not too bothered about styling, then HTML's <input> tag has a typeahead option using <datalist>.
Unfortunately its scrolling performance gets laggy at around 200 items or so in Chrome. Here's someone's example with 5,000 options, which is absurd, but you might expect it to perform better given that it's a native control. https://jsfiddle.net/klesun/mfgteptf/
Does this mean that every time the user types something, they have to wait a roundtrip from the server for completion? Wouldn't this make this useless for average-to-fast typers?
Htmx sounds like a lovely idea, but doesn't it introduce a ton of latency compared to client-side operations?
Well, they "wait" for a fraction of a second maybe. And you put it behind a typing delay so it responds only when the user is ready.
Also, many autocompletes do a round trip anyway. The htmx version just returns the html already rendered, rather than, say, JSON that requires further processing and injecting.
Here are a couple of Go templates from a sample project that show it:
In response to `/autoc/<id>/search`, the server just responds with a list of strings matching the user's input, formatted with the `autoc_data` template. Htmx injects that into the `<datalist>` directly.
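The response shape is simple to picture. Here's a hedged sketch of the server-side rendering (function names are hypothetical, not from the sample project): the endpoint returns ready-made `<option>` elements rather than JSON, so htmx can swap them straight into the `<datalist>` with no client-side processing.

```javascript
// Hypothetical sketch: render matching strings as a block of <option>
// elements, the kind of HTML fragment an htmx autocomplete endpoint
// could return for injection into a <datalist>.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderOptions(matches) {
  // One <option> per match; the value attribute is HTML-escaped.
  return matches.map(m => `<option value="${escapeHtml(m)}"></option>`).join("\n");
}
```

The point is that the "rendering" step is trivial string assembly on the server, so the client does nothing but swap the fragment in.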
It's how any search engine does autocompletes... you can't possibly store all the queries clientside. Whether you're sending a tiny JSON or a tiny HTML snippet back doesn't really make much of a difference. It's usually fast enough, especially if you debounce/throttle the user input so you're not sending the query on every keystroke.
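The debounce mentioned above is a few lines in any framework. A minimal trailing-edge version (a generic sketch, not tied to any particular library) looks like:

```javascript
// Trailing-edge debounce: the wrapped function only fires after `ms` of
// quiet, so a fast typist triggers one request instead of one per keystroke.
function debounce(fn, ms) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), ms);
  };
}
```

Wired to an input's `input` event with a delay of, say, 200 ms, this is exactly the "typing delay" described above.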
In my experience, typeahead is most useful on datasets that are too large to load in full on the client. There are certainly lists that are too long to fit in the available space yet unlikely to grow unbounded, but in my experience such large-but-bounded lists are rare.
"Not too bothered about styling" is quite important. Datalists are relatively uncommon and I think many users may not know what they are seeing and how they can interact with it.
Also datalists appear quite different across different browsers, which is fine of course for a native form control, but annoying if you're aiming for a more consistent look.
I just wish that you could natively do something that works more like select elements, so users see the contents of the option elements but the value of the option element is actually what is populated in the form.
The option and the form will work that way, but the input itself will then show the value instead of the option label... the input is not smart enough to present the option as a select does, it shows whatever you have in the "value" field.
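A common workaround (a hypothetical sketch, not a native feature) is to keep a label-to-value map alongside the `<datalist>`, let the options display labels, and translate back to the real value before submitting:

```javascript
// Hypothetical workaround: the <datalist> options show human-readable
// labels; before submit, map the typed label back to the underlying value.
const labelToValue = new Map([
  ["Vanilla (classic)", "vanilla"],
  ["Chocolate (dark)", "choc_dark"],
]);

function resolveValue(typed) {
  // Fall back to the raw input when it isn't one of the known labels.
  return labelToValue.get(typed) ?? typed;
}
```

It works, but it's exactly the kind of glue code a smarter native control would make unnecessary.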
At least on Edge, it got continuous substring matching right (typing "illa" shows "vanilla"), but failed at the most important type of matching: arbitrary subsequence. Not a surprise; I only know a couple of editors that support it. But it's truly the ideal form of filtering, especially when the entries are long. Think CSS: how much nicer would it be to type "mato" in devtools and get "margin-top"? Or, the ideal case that is somehow still missing, browser history entries: type the first char or two of the hostname, one or two from the path, and a couple from the query, and you've filtered everything down to just one or two options.
> Think css, how much nicer would it be to type "mato" in devtools and get "margin-top"?
I can't say that this type of search has ever crossed my mind, and it seems greatly ambiguous to me. For example, if you have "tomato" in the corpus, then that obviously supersedes "margin-top" for the search string "mato". If you know the entire corpus, then I suppose you can take advantage of shortcuts like this when you know there is only one match, but then again it would seem to me that simply writing "margin-top" would require less mental gymnastics.
If you've ever used any filtering in VS Code, this is the algorithm. Re tomato/margin-top, weight is given to matching the prefix exactly, so margin-top does indeed rank higher.
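A toy version of that algorithm shows why the ambiguity mostly resolves itself. This is an illustrative sketch with made-up weights, not VS Code's actual scorer: it accepts any subsequence match but rewards characters that line up with the start of the candidate.

```javascript
// Arbitrary-subsequence matching with a crude prefix bonus.
// Returns -1 when `query` is not a subsequence of `candidate`,
// otherwise a score where early aligned matches count double.
function fuzzyScore(query, candidate) {
  let qi = 0, score = 0;
  for (let ci = 0; ci < candidate.length && qi < query.length; ci++) {
    if (candidate[ci] === query[qi]) {
      score += ci === qi ? 2 : 1; // bonus for matching at the prefix
      qi++;
    }
  }
  return qi === query.length ? score : -1;
}
```

With these (made-up) weights, "mato" scores "margin-top" higher than "tomato", because the "ma" prefix match outweighs the contiguous-but-later "mato" run.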
I like small libraries which are just a single file. That's why I am maintaining a fork of the original typeahead library from Bootstrap. It is less than 400 lines of JavaScript:
I just went through looking for, and failing to find, a good typeahead library. I could not find one I liked. Typeahead is probably the most sophisticated UI element, and the most demanding in terms of what users expect.
I eventually rolled my own using Tailwind's scaffold: it ended up being about 300 lines of JS to handle things like keyboard actions, pressing enter, focus/non-focus, and debounced search.
Looks really good - pleasant UX and seems nicely semantic/accessible! Also seems to allow slight misspellings, which I really like.
I tried building my own autocomplete thing for ablf.io and it was *tough*. The dataset was massive, over 400k canonical titles. And I wanted to split out different variants/localizations of the titles/authors, so it's much more than that really. I ended up using the 'trie-search' npm package, as that was really the only way to get decent performance. I also did some heuristic Levenshtein distance stuff to order the results most optimally for the user. I found it helpful to add variants to the trie with vowels and common articles like 'The' and 'A' removed, since people tend to forget those when searching for book titles (then when searching across the trie I can attempt different variants).
I guess the main lessons are: it's important to pay attention to the type of data you've got and how users will try to access it; don't assume they know titles of things nor the spelling. Use trie structures! Also: with large datasets it's beneficial to do upfront data preparation to build variants. RAM is pretty cheap. Thinking... I imagine "AI semantic autocomplete" will be happening pretty soon, which is exciting.
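To make the variant idea concrete, here's a minimal prefix-trie sketch (this is not the 'trie-search' package's API, just a hedged illustration): each title is inserted both verbatim and with the leading article stripped, and every trie node remembers the items below it, trading RAM for lookup speed.

```javascript
// Minimal prefix trie: insert titles plus stripped variants, then walk
// the trie character by character per query prefix.
class Trie {
  constructor() {
    this.children = new Map();
    this.items = new Set();
  }
  insert(key, item) {
    let node = this;
    for (const ch of key.toLowerCase()) {
      if (!node.children.has(ch)) node.children.set(ch, new Trie());
      node = node.children.get(ch);
      node.items.add(item); // every node along the path remembers the item
    }
  }
  search(prefix) {
    let node = this;
    for (const ch of prefix.toLowerCase()) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    return [...node.items];
  }
}

const trie = new Trie();
const title = "The Great Gatsby";
trie.insert(title, title);
trie.insert("Great Gatsby", title); // variant with the leading article removed
```

Searching "great" now finds the title even though the canonical form starts with "The" - the upfront variant preparation pays for the extra memory.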
We use Algolia, and we have one index that is 10 million records, one that is 5 million, and one that is 4 million. Performance and reliability have never been an issue. In fact, it's the only thing in our stack that we never have issues with. (knock on wood)
I feel I must mention that there exists a completely different approach to solving this problem, with HTMX or other "hypermedia" (let's keep all state on server) tools:
I appreciate where this comment is coming from, and I love efforts to simplify and remove over-engineering, but htmx doesn’t really keep all state on the server, since once the data is displayed, you now have another holder of state (HTML).
As soon as I started reading the docs on hx-sync and started imagining scenarios where that sort of functionality has been needed, I started thinking “man once the state gets this complex I’d really rather manage it in a proper programming language instead.”
I think it’s often reasonable to consider the DOM the “database” of the frontend, but it can also be nice to have an intermediary representation to create the appropriate transition. Inversion of control, you might say.
This library does NOT send any keystrokes to Algolia and is NOT coupled to Algolia's servers, unless you explicitly configure it to use their services.
And one nice thing is: you can add multiple sources to it, meaning you can search various local and/or remote sources/servers at the same time and integrate all results into one result list with rich layout possibilities.
While this is true, some plugins, like the query suggestions one, have a very 'Algolian' format and don't work out of the box without an additional wrapper around the response, as the format differs from the output of the main plugin.
Perhaps this is tangential, but I'm always amazed when input forms have a "Go" or "Send" button even though they're supposed to do some on-the-fly or local computation or treatment of the input. So, why wait? Why force the user to perform yet another action? Programming allows for the process to start from the first character or number being typed into the form.
The fact that developers still put a "Send" button on most forms these days kind of amazes me. That makes me wonder. Am I wrong? When is it useful to have a "Go" or "Send" button, and demand that extra action from the user?
In a lot of applications, there is no need to signal readiness. You can simply start processing the input at once, and display it continuously, from the first character or number that is entered. This is true for most applications that run client side. We're talking everything from small calculators to text processing. In most cases the "OK" button can be completely omitted. It's of course a different matter if you need to digest it remotely.
Are you saying that forms which are not submitted shouldn't have a separate submit button?
If so, I might agree, up to a point.
However any form that has to be submitted to a server absolutely MUST have a submit button, for the case when user input is complete and the user is still reviewing all the fields.
As a user, I don't know if it only does local processing, and I prefer to click the button explicitly. Eg. if I accidentally paste a password and it starts loading stuff I would be alarmed.
Do you know what approach is better for autocomplete search: fuzzy search or tf-idf? Recently I read this post [0] about indexing Wikipedia using tf-idf. However, I've seen that fuzzy search is often used for autocomplete [1].
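For context on the tf-idf half of the question, here's a back-of-the-envelope scoring sketch over a toy corpus (illustrative only; the weighting and corpus are made up, and real autocomplete indexes are far more involved):

```javascript
// Toy tf-idf: term frequency in the document, scaled down by how many
// documents contain the term at all.
const docs = [
  "the quick brown fox",
  "the lazy dog",
  "quick quick fox",
];

function tfidf(term, doc, allDocs) {
  const words = doc.split(" ");
  const tf = words.filter(w => w === term).length / words.length;
  if (tf === 0) return 0; // term absent: score 0, avoids log(N/0)
  const df = allDocs.filter(d => d.split(" ").includes(term)).length;
  const idf = Math.log(allDocs.length / df);
  return tf * idf;
}
```

Fuzzy search and tf-idf solve different problems: tf-idf ranks documents by how characteristic a (correctly spelled) term is, while fuzzy matching tolerates typos and partial input, which is why the two often get combined for autocomplete.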
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/da...