• Dev
  • An extension that handles fulltext search

Interesting, I was waiting for this, although I had hoped it would support a more lightweight solution like MeiliSearch/Sonic/Typesense 😂

I should be available for testing if necessary, my forum has a bit more than 500k posts and it's very active. But I need to check the cost of running Elasticsearch...

@luceos when you say managed Elastic, you mean the cloud offering that I can see here? https://www.elastic.co/pricing/

Thanks.

    010101 I honestly saw that one too late and I haven't tested it. Elastic is widely available and is known to scale well...

    matteocontrini once it's available you can give it a spin at any time. It will be open source.

    010101 It will probably scale like Meili, so it is good for small/medium forums (the biggest gain will probably be better typo handling compared to a trigram-based search in the database), but for really huge communities it will be better to use Elastic.
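To make the typo-handling point concrete, here is a minimal sketch that compares a trigram similarity score with an edit distance for a one-letter typo. The padding convention and both helper functions are assumptions for illustration only; they are not taken from Flarum, Meili or Elastic.

```python
def trigrams(word: str) -> set[str]:
    # Assumed padding convention: two leading spaces, one trailing space.
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    # Jaccard similarity: shared trigrams divided by all distinct trigrams.
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def edit_distance(a: str, b: str) -> int:
    # Plain Levenshtein distance (insert, delete, substitute), row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(trigram_similarity("forum", "forun"))
    print(edit_distance("forum", "forun"))
```

A single wrong letter changes several trigrams at once, so a short word loses a large share of its similarity score, while the edit distance stays at one.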

    9 days later

    I have just tagged 0.0.6 and made the extension open source. Once I feel comfortable with it being used more widely I will create a discussion in Extensions. For now, feel free to give it a spin:

      luceos This extension requires PHP8. Will it be compatible with PHP7.4? flarum/akismet does not work on PHP8 so forums that use Akismet cannot use this extension.


      I see that Elasticsearch does not support my language (Polish) by default and requires a plugin. It would be nice to be able to set the analyzer language by typing it in instead of being restricted to a fixed list.

        rafaucau This extension requires PHP8. Will it be compatible with PHP7.4?

        Nope sorry. In 25 days active support for 7.4 will be dropped, after that it will only receive security updates. And yes, we can add a way to support more analyzers. I think I'll just make the list extensible from the backend then. I dislike the free choice option as it would cause too many support requests for the extension if people fill in something that Elastic doesn't understand.
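The trade-off between a free-text field and an extensible list can be sketched as a small allow-list that backend code may extend. `AnalyzerRegistry` and its methods are hypothetical names for illustration, not the extension's API.

```python
class AnalyzerRegistry:
    """Hypothetical allow-list of analyzer names that can be extended in code."""

    def __init__(self, names):
        self._names = set(names)

    def register(self, name: str) -> None:
        # Backend code (for example another extension) adds an analyzer here.
        self._names.add(name)

    def validate(self, name: str) -> str:
        # Reject anything not on the list before it reaches the search server.
        if name not in self._names:
            raise ValueError(f"Unknown analyzer: {name!r}")
        return name


if __name__ == "__main__":
    registry = AnalyzerRegistry(["english", "german", "french"])
    registry.register("polish")  # assumes a plugin providing it is installed
    print(registry.validate("polish"))
```

An unknown value fails early with a clear message instead of surfacing later as an error from the search server.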


        TODO:

        • discussion titles are not taken into account
        • offer a button in the admin area to kick off the indexing of your forum (queue is recommended for larger communities)

          luceos Nope sorry. In 25 days active support for 7.4 will be dropped, after that it will only receive security updates.

          You know that 60-70% of Flarum installations are on PHP 7?

            rob006 Given the extension is primarily designed to run in the Blomstra environment where they have full control over PHP versions and what not I don't find it surprising at all.

            Truth be told, they could just keep it closed source and available to their customers only. Instead they've open sourced it so anyone can use it (assuming the right PHP versions and what not), so I'd say that's already pretty generous.

              tankerkiller125 This information may also be useful to Blomstra. They will probably find a customer who wants to install an extension that does not support PHP8 and use Elasticsearch at the same time.
              As I mentioned earlier, even one of the official Flarum extensions doesn't work on PHP8, which probably blocks many forums from moving to the newer PHP version.

              Tagged 0.0.8 which fixes the discussion title search.

              Todo:

              • I've added an endpoint that can kick off indexing but: a) a button has to be created in the admin that hits that endpoint and b) that controller needs to execute faster as it iterates over discussions and posts in batches of 50.
              • Improve relevancy handling (discussion titles before post content for instance).
              • Allow configuring language analyzers, since Elastic allows plugins for that, or push that to post-stable as part of extender logic.

              I'll be testing my implementation on my playground community and then push it onto Greenhouse. After that it will be tested on a multi-million-post community.

              And 0.0.9 which fixes an issue with sending indexing jobs to a specific queue.

                luceos Congratulations! I'm sure this hasn't been easy. Or maybe it has.

                Tagged 0.0.14:

                • added a --continue flag to allow continuing mass indexing.
                • pushed recreation of index and mappings to --recreate flag.
                • introduced --only flag to allow indexing specific data types (posts, discussions).
                • added throttling with --throttle.
                • added sorting to queries so that retries do not read different results.
                • allow forcing the queue that indexing jobs are dispatched on.
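Why sorting matters for retries and for continuing a mass index can be illustrated with a minimal sketch. It assumes records carry a numeric `id`. The function name, the `send` callback and the `after_id` parameter are hypothetical, and the batch size of 50 is borrowed from the earlier post. This is not the extension's actual code.

```python
from typing import Callable, Iterable


def index_in_batches(
    rows: Iterable[dict],
    send: Callable[[list], None],
    batch_size: int = 50,
    after_id: int = 0,
) -> int:
    """Send rows with id > after_id in ascending id order; return the last id sent."""
    # A fixed sort order means a retry or a resumed run sees the same sequence.
    pending = sorted((r for r in rows if r["id"] > after_id), key=lambda r: r["id"])
    last_id = after_id
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        send(batch)
        last_id = batch[-1]["id"]  # checkpoint to resume from later
    return last_id


if __name__ == "__main__":
    posts = [{"id": i, "content": f"post {i}"} for i in range(120, 0, -1)]
    sent = []
    checkpoint = index_in_batches(posts, sent.append)
    print(len(sent), checkpoint)
```

Passing the returned id back as `after_id` resumes after the last record that was sent, which is the general idea behind a continue-style flag.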

                pkernstock I was afraid this question would come, and yes, that's possible. Understand that this extension stores the post or title string under the content field. Indexing for different languages (analyzers), however, requires storing that same field again, once per analyzer, so that Elastic can analyze it for that language. I'd be okay with a PR to get this functionality supported, though we don't need it right away.
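Storing the same field again per analyzer could be expressed with Elasticsearch multi-fields. The sketch below only builds the mapping as a Python dict. The `content` field name comes from the post above. The helper name is hypothetical, and the `polish` analyzer is assumed to be provided by an installed plugin. How the extension would actually structure this is not known.

```python
import json


def content_mapping(extra_analyzers):
    """Build a mapping where `content` is indexed once more per extra analyzer."""
    return {
        "properties": {
            "content": {
                "type": "text",
                # Each sub-field re-indexes the same text with another analyzer
                # and is addressed as content.<name> in queries.
                "fields": {
                    name: {"type": "text", "analyzer": name}
                    for name in extra_analyzers
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(content_mapping(["polish"]), indent=2))
```

A query would then target `content.polish` in addition to, or instead of, `content`.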

                8 days later

                This extension is now at version 0.1.4 and is running on https://forum.mutluanneleriz.com by volkan28 👏 .

                Some of the changes:

                • support for byobu private discussions page
                • excluding hidden discussions from being indexed
                • instead of only processing results through Elastic when browsing the SPA, a direct visit to the page is now also handled by Elastic
                • discussion titles now have a higher score than post content
                • added comment count for sorting by most replies
                • and much more..
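One way to give discussion titles a higher score than post content is a boosted optional clause in a bool query. This is a sketch under assumptions: the `type` field, its `discussions` value and the boost factor are hypothetical, and the extension may implement the scoring differently.

```python
import json


def build_query(text: str, title_boost: float = 2.0) -> dict:
    """Match on `content`; documents flagged as discussion titles score higher."""
    return {
        "query": {
            "bool": {
                # Every hit must match the search text.
                "must": [{"match": {"content": {"query": text}}}],
                # Optional clause: only adds score when the document is a title.
                "should": [
                    {"term": {"type": {"value": "discussions", "boost": title_boost}}}
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("fulltext search"), indent=2))
```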

                What isn't working right now is the tags page. But the question is where to move now. The following options exist:

                • start PR'ing to core to improve search and filtering there so that it will support elastic like it does now
                • continue extending blomstra/search so that it works with the most used extensions (follow-tags or discussion language for instance).
                • start working on making the elastic implementation in blomstra/search interchangeable by adding drivers

                This extension was meant to resolve the issue of searching fulltext columns. But it is already tackling filtering as well. The Gambit system that Flarum implements is a tough nut to crack, as it would require an implementation in each search/filter driver/extension.

                I'd love to hear your opinion on this.

                Great work. Elasticsearch is really doing a good job and will get better with feedback.