11 comments

  • idoubtit 1 day ago
    I expected a toy project, but it is a usable library, which required a lot of work. Good job on delivering. A few comments:

    After reading "composer.json", I thought that the tests used a custom framework. I'm glad the project does not suffer from NIH syndrome, but the dev dependency on PHPUnit should be declared.

    There should be a warning that it's only meant for some Western Latin languages. The normalization of the input is built on a character table for a handful of cases. That's not enough for some Latin-script languages, e.g. Turkish. And any input in Cyrillic, Arabic, CJK, and so on will be ignored.

    There is no Unicode normalization or cleanup. Real-life input has many corner cases, e.g. diacritics next to the characters, or invisible characters inside a word to prevent hyphenation. Unless I'm mistaken, this engine would treat the NFD form "fête" as "fe te", instead of the expected "fete", which the NFKD form "fête" produces. I suggest using ext-intl for Unicode normalization, at least as an option.
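
To make the NFD case concrete, here is a minimal sketch of the kind of pre-processing suggested above: decompose with NFKD, drop combining marks, and strip a few invisible characters. It is written in Python for brevity and is not this library's code; the function name `fold` and the list of invisible characters are assumptions for illustration. In PHP, ext-intl's `Normalizer::normalize()` would play the role of `unicodedata.normalize`.

```python
import unicodedata

# Invisible characters sometimes found inside words (assumed, non-exhaustive):
# soft hyphen, zero-width space, ZWNJ, ZWJ, word joiner, BOM.
INVISIBLE = {"\u00ad", "\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def fold(text: str) -> str:
    """Lowercase the text and strip diacritics and invisible characters."""
    # NFKD splits precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in INVISIBLE
    ]
    return "".join(kept).lower()


nfc = "f\u00eate"    # precomposed "ê"
nfd = "fe\u0302te"   # "e" followed by a combining circumflex
print(fold(nfc), fold(nfd))  # both fold to "fete"
```

This only covers letters that have a Unicode decomposition. Letters such as "ø", "ß" or the Turkish dotless "ı" pass through unchanged and would still need a language-aware table.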

    Lastly, I can't think of a use case for this library. I've always had access to some external service (MySQL, PostgreSQL, Manticore Search, Solr, etc.) or to a PHP extension for a local SQLite with FTS. Even for hobby projects, I haven't deployed to shared hosting for more than two decades.

    • asmodios 1 day ago
      Thank you for the detailed feedback, it's genuinely valuable.

      You're right on all technical points: PHPUnit missing from dev dependencies is an oversight I'll fix, and the Unicode limitations are real and should be clearly documented. The NFD/NFKD case is a good catch.

      On the use case: fair point. My motivation came from testing MySQL and SQLite full-text search on shared OVH hosting: the performance with filters was consistently disappointing. That's the itch this scratches. I understand it doesn't match your experience, and that's perfectly legitimate.

      • S15H 1 day ago
        I would actually reconsider adding phpunit as a dev dependency. It is a tool that runs independently from your project. Therefore it should not live in composer. I would recommend declaring the phar dependency with phive.

        https://docs.phpunit.de/en/12.5/installation.html#phar-or-co...

        I find this project very impressive and have bookmarked it for potential use in future projects. Thank you for making this.

        • kassner 1 day ago
          > I would recommend declaring the phar dependency with phive

          +1. This eliminates a whole class of bugs in which you declared phpunit as a dev dependency but end up using a class that it brought in without declaring it as a regular dependency. Without an external linter, you can’t really catch that until your production install doesn’t bring the class in and the code throws a fatal error.

          • Oxodao 1 day ago
            1. Just add phpcsfixer and phpstan like any sane project

            2. If you use phpunit class in prod code, you deserve to get a fatal error

            • kassner 1 day ago
              1. Neither tool will catch it; you need something like https://github.com/maglnet/ComposerRequireChecker

              2. That doesn’t apply to PHPUnit specifically, but if you, for example, import PHP-cs-fixer as a dev dependency, it will bring in symfony/console, and if you rely on that in your own code without declaring it in composer.json as a regular dependency, the class will be missing when you run composer install for production.
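
The failure mode in point 2 boils down to a set comparison. The sketch below is a minimal illustration in Python, not how ComposerRequireChecker actually works; the function name and the namespace-to-package mapping are hypothetical inputs that you would have to derive from composer.json and the installed packages.

```python
def undeclared_namespaces(used, declared, provided_by):
    """Return namespaces used by production code whose providing package
    is not listed under "require", so it is absent after a --no-dev install.

    used        -- namespaces imported by your own production code
    declared    -- package names listed under "require" in composer.json
    provided_by -- mapping of namespace -> package that ships it
    """
    # A namespace with no known provider maps to None and is flagged too.
    return sorted(
        ns for ns in used
        if provided_by.get(ns) not in declared
    )


# Hypothetical project: symfony/console is only installed because a dev tool needs it.
used = {"Symfony\\Component\\Console", "Psr\\Log"}
declared = {"psr/log"}
provided_by = {
    "Symfony\\Component\\Console": "symfony/console",
    "Psr\\Log": "psr/log",
}
print(undeclared_namespaces(used, declared, provided_by))
```

Here only the Symfony namespace is reported, because its package is reachable solely through a dev dependency.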

              • Oxodao 1 day ago
                Ok I get what you're saying now, that's fair. Tbh I mainly do Symfony so most of what dev-dependencies use are already in the dependencies for me so it never happened
        • asimovDev 1 day ago
          Have a phpunit.xml or something similar where you can document which version of PHPUnit is required to run the tests for this library, since there are often deprecations and breakages between major versions of PHPUnit where something gets removed.
      • otterley 1 day ago
        Do you have comparative benchmarks on the filter performance? I'm particularly interested in the SQLite FTS case.
  • captn3m0 1 day ago
    Zend used to maintain a PHP port of Lucene 15 years ago that I used, but not sure what happened to it.
    • asmodios 1 day ago
      Yes, Zend_Search_Lucene was dropped from Zend Framework 2 and never officially maintained for modern PHP. There's a community fork.
      • trog 1 day ago
        Any idea if it's any good? I used the old Lucene implementation ages ago and thought it was OK, though wasn't using it in a big way.
        • asmodios 17 hours ago
          The zf1s/zend-search-lucene fork still works and gets occasional updates, but it's essentially legacy maintenance on a PHP 5.3-era codebase. It crashed on PHP 7 for a while, and the original ZF team themselves recommended moving on.
  • francislavoie 1 day ago
    We've been using https://github.com/loupe-php/loupe, works quite well for small-to-medium single-instance apps.
    • asmodios 17 hours ago
      Loupe is a great project and more feature-rich than php-fts (stemming, geo, Damerau-Levenshtein typo tolerance). The dependency difference isn't just about Composer packages. Loupe requires pdo_sqlite and SQLite >= 3.35.0, which isn't guaranteed on shared hosting.
    • reconnecting 1 day ago
      Loupe seems to have a much longer dependency list.
  • isaisabella 1 day ago
    Great start! This bridge between LIKE and a full-blown engine is exactly what's needed for the PHP long-tail.
  • ulrischa 1 day ago
    Great tool. Does it work with German umlauts (äöü)? I find it very useful because shared hosting is still big for me. I use ultra-cheap shared hosting for nearly everything. No server maintenance and no funky serverless stuff.
  • gnyman 1 day ago
    You can also do something like this with static pages using https://pagefind.app/

    I built a ChatGPT/claude history search tool and it works surprisingly well.

    There are other tools also. Not to detract from this tool but just to inform people about alternatives.

    • 4lun 22 hours ago
      Not quite a comparable alternative. I use Pagefind and it's great for static sites, but the search is all client-side JS; there's no PHP (or other) client to use its generated index on the server.

      A comparable alternative might be TNTSearch: https://github.com/teamtnt/tntsearch though that requires some (common) PHP extensions to be available, which this library does not require.

  • cpollett 1 day ago
    Code looks pretty clean, small and compact, with decent benchmarks. Might want to consider using an autoloader for classes.
    • hparadiz 1 day ago
      The PSR-4 mapping is properly defined in composer.json. There's no need to include an autoloader. Any project pulling this in would have its own.
  • BoxedEmpathy 1 day ago
    This is super cool! Thank you!