Manticore Search 5

Sergey Nikolaev - May 27 '22 - - Dev Community


Today the Manticore team is thrilled to announce Manticore Search 5.0.0. It took us almost 5 months, 450 commits and almost 50 thousand changed lines of code. We want to thank everyone who has helped us along the way. Please welcome the new version, Manticore Search 5.0.0!

All stored by default

Our long-time users might remember that when Manticore was forked from Sphinx back in 2017, its database capabilities were very limited: it was more of an extension for another database than a database in its own right. For example:

  • If you wanted to store the original contents, you had to create a string attribute and put them there. That could take a lot of RAM, since Manticore then had only row-wise storage, which requires keeping all attributes in RAM.
  • You couldn’t even run CREATE TABLE to create a new table; you could only add a new schema to the configuration file.

Since then we have added to Manticore:

  • Document storage for full-text fields, which made it possible to save RAM by storing original documents on disk.
  • Real-time mode, to enable commands like CREATE TABLE.
  • A new data type, text, which is indexed and stored by default.

But in the plain mode you still had to specify stored_fields = ....

You don’t have to any more: since Manticore 5, all fields are stored by default. If you don’t need that, you can disable it by setting stored_fields to an empty value, which makes all fields non-stored.

Secondary indexes

Manticore 5 enables support for Manticore Columnar Library 1.15.2, which adds support for secondary indexes. The functionality is provided in Manticore 5 as a beta, so:

  • Building secondary indexes is on by default for plain and real-time, columnar and row-wise indexes (if Manticore Columnar Library is in use),
  • but to enable them for searching you need to set secondary_indexes = 1, either in your configuration file or using SET GLOBAL.

The new functionality is supported on all operating systems except the old Debian Stretch and Ubuntu Xenial.

Manticore Columnar Library uses the Piecewise Geometric Model (PGM) index, which exploits a learned mapping between the indexed keys and their location in memory. The succinctness of this mapping, coupled with a peculiar recursive construction algorithm, makes the PGM-index a data structure that dominates traditional indexes by orders of magnitude in space while still offering the best query and update time performance.
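
To give a feel for what a "learned mapping between keys and their location" means, here is a minimal Python sketch. It is not the Manticore Columnar Library code and not the actual PGM construction: it uses a simpler greedy "shrinking cone" segmentation and a single level instead of the recursive one. The names build_segments, lookup and the epsilon parameter are made up for this illustration. The idea is that each segment is a straight line that predicts a key's position to within epsilon, so a lookup only has to binary-search a tiny window around the prediction.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


def build_segments(keys, epsilon):
    """Split sorted, distinct numeric keys into linear segments.

    Each segment is (first_key, first_pos, slope) and predicts the position
    of every key it covers with an error of at most `epsilon`.
    """
    segments = []
    start, lo, hi = 0, 0.0, INF  # [lo, hi] is the range of still-valid slopes
    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]
        dy = i - start
        new_lo = max(lo, (dy - epsilon) / dx)
        new_hi = min(hi, (dy + epsilon) / dx)
        if new_lo > new_hi:
            # No single slope fits this point too: close the segment here.
            segments.append((keys[start], start, lo if hi == INF else (lo + hi) / 2))
            start, lo, hi = i, 0.0, INF
        else:
            lo, hi = new_lo, new_hi
    if keys:
        segments.append((keys[start], start, lo if hi == INF else (lo + hi) / 2))
    return segments


def lookup(keys, segments, key, epsilon):
    """Return the position of `key` in `keys`, or None if it is absent."""
    first_keys = [seg[0] for seg in segments]  # rebuilt per call to keep the sketch short
    idx = bisect_right(first_keys, key) - 1
    if idx < 0:
        return None
    first_key, first_pos, slope = segments[idx]
    predicted = first_pos + int(slope * (key - first_key))
    # One extra slot on each side absorbs truncation and float rounding.
    hi = min(len(keys), predicted + epsilon + 2)
    lo = min(max(0, predicted - epsilon - 1), hi)
    i = bisect_left(keys, key, lo, hi)
    return i if i < hi and keys[i] == key else None
```

The space saving comes from storing only the segments (three numbers each) instead of one index entry per key; how many segments you get depends on the data and on epsilon.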

Pseudo sharding by default

The pseudo sharding added in previous releases has been tested and optimized and is now enabled by default. As a reminder, pseudo sharding parallelizes search query execution to utilize all your CPU cores.

Web Command Line Interface

Manticore 5 provides a new /cli endpoint that makes running SQL queries over HTTP even easier.
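
As a rough sketch of how this could look from Python: the snippet below only builds the request and does not send it. It assumes that the HTTP listener is on 127.0.0.1:9308 and that /cli takes the raw SQL statement as the POST body; check both against your own setup. The helper name cli_request is made up for this example.

```python
from urllib.request import Request

# Assumption: Manticore's HTTP listener runs locally on port 9308.
MANTICORE_HTTP = "http://127.0.0.1:9308"


def cli_request(sql, base_url=MANTICORE_HTTP):
    """Build a POST request carrying a raw SQL statement to the /cli endpoint."""
    return Request(f"{base_url}/cli", data=sql.encode("utf-8"), method="POST")


req = cli_request("SHOW TABLES")

# To actually run it against a live server:
# from urllib.request import urlopen
# print(urlopen(req).read().decode("utf-8"))
```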


Read-only mode

The new read-only mode allows you to specify listeners that process only read queries and discard any writes. It can be useful if you want to make Manticore Search accessible from the Internet or from a less secure part of your local network, or if you just want to make sure that an application which is supposed to only read can’t modify your data in any way.

Faster data loading

Previously you could send multiple write commands via the HTTP JSON protocol, but they were processed one by one. Now they are handled as a single transaction, so bulk INSERT/REPLACE/DELETE via JSON over HTTP is now as bulk as via SQL. And there is more:
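
For illustration, a batch of inserts can be prepared like this. The sketch assumes the newline-delimited JSON format of Manticore's /bulk endpoint (one command per line, sent with Content-Type: application/x-ndjson); the table name "products" and the helper bulk_payload are invented for the example, and nothing is sent over the network.

```python
import json


def bulk_payload(table, docs):
    """Serialize (id, document) pairs as one NDJSON insert command per line."""
    lines = [
        json.dumps({"insert": {"index": table, "id": doc_id, "doc": doc}})
        for doc_id, doc in docs
    ]
    # Every command, including the last one, ends with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


payload = bulk_payload(
    "products",  # hypothetical table name
    [(1, {"title": "Crossbody bag"}), (2, {"title": "Leather wallet"})],
)
# POST `payload` to /bulk in a single request; per the text above,
# Manticore 5 applies the whole batch as one transaction.
```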

Chunked transfer encoding

Manticore 5 adds support for chunked transfer encoding in the HTTP protocol. You can now use chunked transfer in your application to send large batches with lower resource consumption (since you don’t need to calculate Content-Length). On the server side, Manticore now always processes incoming HTTP data in a streaming fashion, without waiting for the whole batch to be transferred as it did previously, which:

  • decreases peak RAM consumption, which lowers the chance of OOM
  • decreases response time (our tests showed an 11% decrease for processing a 100MB batch)
  • lets you overcome max_packet_size and transfer batches much larger than the largest allowed value of max_packet_size (128MB), e.g. 1GB at once.
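
To show what chunked transfer looks like on the wire, here is a small Python sketch of the standard HTTP/1.1 framing: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. The function name frame_chunks is made up for this example. In practice you rarely frame chunks by hand; for instance, Python's http.client applies this encoding itself when the request body is an iterable and no Content-Length is given.

```python
def frame_chunks(pieces):
    """Yield HTTP/1.1 chunked-encoding frames for an iterable of byte strings."""
    for piece in pieces:
        if piece:  # an empty chunk would signal the end of the body too early
            yield b"%x\r\n%s\r\n" % (len(piece), piece)
    yield b"0\r\n\r\n"  # terminating zero-length chunk


# The sender never needs to know the total size up front:
wire = b"".join(frame_chunks([b'{"insert": 1}\n', b'{"insert": 2}\n']))
```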

100 Continue

In addition, the HTTP protocol now supports 100 Continue, which lets you transfer large batches from curl (including the curl libraries used by various programming languages) faster. By default curl sends Expect: 100-continue and waits some time before actually sending the batch. Previously you had to add an empty Expect: header to avoid that wait; now it’s not needed.

All of this makes data loading via HTTP faster in Manticore 5.

Manticore Search without full-text

Being a full-text search engine, Manticore Search has always required at least one full-text field. Not anymore: you can now use Manticore even in cases that have nothing to do with full-text search.

Fast fetching columnar attributes

In Manticore 5 we added fast fetching for attributes backed by Manticore Columnar Library: queries like select * from <columnar table> are now much faster than before, especially if there are many fields in the schema.

Implicit cutoff

Manticore now avoids spending time and resources on data you don’t need in the result set by choosing an optimal cutoff value automatically. For some queries this improves performance a lot. The downside is that it affects total_found in SHOW META and hits.total in the JSON output: the value is now accurate only when you see total_relation: eq, while total_relation: gte means the actual number of matching documents is greater than the value you got. To retain the previous behaviour, use the search option cutoff=0, which makes total_relation always eq.
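
In client code this means a total should be read together with its relation. A minimal sketch, assuming the JSON response exposes the two values as hits.total and hits.total_relation; the helper describe_total is a name invented for this example:

```python
def describe_total(hits):
    """Turn the `hits` part of a JSON search response into a readable count."""
    total = hits["total"]
    # Assumption for this sketch: a missing relation is treated as exact.
    relation = hits.get("total_relation", "eq")
    if relation == "eq":
        return f"{total} matches"
    return f"at least {total} matches"  # "gte": the total is only a lower bound


exact = describe_total({"total": 42, "total_relation": "eq"})
lower_bound = describe_total({"total": 1000, "total_relation": "gte"})
```

A UI could use this to show "1000+" style counters instead of presenting a lower bound as the exact number.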

HTTP JSON search options

Since Manticore 5 you can set various search options when using Manticore via the HTTP JSON protocol or any client based on it.
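
A sketch of what such a request body might look like when built from Python. It assumes the options go into a top-level "options" object of the search request and uses cutoff as the example option; the table name and the helper search_body are hypothetical, so check the exact option names against the documentation.

```python
import json


def search_body(table, query, **options):
    """Build a JSON search request body, attaching search options if any are given."""
    body = {"index": table, "query": query}
    if options:
        body["options"] = options
    return json.dumps(body)


# Match everything in a hypothetical "products" table and ask for exact totals.
body = search_body("products", {"match_all": {}}, cutoff=0)
```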

HTTP JSON nested filters

The JSON protocol also supports nested filters since Manticore 5. Previously you couldn’t express things like a=1 and (b=2 or c=3) in JSON: must (AND), should (OR) and must_not (NOT) worked only at the top level. Now they can be nested.
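
Here is how a=1 and (b=2 or c=3) could be written as a nested filter, sketched as a Python dictionary. The exact clause names (bool, equals) follow our reading of the JSON query syntax and should be checked against the documentation. The small matches function is not part of Manticore: it is a local toy evaluator, written only to make the AND/OR/NOT semantics described above explicit.

```python
# a=1 AND (b=2 OR c=3), with the OR nested inside the AND
NESTED_FILTER = {
    "bool": {
        "must": [
            {"equals": {"a": 1}},
            {"bool": {"should": [{"equals": {"b": 2}}, {"equals": {"c": 3}}]}},
        ]
    }
}


def matches(node, doc):
    """Toy evaluator: does `doc` (a dict of attributes) satisfy the filter `node`?"""
    if "equals" in node:
        ((attr, value),) = node["equals"].items()
        return doc.get(attr) == value
    clauses = node["bool"]
    must = clauses.get("must", [])
    should = clauses.get("should", [])
    must_not = clauses.get("must_not", [])
    return (
        all(matches(c, doc) for c in must)  # AND
        and (not should or any(matches(c, doc) for c in should))  # OR
        and not any(matches(c, doc) for c in must_not)  # NOT
    )
```

For example, matches(NESTED_FILTER, {"a": 1, "b": 0, "c": 3}) is true, while a document with a=1 but neither b=2 nor c=3 is rejected.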

And many more

The above is only a part of what has been done since Manticore 4.2.0 and has now become generally available in the new release. Please read about:

🚀 13 major changes
✅ 20+ minor changes
🐞 45 bug fixes

in the changelog.

We hope you’ll enjoy using the new version of Manticore Search. Please share your feedback about it with us.