Painless scripts let you customize many things in Elasticsearch. One thing that (almost) every script has in common is access to document fields. There are two different ways to do this, and every developer should know both, because the choice can have a huge impact on performance.
Let’s start with some sample data.
"name": "Winter Monk",
"name": "Jungle Banana",
"name": "Green Flash",
As explained in the last article, we could use scripted sorting to add some custom sort logic.
This script maps the alignment to a custom sort order (that differs from the natural order of the alignment strings).
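A request of this kind might look roughly like the following sketch. It is written as a Python dict so it can be sent with any HTTP client; the Painless source is the part that matters. The alignment values (good, neutral, evil), their ranks, and the fallback rank 99 are assumptions for illustration, and the alignment field is assumed to be mapped as keyword.

```python
import json

# Hypothetical alignment values and ranks; adjust to the values in your data.
ALIGNMENT_RANK = {"good": 0, "neutral": 1, "evil": 2}

# Painless: read the field from doc values and look up its rank.
# Missing or unknown alignments sort last (rank 99 is an arbitrary choice).
SORT_SCRIPT = (
    "if (doc['alignment'].size() == 0) { return 99; } "
    "def rank = params.ranks.get(doc['alignment'].value); "
    "return rank == null ? 99 : rank;"
)

search_body = {
    "query": {"match_all": {}},
    "sort": {
        "_script": {
            "type": "number",
            "order": "asc",
            "script": {
                "lang": "painless",
                "source": SORT_SCRIPT,
                "params": {"ranks": ALIGNMENT_RANK},
            },
        }
    },
}

# Body for a _search request against your index.
print(json.dumps(search_body, indent=2))
```

Passing the rank table in through params rather than hard-coding it in the script source keeps the script text stable, so Elasticsearch can reuse the compiled script when only the ranks change.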
Two different ways to access document attributes
The example above shows one way to access document fields. The keyword doc refers to the document context, whose contents can be accessed dictionary-style, for example doc['alignment'].value.
This is the recommended way. It uses a special data structure called doc_values that is created at index time. Think of it as a mapping from each document to the values of each of its fields, the opposite direction of the inverted index. It is used for sorting, aggregations and fast lookup of values from scripts. The structure is stored on disk in a column-oriented layout, and the entries that are needed are served from memory through the filesystem cache. That requires more memory but results in faster execution. And since search is (in most cases) about query speed, this is the approach you should go for.
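To make the "mapping from document to field values" idea more concrete, here is a toy model in plain Python. It is only an illustration of the column-oriented shape: the real doc_values are a compressed Lucene structure, not Python dicts, and the alignment values below are made up.

```python
from collections import defaultdict

# Toy model only. The "alignment" values are hypothetical.
documents = {
    0: {"name": "Winter Monk", "alignment": "good"},
    1: {"name": "Jungle Banana", "alignment": "neutral"},
    2: {"name": "Green Flash", "alignment": "evil"},
}


def build_columns(docs):
    """Turn row-shaped documents into one column per field."""
    columns = defaultdict(dict)
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            columns[field][doc_id] = value
    return columns


columns = build_columns(documents)

# Similar in spirit to doc['alignment'].value for document 1:
# pick the column first, then the document.
print(columns["alignment"][1])  # prints: neutral
```

The point of the layout is that reading one field for many documents touches only that field's column, which is exactly what sorting and aggregations need.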
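It works best for single-valued fields: .value returns only one value, the values of an array come back sorted rather than in their original order, and nested object structures are not available. Also, analyzed text fields have no doc_values (they would need fielddata, which loads all terms of the field into memory), so this approach should be used for non-analyzed fields (keywords, numbers).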
The other option is accessing the document source directly.
This gives you full access to the document, including arrays and nested objects. It also lets you read document fields that were not indexed. But there is a pitfall: Elasticsearch has to load and parse the document source to retrieve the values, and it has to do that for every document the script runs on. That eats a lot of time. Whenever possible, you should avoid it.
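As a side-by-side sketch of the two access paths, the following request body defines two script fields that return the same value, once from doc_values and once from the parsed source. It is again a Python dict that can be sent as the body of a search request. The field name alignment comes from the example above; the script field names are made up for illustration.

```python
import json

search_body = {
    "query": {"match_all": {}},
    "script_fields": {
        # Fast path: reads the value from doc_values.
        # Note: .value fails on documents that have no value for the field.
        "alignment_from_doc_values": {
            "script": {
                "lang": "painless",
                "source": "doc['alignment'].value",
            }
        },
        # Slow path: loads and parses the stored JSON source of each hit.
        "alignment_from_source": {
            "script": {
                "lang": "painless",
                "source": "params['_source']['alignment']",
            }
        },
    },
}

print(json.dumps(search_body, indent=2))
```

Both fields return the same string for a simple keyword field. The difference only shows up in the time the request takes, and it grows with the number of documents the script has to touch.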
Accessing fields via the source is not an option unless your index is really, really small. If you need to look up something that is not part of the doc_values, you should rather consider remodeling your index mapping.
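What such a remodeling could look like depends on your data, but a minimal sketch might be the following index definition, assuming Elasticsearch 7 or later (mappings without types). The sub-field name raw is a hypothetical choice. The idea is to map every value a script needs as keyword (or a numeric type), because those types have doc_values enabled by default.

```python
import json

index_definition = {
    "mappings": {
        "properties": {
            "name": {
                # Analyzed for full-text search; no doc_values here.
                "type": "text",
                "fields": {
                    # Non-analyzed copy, readable in scripts as doc['name.raw'].
                    "raw": {"type": "keyword"}
                },
            },
            # Non-analyzed; doc_values are enabled by default.
            "alignment": {"type": "keyword"},
        }
    }
}

# Body for creating the index, for example with a PUT request.
print(json.dumps(index_definition, indent=2))
```

With a mapping like this, a script can use doc['alignment'].value or doc['name.raw'].value and never has to touch the source.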