Elasticsearch Query DSL and Java API

Elasticsearch Java API

Java Low Level REST Client

  • Since 5.6.x

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-client</artifactId>
        <version>8.10.1</version>
    </dependency>

Java High Level REST Client

  • 5.6.x~7.17.x, Deprecated

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.17.13</version>
    </dependency>

Java Transport Client

  • 5.0.x~7.17.x, Deprecated

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>transport</artifactId>
        <version>7.17.13</version>
    </dependency>

Java API Client

  • Since 7.15.x

  • Depends on the Low Level REST Client

    <dependency>
        <groupId>co.elastic.clients</groupId>
        <artifactId>elasticsearch-java</artifactId>
        <version>8.10.0</version>
    </dependency>

Query

Basic query

Query DSL

{
  "from": 0,
  "size": 10,
  "sort": [
    {
      "pub_time": {
        "order": "desc"
      }
    }
  ],
  "query": {
    "bool": {
      "must": [],
      "must_not": [],
      "should": []
    }
  },
  "aggs": {
    "term_aggregation": {
      "terms": {
        "field": "category"
      }
    }
  }
}

Java Low Level REST Client

// 7.x and later: method, endpoint, parameters and body are carried by a Request object
Response performRequest(Request request) throws IOException
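
The low level client does not build queries; it sends whatever JSON string it is given. A minimal sketch of assembling the basic query body with the JDK only is shown here. SearchBodySketch, quote and basicSearchBody are hypothetical names, the query is simplified to match_all, and the escaping only covers quotes and backslashes, so a JSON library is the safer choice for real input.

```java
/** Hypothetical helper: builds a JSON search body for the low level client. */
final class SearchBodySketch {

    /** Wraps a value in double quotes, escaping backslashes and double quotes only. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Returns a body with paging, a single-field sort and a match_all query. */
    static String basicSearchBody(int from, int size, String sortField, boolean descending) {
        String order = descending ? "desc" : "asc";
        return "{"
            + "\"from\":" + from + ","
            + "\"size\":" + size + ","
            + "\"sort\":[{" + quote(sortField) + ":{\"order\":\"" + order + "\"}}],"
            + "\"query\":{\"match_all\":{}}"
            + "}";
    }

    public static void main(String[] args) {
        // With the real client the string would be handed over roughly like this
        // (restClient is assumed to be an already configured RestClient):
        //   Request request = new Request("GET", "/indexName/_search");
        //   request.setJsonEntity(body);
        //   Response response = restClient.performRequest(request);
        String body = basicSearchBody(0, 10, "pub_time", true);
        System.out.println(body);
    }
}
```

Keeping the body construction in one small method makes it easy to log the exact Query DSL that is sent.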

Java High Level REST Client

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
queryBuilder.must(xxx);
queryBuilder.mustNot(xxx);
queryBuilder.should(xxx);
searchSourceBuilder.from(0);
searchSourceBuilder.size(10);
searchSourceBuilder.sort("pub_time", SortOrder.DESC);
searchSourceBuilder.query(queryBuilder);
searchSourceBuilder.aggregation(
    AggregationBuilders.terms("term_aggregation")
        .field("category")
);
SearchRequest searchRequest = new SearchRequest("indexName");
searchRequest.source(searchSourceBuilder);
// Print the generated Query DSL
// System.out.println(searchRequest.source().toString());
// RequestOptions is a required argument in 7.x
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

Java API Client

// When your index contains semi-structured data, or you don't have a domain object definition,
// you can read the document as raw JSON. Use Jackson's ObjectNode or any JSON representation
// that can be deserialized by the JSON mapper associated with the ElasticsearchClient.
SearchResponse<ObjectNode> response = client.search(s -> s
        .index("indexName")
        .from(0)
        .size(10)
        .sort(so -> so
            .field(FieldSort.of(f -> f
                .field("pub_time")
                .order(SortOrder.Desc))
            )
        )
        .query(q -> q
            .bool(b -> b
                .must(m -> m.term(t -> t
                    .field("name")
                    .value("value")
                ))
            )
        )
        .aggregations("term_aggregation", a -> a
            .terms(t -> t.field("category"))
        ),
    ObjectNode.class
);

Specify query fields

Query DSL

{
  "_source": ["author", "host"],
  "query": {}
}

Java High Level REST Client

searchSourceBuilder.fetchSource(new String[]{"author", "host"}, null);

Query by id

Query DSL

GET /my_index/_doc/{document_id}
// or, before 7.0, when mapping types were still in use
GET /my_index/{doc_type}/{document_id}

Java High Level REST Client

GetRequest getRequest = new GetRequest(indexName).id(id);
GetResponse getResponse = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);

Query by ids

Query DSL

GET /my_index/_search
{
  "query": {
    "ids": {
      "values": ["202308227d464b3da5b01f966458cafa", "20230822dfc84f58b7c8243013da3063"]
    }
  }
}
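
No Java counterpart is listed for the ids query. As a rough sketch, the body can be generated from a list of IDs with the JDK alone and sent as a raw JSON string. IdsQuerySketch and idsQueryBody are hypothetical names, and the escaping only handles quotes and backslashes.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: renders an ids query body from a list of document IDs. */
final class IdsQuerySketch {

    /** Returns {"query":{"ids":{"values":[...]}}} with every ID quoted. */
    static String idsQueryBody(List<String> ids) {
        String values = ids.stream()
            // Escape backslashes and double quotes so each ID stays a valid JSON string.
            .map(id -> "\"" + id.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
            .collect(Collectors.joining(","));
        return "{\"query\":{\"ids\":{\"values\":[" + values + "]}}}";
    }

    public static void main(String[] args) {
        System.out.println(idsQueryBody(java.util.Arrays.asList(
            "202308227d464b3da5b01f966458cafa",
            "20230822dfc84f58b7c8243013da3063")));
    }
}
```

With the High Level REST Client the same condition can be expressed with QueryBuilders.idsQuery().addIds(...).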

Conditions

wildcard

Query DSL

{
  "wildcard": {
    "ip_region": "*山东*"
  }
}

Java High Level REST Client

WildcardQueryBuilder ipRegionQuery = QueryBuilders.wildcardQuery("ip_region", "*山东*");
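
In a wildcard pattern, * matches zero or more characters and ? matches exactly one. The following sketch mimics that rule on a single string by translating the pattern to a Java regex. WildcardSketch is a hypothetical helper, backslash escaping is not handled, and the real query is evaluated against indexed terms, so on an analyzed text field it sees individual tokens rather than the whole field value.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that mimics wildcard matching against a single term. */
final class WildcardSketch {

    /** Translates a wildcard pattern ('*' and '?') into an equivalent Java regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                flushLiteral(literal, regex);
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** Appends the pending literal text, quoted so regex metacharacters stay literal. */
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    /** True when the whole term matches the wildcard pattern. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        // The region name from the ip_region example, written with Unicode escapes.
        String pattern = "*\u5c71\u4e1c*";
        System.out.println(matches(pattern, "\u5c71\u4e1c\u6d4e\u5357")); // true
        System.out.println(matches("a?c", "ac"));                         // false
    }
}
```

A pattern that starts with * has to be checked against every term in the field, which is why leading wildcards tend to be slow on large indices.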

Logical Operation

must/must_not

Query DSL

{
  "bool": {
    "must": [
      {
        "match_phrase": {
          "title": "医院"
        }
      }
    ]
  }
}

Java High Level REST Client

BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.matchPhraseQuery("title", "医院"));

should

Query DSL

{
  "bool": {
    "should": [
      {
        "match_phrase": {
          "title": "医院"
        }
      },
      {
        "match_phrase": {
          "content": "医院"
        }
      }
    ],
    "minimum_should_match": 1
  }
}

Java High Level REST Client

BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.minimumShouldMatch(1);
boolQueryBuilder.should(QueryBuilders.matchPhraseQuery("title", "医院"));
boolQueryBuilder.should(QueryBuilders.matchPhraseQuery("content", "医院"));
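
The effect of minimum_should_match can be read as a counting rule: a document is accepted when at least that many should clauses match. The sketch below models only this rule with plain predicates. ShouldClauseSketch and containsPhrase are hypothetical, the substring test is a crude stand-in for match_phrase, and scoring is ignored.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical model of how should clauses and minimum_should_match combine. */
final class ShouldClauseSketch {

    /** True when at least minimumShouldMatch of the should clauses accept the document. */
    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> should,
                           int minimumShouldMatch) {
        long matched = should.stream().filter(clause -> clause.test(doc)).count();
        return matched >= minimumShouldMatch;
    }

    /** Crude stand-in for match_phrase: a substring test on one field. */
    static Predicate<Map<String, String>> containsPhrase(String field, String phrase) {
        return doc -> doc.getOrDefault(field, "").contains(phrase);
    }

    public static void main(String[] args) {
        List<Predicate<Map<String, String>>> should = java.util.Arrays.asList(
            containsPhrase("title", "hospital"),
            containsPhrase("content", "hospital"));
        Map<String, String> doc = java.util.Collections.singletonMap("title", "city hospital");
        System.out.println(matches(doc, should, 1)); // true: the title clause matches
        System.out.println(matches(doc, should, 2)); // false: only one clause matches
    }
}
```

When a bool query has no must or filter clause, Elasticsearch applies a default of 1; with must or filter present the default is 0 and should clauses only influence the score.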

Aggregation

terms

Query DSL

{
  "aggs": {
    "term_aggregation": {
      "terms": {
        "field": "category"
      }
    }
  }
}

Java High Level REST Client

searchSourceBuilder.aggregation(
    AggregationBuilders.terms("term_aggregation")
        .field("category")
);

Scroll

Elasticsearch no longer recommends using the scroll API for deep pagination. To preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

To use scrolling, the initial search request specifies the scroll parameter in the query string, e.g. ?scroll=1m, which tells Elasticsearch how long to keep the "search context" alive.

The size parameter sets the maximum number of hits returned with each batch of results. Each call to the scroll API returns the next batch until no results are left, i.e. the hits array is empty.

POST /my-index-000001/_search?scroll=1m
{
  "size": 100,
  "query": {
    "match": {
      "message": "foo"
    }
  }
}

The result from the above request includes a _scroll_id, which should be passed to the scroll API in order to retrieve the next batch of results.

POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}
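
The two requests form a loop: search once, then keep calling the scroll API with the latest _scroll_id until an empty batch comes back. A minimal sketch of that control flow follows. ScrollBackend, Page and the in-memory backend are hypothetical stand-ins for the real REST calls, so the block runs without a cluster.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the scroll loop; no Elasticsearch client is involved. */
final class ScrollLoopSketch {

    /** One batch of hits plus the id needed to request the next batch. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Stand-in for the two REST calls: the initial search and the scroll API. */
    interface ScrollBackend {
        Page search(int size);
        Page scroll(String scrollId);
    }

    /** Collects all hits by following scroll ids until an empty batch is returned. */
    static List<String> readAll(ScrollBackend backend, int size) {
        List<String> all = new ArrayList<>();
        Page page = backend.search(size);
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            // Always pass on the most recently returned scroll id.
            page = backend.scroll(page.scrollId);
        }
        return all;
    }

    /** In-memory backend used only to exercise the loop. */
    static ScrollBackend inMemory(List<String> docs) {
        return new ScrollBackend() {
            private int offset = 0;
            private int batchSize = 0;

            public Page search(int size) {
                batchSize = size;
                return next();
            }

            public Page scroll(String scrollId) {
                return next();
            }

            private Page next() {
                int end = Math.min(offset + batchSize, docs.size());
                List<String> hits = new ArrayList<>(docs.subList(offset, end));
                offset = end;
                return new Page("scroll-" + offset, hits);
            }
        };
    }
}
```

Against a real cluster, the search context expires after the scroll timeout, and it can be released earlier with the clear scroll API once the loop is done.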

Update

Update by document ID

Using doc - to update multiple fields at once

POST /my_index/_update/{document_id}
{
  "doc": {
    "field1": "updated_value1",
    "field2": "updated_value2"
  }
}

Using script

POST /my_index/_update/{document_id}
{
  "script": {
    "source": "ctx._source.field_name = params.new_value",
    "lang": "painless",
    "params": {
      "new_value": "updated_value"
    }
  }
}

Java High Level REST Client

// Create an instance of the UpdateRequest class
UpdateRequest request = new UpdateRequest("your_index", "your_id");

// Prepare the update request
Map<String, Object> updatedFields = new HashMap<>();
updatedFields.put("field1", "updated value");
updatedFields.put("field2", "another updated value");
request.doc(updatedFields);

// Execute the update request
UpdateResponse response = client.update(request, RequestOptions.DEFAULT);

// Check the response status
if (response.status() == RestStatus.OK) {
    System.out.println("Document updated successfully");
} else {
    System.out.println("Failed to update document: " + response.status().name());
}

Update by document ids

Query DSL

POST /your-index/_update_by_query
{
  "query": {
    "ids": {
      "values": ["xxx", "xxx"]
    }
  },
  "script": {
    "source": "ctx._source.field_name = 'updated-value'"
  }
}

Java High Level REST Client

BulkRequest bulkRequest = new BulkRequest();
for (String id : ids) {
    XContentBuilder contentBuilder = XContentFactory.jsonBuilder()
        .startObject()
        .field("status", "0") // update status to "0"
        .endObject();

    // Typeless constructor (7.x); the mapping type argument is no longer needed
    UpdateRequest updateRequest = new UpdateRequest(indexName, id)
        .doc(contentBuilder);

    bulkRequest.add(updateRequest);
}

BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);

if (bulkResponse.hasFailures()) {
    System.out.println("has failures");
    // Handle failure cases
} else {
    // Handle success cases
}

Update by query

Query DSL

POST /your-index/_update_by_query
{
  "query": {
    "term": {
      "field": "value"
    }
  },
  "script": {
    "source": "ctx._source.field = 'updated-value'"
  }
}