Unlocking the Power of Spring Data Elasticsearch: Returning Subobjects Only
When working with Spring Data Elasticsearch, you may need to retrieve only a specific subobject from a document rather than the whole document. This is particularly useful with large documents, because it reduces the amount of data transferred over the wire. So, is it possible in Spring Data Elasticsearch to return only a subobject?

The Short Answer: Yes, It Is Possible!

Luckily, Spring Data Elasticsearch provides an elegant solution to achieve this. By leveraging the power of Elasticsearch’s `_source` filtering feature, you can specify which fields or subobjects to include or exclude from the search results.

Understanding the `_source` Field

In Elasticsearch, each document has a special metadata field called `_source`, which stores the original JSON body submitted at index time. The `_source` field itself is not indexed and therefore not searchable, but it is what Elasticsearch returns when you fetch or search for a document. By default, `_source` is enabled and each search hit contains the entire document. However, a search request can ask for only specific fields or subobjects of `_source`, which is known as source filtering.
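To make the request side concrete, here is a minimal sketch of the `_source` clause as it appears in a raw search body. The `SourceClause` class and its methods are hypothetical helpers written for this illustration; they are not part of Spring Data Elasticsearch, which builds this clause for you.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not a Spring Data Elasticsearch class.
class SourceClause {

    public static void main(String[] args) {
        // Assemble a complete search body that matches everything but returns only "address".
        System.out.println("{\"query\":{\"match_all\":{}},"
                + sourceClause(List.of("address"), List.of()) + "}");
    }

    // Renders field paths as a JSON array of strings.
    // Assumes the paths contain no characters that need JSON escaping.
    static String jsonArray(List<String> paths) {
        return paths.stream()
                .map(p -> "\"" + p + "\"")
                .collect(Collectors.joining(",", "[", "]"));
    }

    // Builds the "_source" member of a search request body.
    static String sourceClause(List<String> includes, List<String> excludes) {
        return "\"_source\":{\"includes\":" + jsonArray(includes)
                + ",\"excludes\":" + jsonArray(excludes) + "}";
    }
}
```

Running it prints `{"query":{"match_all":{}},"_source":{"includes":["address"],"excludes":[]}}`, which is the shape of request that source filtering boils down to.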

Configuring `_source` Filtering in Spring Data Elasticsearch

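To return only a subobject in Spring Data Elasticsearch, annotate the repository method with `@SourceFilters`, which was introduced in the 5.x line. Note that `@Query` accepts only the query itself, which is sent as the value of the request's `query` element, so a `_source` clause cannot be placed inside it. Here’s an example:

@Query("{\"match_all\": {}}")
@SourceFilters(includes = "address")
List<User> findByAddress();

In this example, the `@Query` annotation supplies a `match_all` query that returns all documents, and `@SourceFilters` limits the `_source` of each hit to the `address` field. The method still returns `User` objects, but only their `address` property is populated from the source.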
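Because the repository hands back partially populated `User` objects, getting at the subobject itself takes one more step. The sketch below assumes simple `User` and `Address` records as stand-ins for your entity classes, and it simulates the repository result with a hand-built list rather than calling Elasticsearch.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class AddressExtractor {

    public static void main(String[] args) {
        // Simulated result of a source-filtered query: "name" was not requested, so it is null.
        List<User> hits = List.of(
                new User("1", null, new Address("Main Street", "Anytown")),
                new User("2", null, null)); // a document without an address
        System.out.println(addressesOf(hits));
    }

    // Maps each returned user to its address and skips users that have none.
    static List<Address> addressesOf(List<User> users) {
        return users.stream()
                .map(User::address)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }
}

// Hypothetical stand-ins for the article's entity classes.
record Address(String street, String city) {}

record User(String id, String name, Address address) {}
```

The null check matters because a document may lack the requested field entirely, in which case the mapped property stays empty.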

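Using `SourceFilter` with a Programmatic Query

Alternatively, you can build the query in code and attach a `SourceFilter` to it. `FetchSourceFilter` is the implementation that ships with Spring Data Elasticsearch. This approach allows for more flexibility and reusability, since the paths can be chosen at runtime. The sketch below assumes the 5.x `NativeQuery` builder and an injected `ElasticsearchOperations` named `operations`. Builder and constructor names differ between versions, so check them against the version you use:

// includes = {"address"}, no excludes
Query query = NativeQuery.builder().withQuery(q -> q.matchAll(m -> m))
        .withSourceFilter(new FetchSourceFilter(new String[] { "address" }, null)).build();
SearchHits<User> hits = operations.search(query, User.class);

In this example, the `FetchSourceFilter` instance lists `address` as the only path to include in the `_source` response, and the query is executed through `ElasticsearchOperations` rather than a repository method.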

Returning Subobjects with Nested Fields

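What if you need only one field nested inside a subobject? Fear not, my friend! Spring Data Elasticsearch has got you covered, because source filter paths can use dot notation:

@Query("{\"match_all\": {}}")
@SourceFilters(includes = "address.street")
List<User> findByAddressStreet();

In this example, the source filter includes only the `street` field nested within the `address` object. The field keeps its position in the document, so the result still contains an `address` object, with `street` as its only populated property.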
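The shape of a response filtered by a dotted path can be modelled locally. The following sketch is a simplified stand-in for what Elasticsearch does, not its actual implementation: it handles plain dotted paths on nested maps and ignores wildcards and arrays. The `SourceProjection` class and the sample document are assumptions made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified local model of include paths; not Elasticsearch's implementation.
class SourceProjection {

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
                "name", "Alice",
                "address", Map.of("street", "Main Street", "city", "Anytown"));
        // prints {address={street=Main Street}}
        System.out.println(project(doc, List.of("address.street")));
    }

    // Returns a copy of the document that contains only the given dotted paths.
    static Map<String, Object> project(Map<String, Object> source, List<String> paths) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String path : paths) {
            copy(source, result, path.split("\\."), 0);
        }
        return result;
    }

    @SuppressWarnings("unchecked")
    static void copy(Map<String, Object> source, Map<String, Object> target, String[] parts, int i) {
        Object value = source.get(parts[i]);
        if (value == null) {
            return; // path is not present in this document
        }
        if (i == parts.length - 1) {
            target.put(parts[i], value); // last segment: copy the value as is
            return;
        }
        if (!(value instanceof Map)) {
            return; // cannot descend into a non-object
        }
        Object existing = target.get(parts[i]);
        if (existing == value) {
            return; // the whole object was already included by an earlier path
        }
        // Recreate the intermediate object so the nesting is preserved.
        Map<String, Object> child = existing instanceof Map
                ? (Map<String, Object>) existing
                : new LinkedHashMap<String, Object>();
        copy((Map<String, Object>) value, child, parts, i + 1);
        if (!child.isEmpty()) {
            target.put(parts[i], child);
        }
    }
}
```

The output keeps the `address` wrapper around `street`, which is why the mapped entity still has an `address` property rather than a bare street value.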

Selecting Several Fields of a Subobject

When working with complex subobjects, you might be tempted to describe the wanted structure with Jackson's `ObjectNode` and concatenate it into the annotation. That does not work: annotation values must be compile-time constants, and `_source` filtering expects field paths, not a JSON object with sample values. List the paths of the fields you want instead:

@Query("{\"match_all\": {}}")
@SourceFilters(includes = { "address.street", "address.city" })
List<User> findByAddressStreetCity();

In this example, the source filter lists the `street` and `city` fields of `address`, so each returned `address` contains just those two properties.

Returning Subobjects with Arrays

What about returning subobjects that are arrays? Name the array field in the `includes` attribute, and the whole array is returned:

@Query("{\"match_all\": {}}")
@SourceFilters(includes = "phonenumbers")
List<User> findByPhoneNumbers();

In this example, the source filter includes the entire `phonenumbers` array field.

Using `SourceFilter` with Arrays

Alternatively, you can use a `SourceFilter` on a programmatic query to define the `_source` filtering for arrays. This sketch assumes the 5.x `NativeQuery` builder and an injected `ElasticsearchOperations` named `operations`, so check the names against the version you use:

Query query = NativeQuery.builder().withQuery(q -> q.matchAll(m -> m))
        .withSourceFilter(new FetchSourceFilter(new String[] { "phonenumbers.*" }, null)).build();
SearchHits<User> hits = operations.search(query, User.class);

In this example, the wildcard path `phonenumbers.*` selects every field of the objects stored in the `phonenumbers` array.
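A path can also reach into the objects of an array, in which case the filter is applied to every element. The sketch below models that locally for a single include such as `phonenumbers.number`. The `ArrayProjection` class and the `type` and `number` field names are assumptions for this illustration, and the code is a simplified model rather than Elasticsearch's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified local model of an include path that crosses an array of objects.
class ArrayProjection {

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
                "name", "Alice",
                "phonenumbers", List.of(
                        Map.of("type", "home", "number", "555-0100"),
                        Map.of("type", "work", "number", "555-0199")));
        // prints {phonenumbers=[{number=555-0100}, {number=555-0199}]}
        System.out.println(projectArrayField(doc, "phonenumbers", "number"));
    }

    // Keeps only `field` from every object in the array stored under `arrayField`.
    static Map<String, Object> projectArrayField(
            Map<String, Object> source, String arrayField, String field) {
        Map<String, Object> result = new LinkedHashMap<>();
        Object value = source.get(arrayField);
        if (!(value instanceof List<?> elements)) {
            return result; // field is missing or is not an array
        }
        List<Object> projected = new ArrayList<>();
        for (Object element : elements) {
            if (element instanceof Map<?, ?> object && object.containsKey(field)) {
                Map<String, Object> kept = new LinkedHashMap<>();
                kept.put(field, object.get(field));
                projected.add(kept);
            }
        }
        if (!projected.isEmpty()) {
            result.put(arrayField, projected);
        }
        return result;
    }
}
```

The array structure survives the projection, so the mapped entity still receives a list, just with slimmer elements.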

Best Practices and Considerations

When using `_source` filtering to return subobjects only, keep the following best practices and considerations in mind:

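  • Request only the fields or subobjects your use case needs. Properties that were filtered out arrive as `null` in the mapped entity, so calling code must not assume a fully populated object.
  • Remember that source filtering trims the response, not the read: Elasticsearch still loads and parses the full `_source` of each hit before filtering it. The savings are in network transfer and deserialization, and they grow with document size.
  • Reach for `stored_fields` only when the fields are explicitly mapped with `store: true`. Storing is off by default, and the Elasticsearch documentation generally recommends source filtering for returning parts of a document.
  • Define the include and exclude paths in one place, for example as constants or a shared `SourceFilter`, so the same filter can be reused across queries.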
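One way to keep filter definitions in a single place is a small catalogue of named projections. This is only a sketch: `UserProjection` and its constants are hypothetical names, and the paths are the ones used in this article.

```java
import java.util.Arrays;

class UserProjectionDemo {
    public static void main(String[] args) {
        // The array would be handed to whatever source filter type your version provides.
        System.out.println(Arrays.toString(UserProjection.STREET_AND_CITY.includes()));
    }
}

// Hypothetical catalogue of include paths for the article's User documents.
enum UserProjection {
    ADDRESS_ONLY("address"),
    STREET_AND_CITY("address.street", "address.city"),
    PHONE_NUMBERS("phonenumbers");

    private final String[] includes;

    UserProjection(String... includes) {
        this.includes = includes;
    }

    // Returns a copy so callers cannot modify the shared definition.
    String[] includes() {
        return includes.clone();
    }
}
```

Note that an annotation attribute cannot take the array returned by `includes()`, because annotation values must be compile-time constants. A catalogue like this therefore suits programmatic queries, while annotated repository methods can share `static final String` path constants instead.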


In conclusion, returning only subobjects in Spring Data Elasticsearch is a useful technique that reduces data transfer and can improve performance. By using `_source` filtering, either declaratively on a repository method or through a `SourceFilter` on a programmatic query, you can retrieve just the subobjects or fields you need from your Elasticsearch documents. Remember that filtered-out properties come back empty, and that the savings are in response size rather than in work done on the Elasticsearch side.

Now that you’ve mastered the art of returning subobjects only in Spring Data Elasticsearch, go forth and optimize your Elasticsearch applications!

