Hibernate Search bridges Hibernate ORM and full-text search engines such as Lucene and Elasticsearch. We write searches against our JPA entities, and Hibernate Search transparently maintains and queries the underlying index (Lucene, in this tutorial) in the background.
This Hibernate Search guide discusses the core concepts and provides a step-by-step example of building full-text search in a Spring Boot application.
1. How Does Hibernate Search Work?
When using Hibernate Search, it is crucial to perform all CRUD operations through Hibernate ORM. Only then can Hibernate Search detect entity changes and update the search index automatically; rows modified behind its back, for example by direct SQL, leave the index stale.
A full-text search through Hibernate Search then works as follows:
- We define a search query and the fields in which we want to search.
- Hibernate searches the index and sorts the results.
- Hibernate loads the entities from the database based on the search results.
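The steps above can be pictured with a toy model. The sketch below is plain Java and is not how Hibernate Search or Lucene are implemented; the `TinySearchDemo` class and its `save` and `search` methods are hypothetical names invented for this illustration. A map stands in for the database, a token-to-IDs map stands in for the index, and every write passes through one method so that both stay in sync.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: a map plays the database, a token-to-IDs map plays the search index. */
final class TinySearchDemo {

    private final Map<Long, String> database = new HashMap<>();
    private final Map<String, Set<Long>> index = new HashMap<>();

    /** All writes go through here, so the index is updated together with the data. */
    void save(long id, String title) {
        Long key = Long.valueOf(id);
        // Drop index entries left over from a previous version of this row.
        index.values().forEach(ids -> ids.remove(key));
        database.put(key, title);
        for (String token : tokenize(title)) {
            index.computeIfAbsent(token, t -> new HashSet<>()).add(key);
        }
    }

    /** Finds matching IDs in the index, sorts them, then loads the rows from the "database". */
    List<String> search(String query) {
        Set<Long> sortedHitIds = new TreeSet<>(); // sorted by ID, as a stand-in for a real sort
        for (String token : tokenize(query)) {
            sortedHitIds.addAll(index.getOrDefault(token, Set.of()));
        }
        List<String> loaded = new ArrayList<>();
        for (Long id : sortedHitIds) {
            loaded.add(database.get(id));
        }
        return loaded;
    }

    /** Lowercases the text and splits it on anything that is not a letter or a digit. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```

If a row were written to `database` without going through `save`, `search` would never find it, which is the same reason all CRUD operations should go through Hibernate.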
The main benefit of Hibernate Search is that full-text queries run against a Lucene index instead of scanning database rows, so keyword searches within text fields stay fast even with rather complex data, while the results still come back as regular JPA entities.
2. Setting Up Hibernate Search with Lucene and Spring Boot
Let us walk through the steps one by one.
Step 1. Include Maven Dependencies
When using Maven, it is recommended to import the Hibernate Search BOM in the dependencyManagement section and let it manage compatible versions of all Hibernate Search artifacts. This example uses the following dependencies:
- Spring Boot 3.3.4
- Spring Data JPA
- Hibernate Search 7.2.1.Final
- Lucene 9.11.1 (imported transitively)
- JBoss logging 3.6.1.Final
- H2 database (for persisting data in unit tests)
- Lombok
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.3.4</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<properties>
<java.version>21</java.version>
<hibernate-search.version>7.2.1.Final</hibernate-search.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.hibernate.search</groupId>
<artifactId>hibernate-search-mapper-orm</artifactId>
</dependency>
<dependency>
<groupId>org.hibernate.search</groupId>
<artifactId>hibernate-search-backend-lucene</artifactId>
</dependency>
<dependency>
<groupId>org.jboss.logging</groupId>
<artifactId>jboss-logging</artifactId>
<version>3.6.1.Final</version>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<!-- Hibernate Search BOM -->
<dependency>
<groupId>org.hibernate.search</groupId>
<artifactId>hibernate-search-bom</artifactId>
<version>${hibernate-search.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
Step 2. Configuring Hibernate Search to use Lucene
In the application.properties file, configure Hibernate Search to use Lucene for indexing. Also, configure the datasource properties used for the CRUD operations in the database.
# H2 Database settings
spring.datasource.url=jdbc:h2:mem:testdb
spring.datasource.driverClassName=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=password
spring.jpa.database-platform=org.hibernate.dialect.H2Dialect
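# Enable Hibernate Search
# In Spring Boot, Hibernate Search properties need the spring.jpa.properties. prefix
spring.jpa.properties.hibernate.search.backend.type=lucene
# Directory location where indexes will be saved
# (comments must be on their own line; a trailing "# ..." would become part of the value)
spring.jpa.properties.hibernate.search.backend.directory.root=./indexes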
Step 3. Creating an @Indexed Entity
Next, we annotate a JPA entity with the @Indexed annotation, which makes it indexable and searchable. We also annotate the entity fields with other annotations, such as @FullTextField or @KeywordField. These annotations mark which fields should be indexed by Lucene.
Annotations | Description |
---|---|
@Indexed | Marks an entity to be indexed by Hibernate Search and makes it searchable. |
@FullTextField | Marks a field to be indexed for full-text search (using text analysis). |
@KeywordField | Marks a field to be indexed as a keyword (exact matching without tokenization). |
@GenericField | A general-purpose field for types that need no special handling, such as numbers, dates, booleans, and enums. Hibernate Search infers the index field type from the property type. |
@ScaledNumberField | A numeric field for BigDecimal or BigInteger values that need higher precision than doubles and always have roughly the same scale. |
@VectorField | Marks a float[] or byte[] property to be indexed as a vector for vector (nearest-neighbor) search. |
@IndexedEmbedded | Used for embedding and indexing related entities or collections of entities within the current entity’s index. |
Full-text fields can be neither sorted on nor aggregated. If a property needs both full-text search and sorting, annotate it with both @FullTextField and @KeywordField, and give each of the two index fields a distinct name.
import jakarta.persistence.*;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.hibernate.search.engine.backend.types.Sortable;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.DocumentId;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.KeywordField;
@Data
@AllArgsConstructor
@NoArgsConstructor
@Entity
@Table
@Indexed
public class Book {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@DocumentId
private Long id;
@Column(nullable = false)
@FullTextField(name = "title")
@KeywordField(name = "sort_title", sortable = Sortable.YES)
private String title;
@Column(nullable = false)
@FullTextField(name = "author")
@KeywordField(name = "sort_author", sortable = Sortable.YES)
private String author;
}
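To see why the entity above maps each property twice, it helps to compare what the two field kinds keep. The following is a plain-Java sketch, not Hibernate Search code: `FieldKindsDemo`, `analyze`, and `sortByKeyword` are hypothetical names, and the tokenizer only approximates a default analyzer (lowercasing and splitting on non-alphanumeric characters is an assumption made for illustration).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Illustration only: contrasts an analyzed (full-text) value with a keyword value. */
final class FieldKindsDemo {

    /** Roughly what a full-text field keeps: several lowercased tokens per value. */
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /** Roughly what a keyword field keeps: the whole value as one untouched token. */
    static String keyword(String value) {
        return value;
    }

    /** Sorting needs exactly one comparable value per document, which the keyword provides. */
    static List<String> sortByKeyword(List<String> titles, boolean ascending) {
        List<String> sorted = new ArrayList<>(titles);
        Comparator<String> order = Comparator.comparing(FieldKindsDemo::keyword);
        sorted.sort(ascending ? order : order.reversed());
        return sorted;
    }
}
```

A title such as "Jungle Book - Part 1" becomes four tokens in the analyzed form, so there is no single value to order by; the keyword form keeps one value per document, which is why the `sort_title` field is the one declared sortable.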
Step 4. Creating Generic SearchRepository for Search Operations
We could call Search.session(entityManager).search(...) directly in the repository or service class of each entity, and it would work well. However, with multiple entities, writing separate search methods for each one produces a lot of duplicate code that is hard to maintain.
The better approach is to create generic methods for searches and let each JPA repository inherit them for its entity type.
Start by creating the generic SearchRepository interface defining all search-related methods. Feel free to change the method arguments as needed by the project.
import org.hibernate.search.engine.search.sort.dsl.SortOrder;
import java.io.Serializable;
import java.util.List;
public interface SearchRepository<T, ID extends Serializable> {
List<T> fullTextSearch(String text, int offset, int limit, List<String> fields, String sortBy, SortOrder sortOrder);
List<T> fuzzySearch(String text, int offset, int limit, List<String> fields, String sortBy, SortOrder sortOrder);
List<T> wildcardSearch(String pattern, int offset, int limit, List<String> fields, String sortBy, SortOrder sortOrder);
}
Next, create a BaseRepository interface that combines the features of SearchRepository and JpaRepository.
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.repository.NoRepositoryBean;
import java.io.Serializable;
@NoRepositoryBean
public interface BaseRepository<T, ID extends Serializable>
extends JpaRepository<T, ID>, SearchRepository<T, ID> {
}
Note the @NoRepositoryBean annotation on this interface. It tells Spring Data not to instantiate this repository interface as a Spring bean on its own. Instead, we will provide its implementation and register it in the JpaConfig class.
Next, create an implementation of BaseRepository and implement its search-related methods. The following code shows the implementation of a method for standard full-text search. We can add methods for other search types, such as wildcard searches and fuzzy searches. At the end of this tutorial, you can check out the demo implementations in the attached source code.
import jakarta.persistence.EntityManager;
import org.hibernate.search.engine.search.sort.dsl.SortOrder;
import org.hibernate.search.mapper.orm.Search;
import org.springframework.data.jpa.repository.support.JpaEntityInformation;
import org.springframework.data.jpa.repository.support.SimpleJpaRepository;
import java.io.Serializable;
import java.util.Collections;
import java.util.List;
public class BaseRepositoryImpl<T, ID extends Serializable>
extends SimpleJpaRepository<T, ID>
implements BaseRepository<T, ID> {
private final EntityManager entityManager;
public BaseRepositoryImpl(Class<T> domainClass, EntityManager entityManager) {
super(domainClass, entityManager);
this.entityManager = entityManager;
}
public BaseRepositoryImpl(JpaEntityInformation<T, ID> entityInformation, EntityManager entityManager) {
super(entityInformation, entityManager);
this.entityManager = entityManager;
}
@Override
public List<T> fullTextSearch(String text, int offset, int limit,
List<String> fields, String sortBy, SortOrder sortOrder) {
if (text == null || text.isEmpty()) {
return Collections.emptyList();
}
return Search.session(entityManager)
.search(getDomainClass())
.where(f -> f.match().fields(fields.toArray(String[]::new)).matching(text))
.sort(f -> f.field(sortBy).order(sortOrder))
.fetchHits(offset, limit);
}
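// Sketches of the other two SearchRepository methods; the attached source code may implement them differently.
@Override
public List<T> fuzzySearch(String text, int offset, int limit,
List<String> fields, String sortBy, SortOrder sortOrder) {
if (text == null || text.isEmpty()) {
return Collections.emptyList();
}
return Search.session(entityManager)
.search(getDomainClass())
// fuzzy(2) tolerates up to two character edits per term
.where(f -> f.match().fields(fields.toArray(String[]::new)).matching(text).fuzzy(2))
.sort(f -> f.field(sortBy).order(sortOrder))
.fetchHits(offset, limit);
}
@Override
public List<T> wildcardSearch(String pattern, int offset, int limit,
List<String> fields, String sortBy, SortOrder sortOrder) {
if (pattern == null || pattern.isEmpty()) {
return Collections.emptyList();
}
return Search.session(entityManager)
.search(getDomainClass())
// '*' matches any run of characters and '?' exactly one; the pattern itself is not analyzed
.where(f -> f.wildcard().fields(fields.toArray(String[]::new)).matching(pattern))
.sort(f -> f.field(sortBy).order(sortOrder))
.fetchHits(offset, limit);
}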
}
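The fuzzy and wildcard search types mentioned above differ from a standard match in how a query term is compared with an indexed token. The sketch below shows the idea in plain Java; it is not the Lucene implementation. `MatchingDemo` and its methods are hypothetical names, and the edit distance used here is plain Levenshtein, which is an assumption: Lucene's own fuzzy matching may count some edits, such as swapped neighboring characters, differently.

```java
import java.util.regex.Pattern;

/** Illustration only: what "fuzzy" and "wildcard" matching mean for a single token. */
final class MatchingDemo {

    /** Levenshtein distance: each insertion, deletion, or substitution costs 1. */
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] current = new int[b.length() + 1];
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitution = previous[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                current[j] = Math.min(substitution, Math.min(previous[j] + 1, current[j - 1] + 1));
            }
            previous = current;
        }
        return previous[b.length()];
    }

    /** Fuzzy match: the term is accepted if it is within maxEdits edits of the token. */
    static boolean fuzzyMatches(String term, String indexedToken, int maxEdits) {
        return editDistance(term, indexedToken) <= maxEdits;
    }

    /** Wildcard match: '*' stands for any run of characters (or none), '?' for exactly one. */
    static boolean wildcardMatches(String pattern, String indexedToken) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), indexedToken);
    }
}
```

With this model, a misspelled term like "bok" still finds the token "book" (one edit away), and the pattern "jun*" finds "jungle". Because wildcard patterns are compared with indexed tokens as-is, lowercase patterns are usually the safer choice against lowercased tokens.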
Now, we set the repositoryBaseClass attribute of the @EnableJpaRepositories annotation in the JpaConfig class. It tells Spring Data to use BaseRepositoryImpl, rather than its default implementation, as the base class behind every repository it creates, including those that extend BaseRepository.
import com.howtodoinjava.demo.repository.core.BaseRepositoryImpl;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.transaction.annotation.EnableTransactionManagement;
@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(basePackages = "com.howtodoinjava.demo.repository",
repositoryBaseClass = BaseRepositoryImpl.class)
public class JpaConfig {
//Other configs...
}
Step 5. Implementing BaseRepository
Next, whenever we create a JPA repository for an entity that should support both CRUD and full-text search, the repository interface must extend BaseRepository.
In the following code, the BookRepository interface extends BaseRepository, so we can use it to create/update books in the database and search the Lucene index.
import com.howtodoinjava.demo.model.Book;
import com.howtodoinjava.demo.repository.core.BaseRepository;
import org.springframework.stereotype.Repository;
@Repository
public interface BookRepository extends BaseRepository<Book, Long> {
//...
}
Step 6. Create a @Transactional Service Class
In Spring Boot, we generally perform CRUD operations inside transactions. Search operations do not modify data, so mark them as read-only transactions with the @Transactional(readOnly = true) annotation.
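import com.howtodoinjava.demo.model.Book;
import com.howtodoinjava.demo.repository.BookRepository;
import org.hibernate.search.engine.search.sort.dsl.SortOrder;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import java.util.List;
// ResourceNotFoundException is an application-specific exception; its import is omitted here.
@Service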
@Transactional
public class BookService {
private final BookRepository bookRepository;
public BookService(BookRepository bookRepository) {
this.bookRepository = bookRepository;
}
public List<Book> findAll() {
return bookRepository.findAll();
}
public Book findById(Long id) throws ResourceNotFoundException {
return bookRepository.findById(id)
.orElseThrow(() -> new ResourceNotFoundException("Book not found for ID :: " + id));
}
public Book saveBook(Book book) {
return bookRepository.save(book);
}
public Book updateBook(Long id, Book bookDetails) throws ResourceNotFoundException {
Book existingBook = bookRepository.findById(id)
.orElseThrow(() -> new ResourceNotFoundException("Book not found for ID :: " + id));
existingBook.setTitle(bookDetails.getTitle());
existingBook.setAuthor(bookDetails.getAuthor());
return bookRepository.save(existingBook);
}
public void deleteBook(Long id) {
bookRepository.deleteById(id);
}
//== FULL TEXT SEARCH ==
@Transactional(readOnly = true)
public List<Book> fullTextSearchByTitle(String title) {
return bookRepository
.fullTextSearch(title, 0, 10, List.of("title"),"sort_title", SortOrder.ASC);
}
}
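The service above always asks for offset 0 and limit 10, that is, the first ten hits. If callers think in pages instead, the offset can be derived from a page number. The helper below is a hypothetical sketch (the `Paging` class and its methods are not part of the original code); `window` mimics offset/limit semantics on an in-memory list only to make the arithmetic visible.

```java
import java.util.List;

/** Illustration only: page-based arithmetic for offset/limit style queries. */
final class Paging {

    /** Converts a zero-based page number and a page size into a hit offset. */
    static int offsetFor(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize must be > 0");
        }
        return Math.multiplyExact(pageNumber, pageSize);
    }

    /** Skips 'offset' hits of an already sorted list and keeps at most 'limit' of the rest. */
    static <T> List<T> window(List<T> sortedHits, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must not be negative");
        }
        int from = Math.min(offset, sortedHits.size());
        int to = (int) Math.min((long) from + limit, sortedHits.size());
        return List.copyOf(sortedHits.subList(from, to));
    }
}
```

Under these assumptions, a service method could accept a page number and pass `Paging.offsetFor(page, 10)` and `10` as the offset and limit arguments of `fullTextSearch`.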
Step 7. Unit Testing
With all the core parts of the full-text search feature in place, using Hibernate Search and Lucene in a Spring Boot application, it is time to test them.
In the following test, we save four books. Hibernate persists them in the database, and Hibernate Search adds them to the Lucene index. Next, we run a full-text search on the title field and validate the result.
import com.howtodoinjava.demo.config.JpaConfig;
import com.howtodoinjava.demo.model.Book;
import com.howtodoinjava.demo.repository.BookRepository;
import org.hibernate.search.engine.search.sort.dsl.SortOrder;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.context.annotation.Import;
import org.springframework.test.context.ActiveProfiles;
import java.util.List;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
@SpringBootTest
@ActiveProfiles("test")
@Import(JpaConfig.class)
public class BookServiceTests {
@Autowired
private BookRepository bookRepository;
private BookService bookService;
@BeforeEach
void setUp() {
bookService = new BookService(bookRepository);
}
@Test
//@Commit
void testFullTextSearch() {
// IDs are null so that the IDENTITY strategy generates them and save() persists new rows
bookService.saveBook(new Book(null, "Harry Potter - Part 1", "Test Author"));
bookService.saveBook(new Book(null, "Harry Potter - Part 2", "Test Author"));
bookService.saveBook(new Book(null, "Jungle Book - Part 1", "Test Author"));
bookService.saveBook(new Book(null, "Jungle Book - Part 2", "Test Author"));
List<Book> books = bookRepository
.fullTextSearch("Jungle Book", 0, 10, List.of("title"), "sort_title", SortOrder.DESC);
assertFalse(books.isEmpty());
assertEquals(2, books.size());
}
}
The test passes, which confirms that the entity mapping, the repository, and the Lucene index work together correctly.
3. Using MassIndexer to Rebuild Indexes on Application Startup
When we integrate Hibernate Search into an existing application, we first need to index the data that already exists in the database. The MassIndexer API exists for this very purpose: it rebuilds the Lucene indexes from the data stored in the database.
It provides start() (non-blocking) and startAndWait() (blocking) methods that read all the records from the database and index them. MassIndexer also provides methods to control the batch size and concurrency so the whole application is not impacted by high resource usage.
import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import lombok.extern.slf4j.Slf4j;
import org.hibernate.search.mapper.orm.Search;
import org.hibernate.search.mapper.orm.massindexing.MassIndexer;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;
@Component
@Transactional
@Slf4j
public class BuildLuceneIndexOnStartupListener
implements ApplicationListener<ApplicationReadyEvent> {
@PersistenceContext
private EntityManager entityManager;
@Override
public void onApplicationEvent(ApplicationReadyEvent event) {
log.info("Started Initializing Indexes");
MassIndexer massIndexer = Search.session( entityManager ).massIndexer();
massIndexer.idFetchSize(100)
.batchSizeToLoadObjects(25)
.threadsToLoadObjects(4);
try {
massIndexer.startAndWait();
} catch (InterruptedException e) {
log.warn("Mass indexing was interrupted before it completed", e);
// Restore the interrupt flag and skip the completion message
Thread.currentThread().interrupt();
return;
}
log.info("Completed Indexing");
}
}
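The batch size and thread settings used above follow a common pattern: split the work into batches and let a fixed number of threads process them. The sketch below shows that pattern in plain Java. It is not how MassIndexer is implemented internally; `BatchIndexingSketch`, `partition`, and `indexAll` are hypothetical names, and adding IDs to a set stands in for loading entities and writing index documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustration only: batching plus a fixed thread pool, blocking until all work is done. */
final class BatchIndexingSketch {

    /** Splits the IDs into consecutive batches of at most batchSize elements. */
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(List.copyOf(ids.subList(start, end)));
        }
        return batches;
    }

    /** "Indexes" every ID on a pool of worker threads and waits for completion. */
    static Set<Long> indexAll(List<Long> ids, int batchSize, int threads) {
        Set<Long> indexed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (List<Long> batch : partition(ids, batchSize)) {
                // Stand-in for: load this batch of entities, then write their documents.
                pool.execute(() -> indexed.addAll(batch));
            }
        } finally {
            pool.shutdown(); // no new batches; already submitted ones still run
        }
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for callers
            throw new IllegalStateException("Indexing was interrupted", e);
        }
        return indexed;
    }
}
```

Smaller batches and fewer threads lower the peak memory and database load at the cost of a longer run, which is the trade-off behind tuning values such as batchSizeToLoadObjects and threadsToLoadObjects.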
4. Summary
In this Spring Boot Hibernate Search demo with Lucene, we learned the following:
- Importing the Hibernate Search module using the BOM dependency in a Spring Boot project.
- Annotating entity classes with search-related annotations.
- Creating a BaseRepository interface that provides CRUD as well as full-text search features to a JPA repository.
- Implementing search methods using the Search.session(..).search(..) method and returning the fetched entities.
- Unit testing the search methods.
- Using the MassIndexer to rebuild the indexes at application startup.
Happy Learning !!