Learn to read all the rows from an Excel (.xls or .xlsx) file using Spring Batch and save them in a database in a Spring Boot application. We are using the Spring Boot for its excellent infrastructure support and auto-configuration magic.
In this Spring Batch tutorial, we will read an Excel file containing people’s information and perform the following tasks:
- Read the Excel file using PoiItemReader
- Converting the rows into Person instances using BeanWrapperRowMapper or RowMapper interface
- Save the Person instances into the database using JpaRepository
- Logging the progress (read, write, and process)
- Scheduling the batch job with cron expression
1. The Excel File
In this tutorial, we are using a simple Excel file with only 4 columns and 6 rows. You can modify the Excel based on your requirements. Just make sure to modify the fields in @Entity class.

2. JPA Entity and Repository
Next, we create a class that will store the information we read from the Excel file. Here, the Person class acts as a JPA entity as well which will be stored in the database.
@Entity
public class Person {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
Long id;
String firstName;
String lastName;
Integer age;
Boolean active;
}
We should also have the JpaRepository interface which provides the factory methods for storing and updating the records in the database.
@Repository
public interface PersonRepository extends JpaRepository<Person, Long> {
}
With the basic setup done, let us move to configure the Spring batch infrastructure beans.
3. Maven
Include the following dependencies in your pom.xml file.
- spring-boot-starter-data-jpa and h2: JDBC support for creating and storing the entities.
- spring-boot-starter-batch: Batch support for creating and executing the batch jobs.
- spring-batch-excel: Utility classes for handling the Excel files in context to batch jobs.
- poi and poi-ooxml: Required by spring-batch-excel module for reading the Excel files in a single attempt or using stream.
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<java.version>21</java.version>
<spring.batch.version>5.1.2</spring.batch.version>
<poi.version>5.2.5</poi.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.batch.extensions</groupId>
<artifactId>spring-batch-excel</artifactId>
<version>0.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>${poi.version}</version>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>${poi.version}</version>
<optional>true</optional>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>runtime</scope>
</dependency>
</dependencies>
4. Spring Batch Job and Step Configuration
Spring Batch will require the following beans from us to process the data from the Excel to the database:
- ItemReader: To read the Excel rows.
- ItemProcessor (Optional): To perform any intermediate processing in the POJO object. At this step, we can sanitize or validate the data, and even return a completely new object accepted by the batch writer.
- ItemWriter: To write the data into the database.
- Step: Defines which reader, processor, and writer are part of the whole process. We can define multiple Step beans if there are more things to do such as sending notifications. Ideally, each step must perform a unique responsibility.
- Job: Defines which step(s) are part of the job. It can define multiple steps in a single job.
All the beans are created in a @Configuration class so they are automatically registered with the application context.
4.1. ItemReader for Reading the Excel Rows
Reading an Excel file is itself a challenge and error-prone. Spring batch extensions module simplifies it by providing a PoiItemReader class which implements ItemReader interface and uses Apache POI to read an Excel file. It will read the file sheet for sheet and row for row. Not that this class is not thread-safe.
To map the data from rows to POJO, we also need RowMapper. Here, we are also utilizing the BeanWrapperRowMapper class from the extensions module. It automatically resolves the mapping between the column name (in the sheet) and field name (in POJO) by matching their names. Its name matching is “fuzzy” in the sense that it tolerates close matches, as long as the match is unique. If more than one match is found there will be an error.
@Bean
ItemReader<Person> reader() {
PoiItemReader<Person> reader = new PoiItemReader<>();
reader.setLinesToSkip(1); // Header rows are skipped
reader.setResource(new ClassPathResource(
environment.getRequiredProperty(PROPERTY_EXCEL_SOURCE_FILE_PATH)));
reader.setRowMapper(excelRowMapper());
return reader;
}
private RowMapper<Person> excelRowMapper() {
BeanWrapperRowMapper<Person> rowMapper = new BeanWrapperRowMapper<>();
rowMapper.setTargetType(Person.class);
return rowMapper;
}
If we want to fully control the mapper behavior, for example, excel field names are completely different, then we need to create the RowMapper ourselves. In this RowMapper, we get RowSet object which provides a Map of column name in excel as key and column value as map value.
public class PersonRowMapper implements RowMapper<Person> {
@Override
public Person mapRow(RowSet rowSet) throws Exception {
Person person = new Person();
person.setFirstName(rowSet.getProperties().getProperty("First Name"));
person.setLastName(rowSet.getProperties().getProperty("Last Name"));
person.setAge(Integer.valueOf(rowSet.getProperties().getProperty("Age")));
person.setActive(Boolean.valueOf(rowSet.getProperties().getProperty("Active")));
return person;
}
}
Next, we can inject this PersonRowMapper into the ItemReader bean.
@Bean
ItemReader<Person> reader() {
...
reader.setRowMapper(new PersonRowMapper());
return reader;
}
4.2. ItemProcessor for Intermediate Processing
Next, we create an ItemProcessor bean where we define data validation and transformation logic. It has to be written project-specific. We are just logging the values to verify that it has been called.
public class LoggingPersonProcessor implements ItemProcessor<Person, Person> {
private static final Logger LOGGER = LoggerFactory.getLogger(LoggingPersonProcessor.class);
@Override
public Person process(Person item) throws Exception {
LOGGER.info("Processing person information: {}", item);
return item;
}
}
4.3. ItemWriter for Writing the Records to Database
Next, we create the ItemWriter bean which is responsible for writing the Person entity to the database. It is again application-specific detail that controls its implementation.
In the write implementation, we can use transactions to control the commit and rollback processes using EntityManager or we can simply use the JpaRepository that saves and commits each record as it is processed.
public class DatabasePersonWriter implements ItemWriter<Person> {
private static final Logger LOGGER = LoggerFactory.getLogger(DatabasePersonWriter.class);
@Autowired
private PersonRepository personRepository;
@Override
public void write(Chunk<? extends Person> people) throws Exception {
LOGGER.info("Writing to the Database the information of {} people", people.size());
people.getItems().stream().forEach(personRepository::save);
}
}
Note that we can decide to read and process multiple rows from excel in each round, so Spring batch provides the records as Chunk. It is a List like collection that stores all the entities to be written into the database. To support multiple entities, we must iterate over all entities and save them one by one, as we did in the above code.
...
people.getItems().stream().forEach(personRepository::save);
...
4.4. Step for Sequencing the Reader, Processor, and Writer
A Step is used to explicitly represent the configuration of the steps to perform that mainly consist of reading, writing, and/or processing the records.
@Bean
Step step1() {
return new StepBuilder("excelFileToStep1", jobRepository)
.<Person, Person>chunk(1, transactionManager)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
4.5. Job to Define the Steps in the Complete Process
A Job can consist of many steps. Each step can have different read and write processes. By creating a Job bean, we define which steps are part of the job. In our example, we have a single step.
The JobBuilder beans help in assembling all the steps in a sequence, if there are many.
@Bean
Job excelFileToDatabaseJob(Step step1) {
var builder = new JobBuilder("excelFileToDatabase", jobRepository);
return builder
.start(step1)
.build();
}
You can refer to the whole batch job configuration in the Github repository.
5. Running the Job
Generally, in any application, we can run a batch job in two ways:
- Scheduled execution using a cron expression
- On-demand execution triggered from a REST API or any other action
To schedule the batch job then we can make use of Spring @Scheduled annotation and supply the cron expression from a property file.
@Component
public class BatchJobLauncherScheduler {
private static final Logger LOGGER = LoggerFactory.getLogger(BatchJobLauncherScheduler.class);
private final Job job;
private final JobLauncher jobLauncher;
@Autowired
public BatchJobLauncherScheduler(@Qualifier("excelFileToDatabaseJob") Job job,
JobLauncher jobLauncher) {
this.job = job;
this.jobLauncher = jobLauncher;
}
@Scheduled(cron = "${excel.to.database.job.cron}")
void launchExcelFileToDatabaseJob() throws Exception {
LOGGER.info("Starting excelFileToDatabase job");
jobLauncher.run(job, newExecution());
LOGGER.info("Stopping excelFileToDatabase job");
}
private JobParameters newExecution() {
JobParameters jobParameters = new JobParametersBuilder()
.addJobParameter("currentTimestamp",
new JobParameter(LocalDateTime.now(), LocalDateTime.class))
.toJobParameters();
return jobParameters;
}
}
Do not forget to enable the Spring scheduling support using the @EnableScheduling annotation.
@SpringBootApplication
@EnableScheduling
public class SpringBatchApplication { ... }
To execute the job on-demand, we can get the references of Job and JobLauncher beans and use ‘jobLauncher.run(job, jobParameters)‘ method to fire up the batch job.
@SpringBootApplication
public class SpringBatchApplication implements CommandLineRunner {
@Autowired
@Qualifier("excelFileToDatabaseJob")
Job job;
@Autowired
JobLauncher jobLauncher;
public static void main(String[] args) {
SpringApplication.run(SpringBatchApplication.class);
}
@Override
public void run(String... args) throws Exception {
JobParameters jobParameters = new JobParametersBuilder()
.addJobParameter("currentTimestamp", new JobParameter(LocalDateTime.now(), LocalDateTime.class))
.toJobParameters();
jobLauncher.run(job, jobParameters);
}
}
6. Demo
Start the batch job by any means i.e. scheduled or on-demand, we can observe the Job process in console logs. The logs print the Excel rows read and processed.
2024-06-24T19:04:19.473+05:30 INFO 26232 --- [main] c.h.demo.SpringBatchApplication : Starting excelFileToDatabase job
2024-06-24T19:04:19.534+05:30 INFO 26232 --- [main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=excelFileToDatabase]] launched with the following parameters: [{'currentTimestamp':'{value=2024-06-24T19:04:19.473746300, type=class java.time.LocalDateTime, identifying=true}'}]
2024-06-24T19:04:19.559+05:30 INFO 26232 --- [main] o.s.batch.core.job.SimpleStepHandler : Executing step: [stepBuilder]
2024-06-24T19:04:20.055+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=Lokesh, lastName=Gupta, age=40, active=true)
2024-06-24T19:04:20.055+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.095+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=Alex, lastName=Kolenchisky, age=42, active=true)
2024-06-24T19:04:20.095+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.099+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=Biran, lastName=Schultz, age=38, active=true)
2024-06-24T19:04:20.099+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.102+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=Charles, lastName=Babej, age=60, active=true)
2024-06-24T19:04:20.102+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.105+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=John, lastName=Doe, age=70, active=false)
2024-06-24T19:04:20.105+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.107+05:30 INFO 26232 --- [main] c.h.demo.config.LoggingPersonProcessor : Processing person information: Person(id=null, firstName=Loreum, lastName=Ipsum, age=80, active=true)
2024-06-24T19:04:20.108+05:30 INFO 26232 --- [main] c.h.demo.config.DatabasePersonWriter : Writing to the Database the information of 1 people
Hibernate: insert into person (active,age,first_name,last_name,id) values (?,?,?,?,default)
2024-06-24T19:04:20.112+05:30 INFO 26232 --- [main] o.s.batch.core.step.AbstractStep : Step: [stepBuilder] executed in 553ms
2024-06-24T19:04:20.180+05:30 INFO 26232 --- [main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=excelFileToDatabase]] completed with the following parameters: [{'currentTimestamp':'{value=2024-06-24T19:04:19.473746300, type=class java.time.LocalDateTime, identifying=true}'}] and the following status: [COMPLETED] in 633ms
2024-06-24T19:04:20.180+05:30 INFO 26232 --- [main] c.h.demo.SpringBatchApplication : Stopping excelFileToDatabase job
We can verify the processed record in the database as well.

7. Conclusion
In this Spring Batch Excel reader example, we learned to read excel rows and write them into the database one by one. We learned to utilize the spring batch extensions module for the built-in Excel reader. Finally, we learned to launch the batch jobs from the spring scheduler or on-demand from any other triggers.
Happy Learning !!
Comments