Skip to content

navkumar258/spring-batch-service

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

5 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Spring Batch Service

A Spring Batch application demonstrating CSV processing with database persistence and Kafka event publishing using Java 21 and Spring Batch 5.x.

πŸš€ Features

  • CSV Processing: Read and parse CSV files with configurable delimiters and headers
  • Database Integration: Write processed data to various database systems (MySQL, PostgreSQL, etc.)
  • Kafka Integration: Publish events to Kafka topics for real-time processing
  • Batch Processing: Efficient processing of large datasets with chunk-based processing
  • Error Handling: Comprehensive error handling with retry mechanisms
  • Monitoring: Job execution monitoring and metrics
  • Configurable: Flexible configuration for different data formats and destinations

πŸ“‹ Prerequisites

  • Java 21 or higher
  • Maven 3.8+ or Gradle 8+
  • Database: MySQL 8.0+, PostgreSQL 13+, or H2 (for testing)
  • Apache Kafka 3.0+ (optional, for event publishing)
  • IDE: IntelliJ IDEA, Eclipse, or VS Code

πŸ› οΈ Technology Stack

  • Java 21 - Latest LTS version with modern features
  • Spring Boot 3.x - Application framework
  • Spring Batch 5.x - Batch processing framework
  • Spring Data JPA - Database operations
  • Apache Kafka - Event streaming
  • Maven/Gradle - Build tools
  • Docker - Containerization (optional)

πŸ“š Architecture

The application follows a layered architecture pattern with clear separation of concerns:

Data Flow:

  1. CSV Input β†’ FlatFileItemReader reads CSV files
  2. Processing β†’ ItemProcessor applies business logic
  3. Output β†’ ItemWriter persists to database or publishes to Kafka
  4. Metadata β†’ JobRepository tracks execution status
  5. Monitoring β†’ Actuator endpoints provide metrics

Key Components:

  • Job: Defines the overall batch process
  • Step: Individual processing unit within a job
  • Chunk: Configurable batch size for processing
  • Reader: Extracts data from CSV source
  • Processor: Transforms and validates data
  • Writer: Persists data to target systems
  • Repository: Stores job execution metadata

Error Handling:

  • Retry mechanism for transient failures
  • Skip policy for invalid records
  • Comprehensive logging and monitoring

##️ Installation

1. Clone the Repository

git clone https://github.com/yourusername/spring-batch-service.git
cd spring-batch-service

2. Configure Database

Update application.yml with your database configuration:

spring:
  datasource:
    url: jdbc:mysql://localhost:3306/batch_service
    username: your_username
    password: your_password
    driver-class-name: com.mysql.cj.jdbc.Driver
  jpa:
    hibernate:
      ddl-auto: create-drop
    show-sql: true

3. Configure Kafka (Optional)

If using Kafka, update the configuration:

spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer

4. Build the Application

# Using Maven
mvn clean install

# Using Gradle
./gradlew build

πŸš€ Running the Application

Local Development

# Run with Maven
mvn spring-boot:run

# Run with Gradle
./gradlew bootRun

# Run the JAR file
java -jar target/spring-batch-service-1.0.0.jar

Docker (Optional)

# Build Docker image
docker build -t spring-batch-service .

# Run container
docker run -p 8080:8080 spring-batch-service

πŸ“ Project Structure

The project follows a standard Spring Boot structure with batch-specific components:

src/
β”œβ”€β”€ main/
β”‚ β”œβ”€β”€ java/
β”‚ β”‚ └── com/example/spring/batch/service
β”‚ β”‚ β”œβ”€β”€ config/
β”‚ β”‚ β”‚ β”œβ”€β”€ BatchConfig.java # Batch job configuration
β”‚ β”‚ β”‚ β”œβ”€β”€ DatabaseConfig.java # Database configuration
β”‚ β”‚ β”‚ └── KafkaConfig.java # Kafka configuration
β”‚ β”‚ β”œβ”€β”€ model/
β”‚ β”‚ β”‚ └── DataRecord.java # Data model/entity
β”‚ β”‚ β”œβ”€β”€ processor/
β”‚ β”‚ β”‚ └── DataProcessor.java # Business logic processor
β”‚ β”‚ β”œβ”€β”€ reader/
β”‚ β”‚ β”‚ └── CustomCsvReader.java # CSV file reader
β”‚ β”‚ β”œβ”€β”€ writer/
β”‚ β”‚ β”‚ β”œβ”€β”€ DatabaseWriter.java # Database writer
β”‚ β”‚ β”‚ └── KafkaWriter.java # Kafka event writer
β”‚ β”‚ └── SpringBatchbatch-serviceApplication.java
β”‚ └── resources/
β”‚ β”œβ”€β”€ application.yml # Application configuration
β”‚ β”œβ”€β”€ data/
β”‚ β”‚ └── sample.csv # Sample input data
β”‚ └── schema.sql # Database schema
└── test/
└── java/
└── com/example/batch/
└── BatchJobTest.java # Integration tests

Key Components:

  • Config: Contains all configuration classes for batch, database, and Kafka
  • Model: Data entities and DTOs
  • Processor: Business logic for data transformation
  • Reader: Custom CSV reading implementations
  • Writer: Database and Kafka writing implementations

βš™οΈ Configuration

Application Properties

# Batch Configuration
spring:
  batch:
    job:
      enabled: false  # Disable auto-start
    jdbc:
      initialize-schema: always
    
# Job Parameters
job:
  input-file: classpath:data/input.csv
  chunk-size: 1000
  retry-limit: 3
  
# Kafka Configuration
kafka:
  topic:
    name: batch-events
    partitions: 3
    replicas: 1

Job Parameters

Parameter Description Default
input.file Input CSV file path classpath:data/input.csv
chunk.size Processing chunk size 1000
retry.limit Number of retry attempts 3
output.destination Output destination (db/kafka) db

🎯 Usage Examples

1. Basic CSV to Database Processing

# Run job with parameters
java -jar spring-batch-service.jar \
  --spring.batch.job.names=csvToDatabaseJob \
  --input.file=classpath:data/input.csv \
  --chunk.size=500

2. CSV to Kafka Processing

# Run job with Kafka output
java -jar spring-batch-service.jar \
  --spring.batch.job.names=csvToKafkaJob \
  --input.file=classpath:data/input.csv \
  --output.destination=kafka

3. Programmatic Job Execution

@Autowired
private JobLauncher jobLauncher;

@Autowired
private Job csvToDatabaseJob;

public void runJob() throws Exception {
    JobParameters params = new JobParametersBuilder()
        .addString("input.file", "classpath:data/input.csv")
        .addLong("timestamp", System.currentTimeMillis())
        .toJobParameters();
    
    jobLauncher.run(csvToDatabaseJob, params);
}

πŸ“Š Monitoring and Metrics

Job Execution Monitoring

Access job execution details via REST endpoints:

# Get all job executions
curl http://localhost:8080/actuator/batch/jobs

# Get specific job execution
curl http://localhost:8080/actuator/batch/jobs/{executionId}

# Get job execution metrics
curl http://localhost:8080/actuator/metrics/batch.jobs

Health Checks

# Application health
curl http://localhost:8080/actuator/health

# Database health
curl http://localhost:8080/actuator/health/db

# Kafka health
curl http://localhost:8080/actuator/health/kafka

πŸ§ͺ Testing

Unit Tests

# Run unit tests
mvn test

# Run with coverage
mvn test jacoco:report

Integration Tests

# Run integration tests
mvn verify -P integration-test

Test Data

Sample CSV format for testing:

id,name,email,age,city
1,John Doe,[email protected],30,New York
2,Jane Smith,[email protected],25,Los Angeles
3,Bob Johnson,[email protected],35,Chicago

πŸ› οΈ Customization

Custom Data Model

@Entity
public class CustomDataRecord {
    @Id
    private Long id;
    private String name;
    private String email;
    private Integer age;
    private String city;
    
    // Getters, setters, constructors
}

Custom Processor

@Component
public class CustomDataProcessor implements ItemProcessor<InputRecord, OutputRecord> {
    
    @Override
    public OutputRecord process(InputRecord item) throws Exception {
        // Custom processing logic
        return new OutputRecord(item);
    }
}

Custom Reader/Writer

@Component
public class CustomCsvReader extends FlatFileItemReader<DataRecord> {
    // Custom reading logic
}

@Component
public class CustomDatabaseWriter extends JpaItemWriter<DataRecord> {
    // Custom writing logic
}

πŸ› Troubleshooting

Common Issues

Job not starting:

  • Check if spring.batch.job.enabled=false in application.yml
  • Verify job parameters are correct
  • Check database connectivity

CSV parsing errors:

  • Verify CSV format and delimiter
  • Check for encoding issues (UTF-8 recommended)
  • Validate header mapping

Database connection issues:

  • Verify database credentials
  • Check database server is running
  • Ensure proper JDBC driver

Kafka connection issues:

  • Verify Kafka broker is running
  • Check topic exists and has proper permissions
  • Validate serializer configuration

Debug Mode

Enable debug logging:

logging:
  level:
    com.example.batch: DEBUG
    org.springframework.batch: DEBUG

πŸ“ˆ Performance Optimization

Chunk Size Tuning

job:
  chunk-size: 1000  # Adjust based on memory and performance

Database Optimization

spring:
  jpa:
    properties:
      hibernate:
        jdbc:
          batch_size: 50
        order_inserts: true
        order_updates: true

Memory Management

# JVM options for large datasets
java -Xmx4g -Xms2g -jar spring-batch-service.jar

πŸ”’ Security

Database Security

  • Use connection pooling
  • Implement proper authentication
  • Use encrypted connections (SSL/TLS)

Kafka Security

  • Enable SASL authentication
  • Use SSL/TLS encryption
  • Implement proper ACLs

πŸ“š API Documentation

REST Endpoints

Endpoint Method Description
/api/jobs GET List all jobs
/api/jobs/{jobName}/executions GET Get job executions
/api/jobs/{jobName}/start POST Start a job
/api/jobs/{executionId}/stop POST Stop a job execution

Example API Usage

# Start a job
curl -X POST http://localhost:8080/api/jobs/csvToDatabaseJob/start \
  -H "Content-Type: application/json" \
  -d '{"inputFile": "classpath:data/input.csv"}'

# Get job status
curl http://localhost:8080/api/jobs/csvToDatabaseJob/executions

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Clone and setup
git clone https://github.com/yourusername/spring-batch-service.git
cd spring-batch-service

# Install dependencies
mvn clean install

# Run tests
mvn test

# Start development server
mvn spring-boot:run

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ†˜ Support

Acknowledgments

πŸ“ˆ Version History

v1.0.0

  • Initial release
  • CSV to database processing
  • CSV to Kafka processing
  • Basic monitoring and metrics
  • Comprehensive error handling

⭐ If this project helps you, please consider giving it a star!

For more information about Spring Batch, visit the official documentation.

About

Spring Batch Service Example

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors