Build a Batch Framework with Go



I recently picked up Go and decided to apply what I learnt by building a batch framework with it.


Batch Framework - Batch103

I called this project Batch103 (it started as Batch101; 103 is the third version and the one I decided to publish). It is a simple batch framework inspired by Spring Batch, a popular batch framework in the Java world. If you are familiar with Spring Batch, you will recognise many of its features here. The source code of the framework is available on GitHub (https://github.com/pengyeng/batch103.git).

The overall architecture diagram illustrates the high-level architecture of Batch103. The framework comes with a Job Launcher, which provides an implementation class to launch the batch job. The Job Launcher needs 3 key components to operate:
  • Reader - handles the input of the batch job
  • Processor - handles the logic to process the data
  • Writer - handles the output of the batch job
Data retrieved by the Reader flows through the Processor and on to the Writer as BatchData, a common data structure that carries the records. At the end of the batch execution, the Job Launcher generates a statistics report summarising the number of records processed and rejected at each stage.
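
To make the flow concrete, here is a minimal sketch of the Reader, Processor and Writer hand-off. Every name in it (the BatchData fields, the three interfaces, runJob and the toy implementations) is a stand-in for illustration and not the actual Batch103 API; please check the repository for the real definitions.

```go
package main

import "fmt"

// BatchData is a hypothetical stand-in for the framework's common record
// carrier; the real Batch103 type may look different.
type BatchData struct {
	Payload  string
	Rejected bool
}

// Assumed shapes for the three components.
type Reader interface {
	Read() ([]BatchData, error)
}

type Processor interface {
	Process(data []BatchData) []BatchData
}

type Writer interface {
	Write(data []BatchData) []BatchData
}

// runJob hands the reader's records to the processor, then to the writer.
func runJob(r Reader, p Processor, w Writer) ([]BatchData, error) {
	data, err := r.Read()
	if err != nil {
		return nil, err
	}
	return w.Write(p.Process(data)), nil
}

// sliceReader serves records from memory instead of a file.
type sliceReader struct{ lines []string }

func (s sliceReader) Read() ([]BatchData, error) {
	out := make([]BatchData, 0, len(s.lines))
	for _, l := range s.lines {
		out = append(out, BatchData{Payload: l})
	}
	return out, nil
}

// rejectEmpty marks records with an empty payload as rejected.
type rejectEmpty struct{}

func (rejectEmpty) Process(data []BatchData) []BatchData {
	for i := range data {
		if data[i].Payload == "" {
			data[i].Rejected = true
		}
	}
	return data
}

// printWriter prints every record that has not been rejected.
type printWriter struct{}

func (printWriter) Write(data []BatchData) []BatchData {
	for _, d := range data {
		if !d.Rejected {
			fmt.Println("write:", d.Payload)
		}
	}
	return data
}

func main() {
	out, err := runJob(sliceReader{lines: []string{"a", "", "b"}}, rejectEmpty{}, printWriter{})
	if err != nil {
		fmt.Println("job failed:", err)
		return
	}
	fmt.Println("records seen:", len(out))
}
```

The rejected record stays in the slice with its flag set rather than being dropped, which is what would let a report count rejections per stage later on.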


Overall Architecture

Batch Job Example 

To build a simple batch job using Batch103, first create a Go module (for example, in c:\go\batchsample2) by running "go mod init example.com/batchsample2" in your project folder. Then run "go get github.com/pengyeng/batch103" to add the batch103 library to your Go project (batchsample2).

In this example, I am going to create an inbound file processing batch job that reads a CSV file (laptop.csv), performs a validation check and then inserts the retrieved data into a MySQL database. In the batchsample2 project folder, I will create the following folders for my model, processor, reader and writer (please see the picture below):
  • model folder
  • processor folder
  • reader folder
  • writer folder

Directory Structure

In addition, I will create a CSV file called laptop.csv under the project folder, which will be used as my inbound file.

laptop.csv      

Before we create a reader, we first need a data model (laptop.go) to hold the data to be processed. In this example, each record is a laptop received from the CSV file, with a brand name, model, CPU core and memory size (please see below).





source code of laptop.go
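
For readers who cannot see the screenshot, a data model along these lines would fit the description. This is only a sketch: the field names, the integer types for CPU core and memory size, and the newLaptop helper are assumptions, and the original laptop.go may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// Laptop mirrors the four attributes described in the article. Field names
// and types are assumptions; the original laptop.go may differ.
type Laptop struct {
	Brand      string
	Model      string
	CPUCore    int
	MemorySize int
}

// newLaptop (hypothetical helper) builds a Laptop from one CSV record in the
// order brand, model, cpu core, memory size.
func newLaptop(rec []string) (Laptop, error) {
	if len(rec) != 4 {
		return Laptop{}, fmt.Errorf("expected 4 fields, got %d", len(rec))
	}
	cores, err := strconv.Atoi(rec[2])
	if err != nil {
		return Laptop{}, fmt.Errorf("cpu core: %w", err)
	}
	mem, err := strconv.Atoi(rec[3])
	if err != nil {
		return Laptop{}, fmt.Errorf("memory size: %w", err)
	}
	return Laptop{Brand: rec[0], Model: rec[1], CPUCore: cores, MemorySize: mem}, nil
}

func main() {
	// The sample values are made up for illustration.
	l, err := newLaptop([]string{"Lenovo", "ThinkPad X1", "8", "16"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", l)
}
```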

Reader Component

With the data model in place, we can now create our reader component (laptopreader.go). It needs to import the Batch103 framework and the data model we created earlier.

 

source code of laptopreader.go

As this example is a file-processing reader, we will implement batch103.FileReader and its Read() method (please see above). The Read method needs to define the CSV filename (laptop.csv), use a csvFileReader object (from the io library) to read the content, and then insert the records into a slice of BatchData, the common model that carries the data through the entire batch execution. (A slice is a dynamic array in Go.)


source code of laptopreader.go
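
The CSV-to-slice step can be sketched with the standard library alone. The example below uses encoding/csv rather than the framework's own reader, and readRecords and readRecordsString are hypothetical helper names; it is meant to show how rows accumulate in a slice, not how Batch103 implements Read.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// readRecords parses CSV content into a slice of rows. Taking an io.Reader
// means the same function works for an opened laptop.csv (via os.Open) or
// for an in-memory string.
func readRecords(r io.Reader) ([][]string, error) {
	cr := csv.NewReader(r)
	var rows [][]string // the slice grows as rows are appended
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			return rows, nil
		}
		if err != nil {
			return nil, err
		}
		rows = append(rows, rec)
	}
}

// readRecordsString is a convenience wrapper for examples and tests.
func readRecordsString(s string) ([][]string, error) {
	return readRecords(strings.NewReader(s))
}

func main() {
	// Sample rows are made up; the article's laptop.csv is not reproduced here.
	rows, err := readRecordsString("Lenovo,ThinkPad X1,8,16\nDell,XPS 13,4,8\n")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(rows), "rows read")
}
```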


Processor Component

The next component we need to create is the processor. Its logic is simple: a for loop that goes through each record in BatchData. In this example, I built a simple validation that checks whether the Model attribute is empty. If the validation fails, I call Reject(batch103.StgProcess) to mark the record as rejected at the processor stage (batch103.StgProcess indicates that the rejection happened at the processor stage). A rejected record will not be picked up by the subsequent stages (in this case, the writer).


source code of laptopprocessor.go
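
The validation loop could look roughly like the sketch below. The Stage type, the record struct and its Reject and IsRejected methods are local stand-ins written for this example; only the idea of rejecting a record at a named stage comes from the framework.

```go
package main

import "fmt"

// Stage and record are hypothetical stand-ins for the framework's own types.
type Stage string

const (
	StgProcess Stage = "process"
	StgWrite   Stage = "write"
)

type record struct {
	Brand      string
	Model      string
	rejectedAt Stage // empty while the record is still accepted
}

// Reject remembers the stage at which the record was rejected.
func (r *record) Reject(s Stage) { r.rejectedAt = s }

// IsRejected reports whether any stage has rejected the record.
func (r *record) IsRejected() bool { return r.rejectedAt != "" }

// validate rejects every record whose Model is empty.
func validate(recs []record) {
	for i := range recs { // index into the slice so we modify the element, not a copy
		if recs[i].IsRejected() {
			continue // already rejected by an earlier stage
		}
		if recs[i].Model == "" {
			recs[i].Reject(StgProcess)
		}
	}
}

func main() {
	recs := []record{
		{Brand: "Lenovo", Model: "ThinkPad X1"},
		{Brand: "Dell"}, // missing model, so it should be rejected
	}
	validate(recs)
	for _, r := range recs {
		fmt.Printf("%s rejected=%v\n", r.Brand, r.IsRejected())
	}
}
```

Looping by index matters here: ranging over the values would flag a copy of each record and leave the slice untouched.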


Writer Component

With the processor in place, we can now create our writer component. The writer implements batch103.DBWriter and its Write method. It needs one extra library, github.com/go-sql-driver/mysql, which is the database driver used to connect to the MySQL database. You can add the library by running a "go get" command.

source code of laptopwriter.go


source code of laptopwriter.go

As in the processor, we mark every record that fails to be inserted as rejected, this time by calling Reject(batch103.StgWrite) to mark it as rejected at the writer stage. To execute the batch job successfully, you need a MySQL database set up. You may use the DDL below to create the laptop table in the go-data schema.

DDL of tbl_laptop
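
The writer's insert-or-reject behaviour can be sketched without a live database. In the example below, the execer interface, the laptop struct, the column names in the INSERT statement and the fakeDB type are all assumptions made for illustration. In a real program you would pass a *sql.DB opened with the MySQL driver instead of fakeDB.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// execer is the single method the writer needs; *sql.DB satisfies it.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// Compile-time check that a real database handle fits the interface.
var _ execer = (*sql.DB)(nil)

type laptop struct {
	Brand, Model        string
	CPUCore, MemorySize int
	Rejected            bool
}

// Column names are assumed; only the table name tbl_laptop is from the article.
const insertSQL = "INSERT INTO tbl_laptop (brand, model, cpu_core, memory_size) VALUES (?, ?, ?, ?)"

// writeAll inserts every accepted record, marks failed inserts as rejected,
// and returns the number of rows inserted.
func writeAll(db execer, recs []laptop) int {
	inserted := 0
	for i := range recs {
		if recs[i].Rejected {
			continue // rejected at an earlier stage
		}
		_, err := db.Exec(insertSQL, recs[i].Brand, recs[i].Model, recs[i].CPUCore, recs[i].MemorySize)
		if err != nil {
			recs[i].Rejected = true
			continue
		}
		inserted++
	}
	return inserted
}

// fakeDB simulates a failure for one brand so the sketch runs without MySQL.
type fakeDB struct{ failBrand string }

func (f fakeDB) Exec(query string, args ...interface{}) (sql.Result, error) {
	if len(args) > 0 && args[0] == f.failBrand {
		return nil, errors.New("simulated insert failure")
	}
	return nil, nil
}

func main() {
	recs := []laptop{
		{Brand: "Lenovo", Model: "ThinkPad X1", CPUCore: 8, MemorySize: 16},
		{Brand: "Dell", Model: "XPS 13", CPUCore: 4, MemorySize: 8},
	}
	n := writeAll(fakeDB{failBrand: "Dell"}, recs)
	fmt.Println("inserted:", n)
}
```

Depending on a small interface rather than on *sql.DB directly keeps the reject logic testable without a database connection.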


After all 3 components are in place, we can create our main Go program (main.go) and launch the JobLauncher from the main function (please see below).


source code of main.go


The Job Launcher allows us to define multiple processors and writers, in case your batch job needs separate sets of processing logic or needs to output the data to multiple writers. Hence, we need to create a slice for ProcessorList and a slice for WriterList, and append our LaptopProcessor and LaptopWriter to them.
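
The list-of-components idea can be illustrated with a toy launcher. The launcher struct, the step function type and the run method below are invented for this sketch and do not reflect the real JobLauncher API; the point is only how processors and writers are appended to slices and executed in order.

```go
package main

import "fmt"

type record struct {
	Value    string
	Rejected bool
}

// step is a hypothetical processor or writer; it may mark records as rejected.
type step func(recs []record)

// launcher is a toy stand-in for the framework's Job Launcher.
type launcher struct {
	processors []step
	writers    []step
}

// countAccepted returns how many records have not been rejected.
func countAccepted(recs []record) int {
	n := 0
	for _, r := range recs {
		if !r.Rejected {
			n++
		}
	}
	return n
}

// run applies every processor, then every writer, and returns the number of
// records still accepted after each step.
func (l launcher) run(recs []record) []int {
	var accepted []int
	for _, p := range l.processors {
		p(recs)
		accepted = append(accepted, countAccepted(recs))
	}
	for _, w := range l.writers {
		w(recs)
		accepted = append(accepted, countAccepted(recs))
	}
	return accepted
}

// rejectEmpty is a sample processor that rejects empty values.
func rejectEmpty(recs []record) {
	for i := range recs {
		if recs[i].Value == "" {
			recs[i].Rejected = true
		}
	}
}

// printAccepted is a sample writer that prints the accepted records.
func printAccepted(recs []record) {
	for _, r := range recs {
		if !r.Rejected {
			fmt.Println("write:", r.Value)
		}
	}
}

func main() {
	var processorList []step
	processorList = append(processorList, rejectEmpty)
	var writerList []step
	writerList = append(writerList, printAccepted)

	job := launcher{processors: processorList, writers: writerList}
	recs := []record{{Value: "a"}, {Value: ""}, {Value: "b"}}
	fmt.Println("accepted after each step:", job.run(recs))
}
```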

Once everything is set, we can call the Run method to execute the batch. To test the program (main.go), open a terminal in the project folder (batchsample2) and run "go run main.go". You should see a statistics report generated after the execution, showing the number of records processed and rejected at the Reader, Processor and Writer.

That's all I have for this example. I hope you find this article useful when you need to build a batch execution program using Go. Do drop me a comment if you need further clarification. Thank you.


output of the batch execution



