Card Data Processing Pipeline

NotaAI, 2019. 9. ~ 2019. 12.

Project Summary

This project was conducted during my internship at NotaAI. It periodically computed statistical indicators from over 100 million card transaction records accumulated per store and delivered them to the client. I designed the overall pipeline shown below.

card-data-processing-pipeline

  1. Admin Page (ReactJS & nginx): Users can request or monitor data processing jobs here.
  2. API Server (Flask & gunicorn): Receives a request from the user, then creates a job, splits it into tasks, and loads them into the broker.
  3. Broker (Redis): Holds the tasks, divided into per-store units.
  4. Worker (Celery): Pulls a task from the broker and extracts the statistical indicators.
  5. Outside DB (PostgreSQL): Each worker reads the card transaction data from it.
  6. Own DB (MySQL): Statistical indicators produced by each worker are stored here.
  7. Retrial: When all tasks are complete, failed tasks are retried once. The generated data is then written to a CSV file and uploaded for the client.
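The per-store split and indicator extraction above can be sketched as plain functions. This is a minimal, illustrative sketch: the function and field names (`split_job`, `compute_indicators`, `store_id`, `amount`) are assumptions, and the real pipeline dispatched each per-store task through Celery and Redis rather than calling the functions directly.

```python
from statistics import mean, median

def split_job(transactions):
    """Group raw transaction rows into one task per store (step 2)."""
    tasks = {}
    for row in transactions:
        tasks.setdefault(row["store_id"], []).append(row["amount"])
    # Each (store_id, amounts) pair would become one task in the broker.
    return tasks

def compute_indicators(amounts):
    """Per-store statistics a worker would produce (step 4)."""
    return {
        "count": len(amounts),
        "total": sum(amounts),
        "mean": mean(amounts),
        "median": median(amounts),
    }

# Example: three transactions across two stores.
rows = [
    {"store_id": "A", "amount": 100},
    {"store_id": "A", "amount": 300},
    {"store_id": "B", "amount": 50},
]
results = {sid: compute_indicators(am) for sid, am in split_job(rows).items()}
```

Splitting by store keeps each task independent, so workers can run in parallel and a failed store can be retried in isolation (step 7) without recomputing the others.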

Role

Tech Stack

Results