
Some details
The AWS Glue Python Spark development team at chrisranjana.com completed a Python Spark project.
The client wanted million-row CSV files transformed and loaded into an AWS Redshift data warehouse.
Our Spark developers wrote complex Apache Spark transformation scripts for this.
AWS Glue was used to run the Python Spark scripts.
The Glue job was triggered periodically using an AWS CloudWatch cron schedule.
Custom AWS Glue classifiers were used, including Grok classifiers created by our AWS Glue developers.
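As an illustration of the periodic trigger, here is a minimal sketch of a daily cron schedule for a Glue job. It is not the project's actual configuration: the job name, the 02:00 UTC run time, and the helper names are hypothetical, and the sketch assumes a Glue scheduled trigger (which accepts the same cron syntax as CloudWatch scheduled rules) rather than whatever wiring was used in production.

```python
def daily_schedule(hour_utc: int, minute: int = 0) -> str:
    """Build an AWS cron expression that fires once a day (times are UTC)."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # AWS cron has six fields: minute hour day-of-month month day-of-week year.
    # One of day-of-month / day-of-week must be "?".
    return f"cron({minute} {hour_utc} * * ? *)"


def trigger_params(job_name: str, hour_utc: int) -> dict:
    """Keyword arguments for the Glue create_trigger call (hypothetical names)."""
    return {
        "Name": f"{job_name}-daily",
        "Type": "SCHEDULED",
        "Schedule": daily_schedule(hour_utc),
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def create_trigger(params: dict) -> None:
    """Create the trigger; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the helpers above work without it

    boto3.client("glue").create_trigger(**params)


if __name__ == "__main__":
    # "csv-to-redshift" is a made-up job name used only for this example.
    print(trigger_params("csv-to-redshift", hour_utc=2))
```

Keeping the schedule string in a small helper makes it easy to check the expression before anything is sent to AWS.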
The structure of the CSV files was complex, with some columns containing embedded JSON data.
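To show what a CSV with embedded JSON can look like, here is a small sketch that flattens a JSON column into ordinary columns. It uses Python's standard csv and json modules instead of Spark so it can run anywhere, and it is not the project's transformation script. The column names (`id`, `name`, `attributes`) and the sample rows are invented for the example.

```python
import csv
import io
import json

# Hypothetical sample: the "attributes" column holds a JSON document.
SAMPLE = '''id,name,attributes
1,widget,"{""color"": ""red"", ""size"": {""w"": 2, ""h"": 3}}"
2,gadget,"{""color"": ""blue""}"
'''


def flatten(obj: dict, prefix: str = "") -> dict:
    """Turn nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def parse_rows(text: str, json_column: str = "attributes") -> list:
    """Read CSV text and replace the JSON column with flattened columns."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # An empty cell is treated as an empty JSON object.
        embedded = json.loads(record.pop(json_column) or "{}")
        record.update(flatten(embedded, prefix=f"{json_column}_"))
        rows.append(record)
    return rows


if __name__ == "__main__":
    for row in parse_rows(SAMPLE):
        print(row)
```

In a Spark job the same idea would typically be expressed by parsing the JSON column against a schema and selecting its fields, so that the flattening runs in parallel across the cluster.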
Our Python development team wrote the Glue transformation scripts so that the process completed efficiently and accurately even for tens of millions of rows of data.