Be part of the Data Platform team, which owns and is responsible for the Data Platform: Cloudera Data Platform (CDP) and its ecosystem, both on-premise and on-cloud
Understand business requirements from business MIS users in order to provision the Data Platform to support them
Design, implement, configure, and tune the Data Platform for robustness, performance, scalability, high availability, security, and high data quality
Design and create a data ingestion framework and templates for all data ingestion patterns, such as structured, semi-structured, unstructured, and streaming data, using NiFi, Airflow, Python, and Spark
Create an ETL and data pipeline framework and templates for loading all data patterns, such as structured, semi-structured, and unstructured data, into various types of databases and data stores such as Hive, HBase, Impala, MongoDB, and graph databases, using batch and streaming mechanisms that leverage on-premise and on-cloud Big Data architecture
Create a framework and templates for the data serving layer for business users, business intelligence, downstream applications, and external entities, in various output formats such as JSON, delimited files, and APIs, using Python with API frameworks such as Django and Flask
Able to develop programs / applications for multi-core / multi-threads architecture including
Advise on performance optimizations and best practices for scalable data models, pipelines, and queries
Engineer the data catalog, data lineage, and business and technical metadata, and build API services
Engineer security, including authentication, authorization, and data privacy, for One Data Platform and its ecosystem
Engineer data life cycle and data backup and restore
Work in an agile methodology with other scrum teams, as well as on SDLC-type projects
Conduct research and development on new / innovative ideas and proof-of-concepts with open-source software in order to modernize Krungsri One Data Platform
Support problem solving on One Data Platform by conducting root cause analysis and providing interim, short-term, and long-term solutions as well as preventive maintenance, working with other IT teams and vendors
Automate ETL orchestration processes with ecosystem tools such as Oozie and Airflow
Engineer CI/CD for all code objects built on Krungsri One Data Platform to achieve high code quality, automated testing, and other best practices
Monitor, optimize, and tune performance for the Dev, Test, Production, and DR environments
Engineer Event-Based and Streaming data integration
Engineer On-Premise and On-Cloud environments
Model and design modern data structures, SQL/NoSQL stores, and Data Lake cloud storage for the highest performance
Engineer platform for data science (similar to Kaggle / Colab)
Qualifications:
Bachelor’s degree in Computer Science, Applied Mathematics, Information Technology, Information Systems, Statistics, Engineering, or any other technology related field
More than 5 years of experience in the banking and financial industry
Experience in Big Data architecture and its ecosystem across hardware, software, network, and security
Experience working with various forms of data infrastructure, including relational (SQL) databases, Hadoop, and Spark
Experience working on projects within a collaborative setting composed of cross-functional, technical, and non-technical personnel
Broad knowledge of different types of data storage engines: relational and non-relational, row-oriented and column-oriented databases
Hands-on experience with at least two of them, e.g. Postgres, MySQL, QB/Redshift, Elastic
Experience with orchestration tools (Airflow preferred)
Advanced query language (SQL) knowledge
Experience working with batch and real-time data processing
Strong coding skills in Python (preferred)
Strong knowledge of EDW/BI technologies and industry best practices
Please note that we will get in touch with shortlisted candidates only.
Dental insurance, Five-day work week, Flexible working hours, Free shuttle bus, Life insurance, Medical insurance, Performance bonus, Travel allowance, Work from home