for large financial institutions. This is an end-to-end role where you'll start with presales and architecture - gathering requirements, designing solutions, establishing governance frameworks - then progress to implementing your designs through to MVP delivery.
Our Focus: Banking and Financial Services clients with stringent regulatory requirements (Basel III, MAS TRM, PCI-DSS, GDPR). You'll architect data lake solutions for critical use cases like AML reporting, KYC data management, and regulatory compliance, ensuring robust data governance, metadata management, and data quality frameworks.
Your Impact: Design end-to-end data architectures combining GCP data services (BigQuery, Dataflow, Data Catalog, Dataplex) with on-premise systems (e.g., Oracle). Establish data governance frameworks with cataloging, lineage, and quality controls. Then build your designs, implementing data pipelines and governance tooling, and delivering working MVPs for mission-critical banking systems.
Duration: Part-time, long-term engagement with project-based allocations
Reporting: Reports directly to the Head of Cloud
Objective
-------------
Design and deliver data lake solutions for banking clients on Google Cloud Platform:
Architecture Excellence: Design data lake architectures, create technical specifications, and lead requirements gathering and solution workshops
MVP Implementation: Build your designs by implementing data pipelines, deploying governance frameworks, and delivering working MVPs with data quality controls in place
Data Governance: Establish and implement comprehensive governance frameworks, including metadata management, data cataloging, data lineage, and data quality standards
Client Success: Own the full lifecycle from requirements to MVP delivery, ensuring secure, compliant, and scalable solutions aligned with banking regulations and GCP best practices
Knowledge Transfer: Create reusable architectural patterns, data governance blueprints, implementation code, and comprehensive documentation
KPI
-------
Produce comprehensive data architecture documentation and a governance framework
Deliver MVPs from approved architecture through to working implementation
Establish data governance implementations including metadata catalogs, lineage tracking, and quality monitoring
Achieve 80%+ client acceptance rate on proposed data architectures and technical specifications
Implement data pipelines with built-in data quality checks and comprehensive monitoring
Create reusable architectural patterns and IaC modules for banking data lakes and regulatory reporting systems
Document solutions aligned with banking regulations (Basel III, MAS TRM, AML/KYC requirements)
Deliver cost models and ROI calculations for data lake implementations
Areas of Responsibility
---------------------------
Phase 1: Data Architecture & Presales
Elicit and document requirements for data lakes, reporting systems, and analytics platforms
Design end-to-end data architectures: ingestion patterns, storage strategies, processing pipelines, consumption layers
Create architecture diagrams, data models (dimensional, data vault), technical specifications, and implementation roadmaps
Data Governance Design: Design metadata management frameworks, data cataloging strategies, data lineage implementations, and data quality monitoring
Evaluate technology options and recommend the optimal GCP and on-premise data services for specific banking use cases
Calculate ROI and TCO and prepare cost-benefit analyses for data lake implementations (a minimal cost-model sketch follows this list)
Banking Domain: Design solutions for AML reporting, KYC data management, regulatory compliance, and risk reporting
Hybrid Cloud Architecture: Design integration patterns between GCP and on-premise platforms (e.g., Oracle, SQL Server)
Security & compliance architecture: IAM, VPC Service Controls, encryption, data residency, audit logging
Participate in presales activities: technical presentations, client workshops, demos, proposal support
Create detailed implementation roadmaps and technical specifications for development teams
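To make the cost-modelling work above concrete (see the ROI/TCO item), here is a minimal sketch of the kind of calculation involved. It is illustrative only: every line item, rate, and figure below is a hypothetical assumption, not actual GCP pricing or client data.

```python
# Minimal, illustrative TCO/ROI sketch for a data lake proposal.
# All line items, rates, and volumes are hypothetical assumptions
# for demonstration; a real model would use client-specific pricing.

ASSUMED_MONTHLY_COSTS = {
    "bigquery_storage": 2_000.0,   # assumed active + long-term storage
    "bigquery_compute": 6_500.0,   # assumed on-demand / slot spend
    "dataflow": 3_000.0,
    "composer": 1_200.0,
    "networking_egress": 800.0,
}

def tco(months: int, one_off_migration: float) -> float:
    """Total cost of ownership over the horizon, in the proposal currency."""
    return one_off_migration + months * sum(ASSUMED_MONTHLY_COSTS.values())

def roi(benefit: float, cost: float) -> float:
    """Simple ROI ratio: (benefit - cost) / cost."""
    return (benefit - cost) / cost

if __name__ == "__main__":
    horizon_months = 36
    cost = tco(horizon_months, one_off_migration=150_000.0)
    # Assumed benefit: retired legacy licences plus analyst time saved.
    benefit = horizon_months * 20_000.0
    print(f"{horizon_months}-month TCO: {cost:,.0f}")
    print(f"ROI: {roi(benefit, cost):.1%}")
```

A real proposal model would pull per-service rates from the GCP pricing calculator and client volumetrics, but the structure stays the same: recurring run costs plus one-off migration spend, weighed against quantified benefits.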
Phase 2: MVP Implementation & Delivery
Build production data pipelines based on approved architectures
Implement data warehouses: schema creation, partitioning, clustering, optimization, security setup
Deploy data governance frameworks: Data Catalog configuration, metadata tagging, lineage tracking, quality monitoring
Develop data ingestion patterns from on-premise systems
Write production-grade code for data transformation, validation, and business logic
Develop Python applications for data processing automation, quality checks, and orchestration
Build data quality frameworks with validation rules, anomaly detection, and alerting (a minimal validation sketch follows this list)
Create sample dashboards and reports for business stakeholders
Implement CI/CD pipelines for data pipeline deployment using Terraform
Deploy monitoring, logging, and alerting for data pipelines and workloads
Tune performance and optimize costs for production data workloads
Document implementation details, operational runbooks, and knowledge transfer materials
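As a concrete illustration of the data quality work in this phase (see the validation-rules item above), below is a minimal sketch of a null-rate check run against BigQuery with the google-cloud-bigquery client. The project, dataset, table, and column names are hypothetical placeholders, and the alerting hook is stubbed; a production framework would wire this into orchestration (e.g., Composer) and incident tooling.

```python
# Minimal data-quality validation sketch against BigQuery.
# Requires: pip install google-cloud-bigquery
# Table and column names below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical rule: customer records must carry a non-null KYC status.
CHECK_SQL = """
SELECT
  COUNT(*) AS total_rows,
  COUNTIF(kyc_status IS NULL) AS null_kyc_rows
FROM `my-project.banking_lake.customers`  -- placeholder table
"""

def run_null_check(max_null_ratio: float = 0.0) -> None:
    """Fail if the share of rows missing kyc_status exceeds the threshold."""
    row = next(iter(client.query(CHECK_SQL).result()))
    ratio = row.null_kyc_rows / row.total_rows if row.total_rows else 0.0
    if ratio > max_null_ratio:
        # In production this would page on-call or open an incident
        # instead of raising; alerting integration is out of scope here.
        raise ValueError(
            f"DQ check failed: {row.null_kyc_rows}/{row.total_rows} "
            f"rows missing kyc_status ({ratio:.2%})"
        )

if __name__ == "__main__":
    run_null_check()
```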
Skills & Knowledge
-----------------------
Certifications & Core Platform:
GCP Professional Cloud Architect (strong plus, not mandatory): demonstrates GCP expertise
GCP Professional Data Engineer (accepted as an alternative certification)
Core GCP data services: BigQuery, Dataflow, Pub/Sub, Data Catalog, Dataplex, Dataform, Composer, Cloud Storage, Data Fusion
Must-Have Technical Skills:
Data Architecture (expert level): data lakes, lakehouses, data warehouses, modern data architectures
Data Governance (expert level): metadata management, data cataloging, data lineage, data quality frameworks, hands-on implementation
Banking Domain: AML (Anti-Money Laundering), KYC (Know Your Customer), regulatory reporting, risk management
Financial regulations: Basel III, MAS TRM (Monetary Authority of Singapore Technology Risk Management), PCI-DSS, GDPR
Understanding of banking data flows, reporting requirements, and compliance frameworks
Experience with banking data models and financial services data architecture
Strong Plus:
On-premise data platforms: Oracle, SQL Server, Teradata
Data quality tools: Great Expectations, Soda, dbt tests, custom validation frameworks
Visualization tools: Looker, Looker Studio, Tableau, Power BI
Infrastructure as Code: Terraform for GCP data services
Streaming data processing: Pub/Sub, Dataflow streaming, Kafka integration
Vector databases and search: Vertex AI Vector Search, Elasticsearch (for GenAI use cases)
Communication:
Advanced English (written and verbal)
Client-facing presentations, workshops, and requirement gathering sessions
Technical documentation and architecture artifacts (diagrams, specifications, data models)
Stakeholder management and cross-functional collaboration
Experience
--------------
7+ years in data architecture, data engineering, or solution architecture roles
4+ years hands-on with GCP data services (BigQuery, Dataflow, Data Catalog, Dataplex), with production implementations
3+ years in data governance (MANDATORY): metadata management, data lineage, data quality frameworks, data cataloging