Data Platform & Streaming Engineer
At Fluor, we are proud to design and build projects and careers. We are committed to fostering a welcoming and collaborative work environment that encourages big-picture thinking, brings out the best in our employees, and helps us develop innovative solutions that contribute to building a better world together. If this sounds like a culture you would like to work in, you’re invited to apply for this role.
Job Description
Role Overview
Fluor’s AI Development team is seeking a Data Platform & Streaming Engineer to design, build, and operate secure, scalable data platforms and real-time streaming pipelines that power AI/ML products and analytics across EPC (Engineering, Procurement, and Construction) programs. In this role, you will enable high-quality data ingestion, transformation, governance, and observability across batch and streaming workloads. You will collaborate with ML Engineers, product teams, and domain stakeholders to operationalize reliable datasets and event streams that support predictive insights, automation, and decision intelligence. The ideal candidate brings strong hands-on experience with distributed data systems, streaming frameworks, and cloud-native engineering practices.
Key Responsibilities
- Design and implement scalable data platform components (lake/lakehouse, data marts, event streams) to support AI/ML and analytics use cases.
- Build and maintain real-time and near-real-time streaming pipelines using tools such as Kafka / Azure Event Hubs, Spark Structured Streaming / Flink, and stream processing patterns.
- Develop robust batch ingestion and transformation pipelines (ETL/ELT) using Spark, SQL, and orchestration frameworks, sourcing data from SAP, engineering systems, SuccessFactors, and other enterprise systems.
- Implement data modeling standards (dimensional, Data Vault, medallion architecture) suitable for analytics and ML feature readiness.
- Ensure end-to-end data quality through validation rules, anomaly checks, schema evolution strategies, and automated testing.
- Operationalize pipelines with CI/CD, infrastructure-as-code, version control, and environment promotion standards.
- Establish observability (logging, metrics, tracing), SLOs, and incident response playbooks for data/streaming services.
- Apply data governance controls: lineage, cataloging, retention, access policies, encryption, and privacy-by-design.
- Optimize performance and cost across compute/storage by tuning jobs, partitioning strategies, caching, and streaming backpressure handling.
- Collaborate with AI/ML engineers to enable feature stores, training data pipelines, and online/offline consistency patterns.
- Interface with business/domain stakeholders (e.g., project controls, engineering, supply chain) to translate requirements into data products.
- Document architectures, runbooks, and standards; mentor junior engineers and promote engineering excellence.
Basic Job Requirements
- 5+ years of experience in data engineering, including streaming and distributed processing.
- Strong hands-on experience with streaming platforms (e.g., Kafka, Azure Event Hubs, Confluent, Pulsar) and patterns (event-driven architecture, CDC, exactly-once/at-least-once delivery semantics).
- Proficiency in Spark (PySpark/Scala) and SQL; experience with Spark Structured Streaming or equivalent.
- Experience building data platforms in the cloud (preferably Azure): ADLS, Databricks, Synapse, Data Factory, Event Hubs, Functions, and AKS.
- Strong software engineering fundamentals: Python/Scala/Java, APIs, data structures, reliability patterns.
- Familiarity with data lakehouse concepts, file formats (Delta/Iceberg/Hudi, Parquet), and schema management.
- Experience with CI/CD (Azure DevOps/GitHub Actions), Git, and IaC (Terraform/Bicep/ARM).
- Understanding of security fundamentals: IAM/RBAC, secrets management, encryption, and compliance-aware data handling.
Other Job Requirements
Preferred Qualifications
- Experience implementing CDC using Debezium, Kafka Connect, or cloud CDC services.
- Knowledge of ML data enablement: feature engineering pipelines, feature stores, training/serving data consistency.
- Experience with data governance tooling: Purview, Data Catalog, lineage/metadata management.
- Exposure to containerization/orchestration (Docker, Kubernetes/AKS) for data services.
- Experience with time-series/IoT or industrial data streams (e.g., sensors, telemetry), or EPC domain datasets.
- Familiarity with test automation for data pipelines (Great Expectations, Deequ, custom frameworks) and data contract testing.
- Certifications (optional): Azure Data Engineer Associate, Databricks certifications, or Kafka/Confluent certifications.
- Proven experience supporting real-time streaming workloads and platform reliability in enterprise environments.
To Be Considered, Candidates:
Must be authorized to work in the country where the position is located.
We are an equal opportunity employer. All qualified individuals will receive consideration for employment without regard to race, color, age, sex, sexual orientation, gender identity, religion, national origin, disability, veteran status, genetic information, or any other criteria protected by governing law.