Big Data Developer
Nexus Cognitive
Software Engineering  
Posted on Oct 24, 2025
Description
The Big Data Developer plays a key role in modernizing data ecosystems, supporting the migration of legacy MapR/Cloudera/Hortonworks applications to open-source frameworks compatible with NexusOne.
This individual will focus on refactoring, optimizing, and validating data processing pipelines to ensure performance, scalability, and alignment with enterprise data standards.
The role requires strong technical expertise across distributed data systems, open-source frameworks, and hybrid data environments.
Core Responsibilities
- Analyze, refactor, and modernize Spark/MapReduce/Hive/Tez jobs for execution within NexusOne’s managed Spark and Trino environments (see the illustrative sketch after this list).
- Design, build, and optimize batch and streaming pipelines using Spark, NiFi, and Kafka.
- Convert existing ETL jobs and DAGs from Cloudera/MapR ecosystems to open-source equivalents.
- Collaborate with Data Engineers and Architects to define new data ingestion and transformation patterns.
- Tune performance across large-scale data processing workloads (partitioning, caching, resource allocation).
- Implement data quality and validation frameworks to ensure consistency during migration.
- Support code reviews, performance tests, and production readiness validation for migrated workloads.
- Document conversion approaches, dependencies, and operational runbooks.
- Partner with Wells Fargo application subject-matter experts (SMEs) to ensure domain alignment and business continuity.
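
For illustration only, the following is a minimal sketch of the kind of refactoring this role involves: a legacy Hive-style batch aggregation re-expressed as a PySpark job with explicit shuffle tuning, partitioned output, and a fail-fast validation step. All table, column, and path names are hypothetical and not drawn from any actual NexusOne workload.

```python
# Illustrative only: a legacy Hive INSERT OVERWRITE job re-expressed as a
# PySpark batch pipeline. Table, column, and path names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("daily-transactions-refactor")
    # Tune shuffle parallelism for the workload instead of the 200 default.
    .config("spark.sql.shuffle.partitions", "400")
    .getOrCreate()
)

# Read the raw layer (Parquet on S3/HDFS in the legacy layout).
raw = spark.read.parquet("s3a://example-bucket/raw/transactions/")

# Transformation formerly embedded in HiveQL: filter, derive, aggregate.
daily = (
    raw.filter(F.col("status") == "SETTLED")
       .withColumn("txn_date", F.to_date("txn_ts"))
       .groupBy("txn_date", "account_id")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("txn_count"))
)

# Lightweight validation before publishing: fail fast on an empty result.
if daily.limit(1).count() == 0:
    raise ValueError("Validation failed: aggregated output is empty")

# Write partitioned Parquet; an Iceberg table write would be analogous.
(daily.write
      .mode("overwrite")
      .partitionBy("txn_date")
      .parquet("s3a://example-bucket/curated/daily_transactions/"))
```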
 
Requirements
Key Skills & Tools
- Core Frameworks: Apache Spark, PySpark, Airflow, NiFi, Kafka, Hive, Iceberg, Oozie
- Programming Languages: Python, Scala, Java
- Data Formats & Storage: Parquet, ORC, Avro, S3, HDFS
- Orchestration & Workflow: Airflow, dbt
- Performance Optimization: Spark tuning, partitioning strategies, caching, YARN/K8s resource tuning
- Testing & Validation: Great Expectations, Deequ, SQL-based QA frameworks (see the reconciliation sketch after this list)
- Observability & Monitoring: Datadog, Grafana, Prometheus
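
As a flavor of the SQL-based QA called out above, here is a minimal reconciliation sketch comparing a legacy table against its migrated counterpart. The table names (legacy.daily_transactions, curated.daily_transactions) and the specific checks are hypothetical assumptions for illustration, not a prescribed framework.

```python
# Illustrative only: SQL-based reconciliation checks of the kind used to
# validate a migrated pipeline. Table names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-qa").getOrCreate()

CHECKS = {
    # Row counts should match between the legacy and migrated outputs.
    "row_count": """
        SELECT
          (SELECT COUNT(*) FROM legacy.daily_transactions)  AS legacy_rows,
          (SELECT COUNT(*) FROM curated.daily_transactions) AS migrated_rows
    """,
    # Key aggregates should reconcile after rounding.
    "total_amount": """
        SELECT
          (SELECT ROUND(SUM(total_amount), 2) FROM legacy.daily_transactions)  AS legacy_sum,
          (SELECT ROUND(SUM(total_amount), 2) FROM curated.daily_transactions) AS migrated_sum
    """,
}

for name, sql in CHECKS.items():
    legacy_val, migrated_val = spark.sql(sql).first()
    status = "PASS" if legacy_val == migrated_val else "FAIL"
    print(f"{name}: legacy={legacy_val} migrated={migrated_val} -> {status}")
```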
 
Ideal Background
- 4–8 years of experience in big data engineering or application modernization in enterprise settings.
- Prior experience with Cloudera, MapR, or Hadoop ecosystems, including transitions to open-source frameworks.
- Strong understanding of distributed data architectures and data lake design principles.
- Exposure to hybrid or cloud-native environments (AWS, GCP, or Azure).
- Familiarity with regulated environments (financial services, telecom, healthcare) is a plus.
 
Success Criteria
- Successful refactoring and execution of legacy data pipelines within NexusOne environments.
- Measurable performance improvements (execution time, cost optimization, data quality metrics).
- Delivered migration artifacts, including conversion patterns, reusable scripts, and playbooks.
- Positive feedback from Wells Fargo application owners on migration support and knowledge transfer.
- Consistent adherence to coding standards, documentation, and change management practices.