Project Manager (Big Data) 8/24/2016
JOB DESCRIPTION
We are looking for a PROJECT MANAGER (BIG DATA) for our client in REDMOND, WA.
JOB TITLE: PROJECT MANAGER (BIG DATA)
JOB LOCATION: REDMOND, WA
JOB TYPE: CONTRACT - 12 MONTHS / CONTRACT TO HIRE / DIRECT HIRE
"US citizens and those authorized to work in the US are encouraged to apply. We are unable to sponsor H1b candidates at this time."
* Build the Event Hubs (EH) integration with the Service Fabric microservices implementation, streaming the processed files from blobs into EH for downstream processing.
* Anonymized files (~1000 of them, ~GB in size) will be provided as input.
* The Service Fabric code portion will be provided.
* Build the Spark processing that reads from Event Hubs; an implementation in either Python or Scala would suffice.
* Look at the caching needs; leverage .cache() to keep datasets that are reused across Spark 'Actions' in the Spark executors.
* Our team will evaluate a set of data stores that could serve as the landing spot after Spark, with Blobs being a required one. We will pick 1 or 2 from this list: SQL DW, Azure SQL DB, Cassandra and DocumentDB are the other candidate stores, and we will have code snippets and/or guidance.
INTEGRATION & DEPLOYMENT
* Integrate the items from above with the completed items (Azure Data Factory with ARM provisioning, picking up from the ADF pipeline that lands files onto blobs).
* Apply best practices for capacity planning and deployment for E2E.
* Integrate the deployment with the existing set of tools and processes.
* Build a unit test framework that can test each building block in isolation (ADF to Blobs, Blobs to Service Fabric, Service Fabric to EH, EH to Spark, Spark
* Build an E2E test environment with telemetry on latency and throughput, with percentiles.
* Leverage APM tools as appropriate.
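
For candidates wondering what the latency and throughput telemetry item might look like in practice, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names (`percentile`, `summarize`), the nearest-rank percentile method, the sample latencies, and the 2-second measurement window are all hypothetical and are not part of the client's tooling.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms, window_seconds):
    """Summarize one measurement window of per-message E2E latencies."""
    return {
        "count": len(latencies_ms),
        "throughput_per_s": len(latencies_ms) / window_seconds,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Hypothetical latencies (milliseconds) collected over a 2-second window.
latencies = [12.0, 15.0, 11.0, 240.0, 14.0, 13.0, 16.0, 18.0, 17.0, 19.0]
report = summarize(latencies, window_seconds=2.0)
print(report)
```

Percentiles are reported rather than an average because a single slow message (240 ms in the sample) dominates the tail while leaving the median almost unchanged.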