You are a hands-on, dynamic individual with 3 to 6 years of deep Spark/Scala streaming data development experience, and you are excited to lead our backend product development. You want to grow your skills at an exciting, nimble, responsive company. You are ready to put in the effort and time to get to the next level. You are an adventurer and a team player.
Whether it’s with your team, C-level executives or prospects, you can adapt to any situation.
What you will bring to the team and the company
An ideal combination of education and experience. You have a university or college degree. Your past experience includes extensive hands-on Spark and Scala core development, process design, testing and architecture decisions. You know how to fully exploit the potential of Spark data streaming.
An appreciation for collaboration. You may be competitive in nature, but you have a deep appreciation for the team with whom you work. You are ready to be supported by a dynamic development team and, in return, to support its work.
A genuine and charismatic personality while being a keen problem solver.
A strong desire to win. You want to be on the winning team and you are driven to do whatever it takes.
What you will be doing
You will be in a lead development role for Data Sentinel, a high-growth software company that has developed a sensitive information intelligence platform that helps businesses identify, inventory, categorize, track and trace sensitive data within the enterprise. We help companies know exactly what is in their data, no matter the source, the location, the type of data, or the scale. Our technology runs persistently within the business, constantly measuring data usage and placement against policies. We then trigger remediation actions, lowering risk while delivering compliant, governed and correct data back to the business.
You will be part of a product development team, reporting to the SVP of Engineering, with the goal of finding innovative solutions for reading and processing vast amounts of raw data from various systems and in various formats using Spark. This involves advanced data pipelines that will be embedded into our product.
- Design & develop Scala/Spark processes for data discovery
- Produce unit tests for Spark transformations and helper methods
- Write Scaladoc-style documentation with all code
- Design data processing pipelines
- Scala (with a focus on the functional programming paradigm)
- Apache Spark 2.x
- Apache Spark RDD API
- Apache Spark SQL DataFrame API
- Apache Spark Streaming API
- Containerization experience (Docker & Kubernetes)
- Spark query tuning and performance optimization
- SQL database integration (Microsoft, Oracle, Postgres, MySQL, etc.)
- Experience working with HDFS, S3, Cassandra, and/or DynamoDB
- Experience with document processing under Spark Streaming
- Experience with Kafka & ZooKeeper
- Understanding of distributed systems