Digital business is driving a fundamental shift to cloud-native applications, creating a new set of operational and performance challenges that currently available solutions are ill-suited to handle. At OpsCruise, we imagine a world of autonomous operations and are innovating a fundamentally different approach to performance management. OpsCruise’s vision is to automate the performance assurance of cloud applications using a model-driven closed-loop platform.
The OpsCruise team is a global, talented group of domain experts in IT Operations, Networking, Storage, Hyperscale Systems and AI/ML who have built market-leading solutions at companies such as Cisco, Google, Hitachi, HP, Infoblox, Oracle and VMware, among others.
Our engineering culture values creativity, pragmatism, honesty, and simplicity to solve hard problems the right way.
We are looking for a Data Scientist to join our team and develop and implement a variety of machine learning (ML) algorithms and models for predictive and prescriptive AIOps on our cloud-based SaaS platform.
Our Technology Stack
Our product involves the following technology areas with one or more tools in use in each area:
- Container technologies, including creating Docker plugins and extensions
- Serverless technologies, including instrumentation and add-ons
- Orchestrators including Kubernetes, OpenShift, Mesos, Swarm
- Metric generation and collection including Prometheus and tools like Dynatrace and Datadog
- Tracing including OpenTracing, Jaeger
- Graph tools and databases including Neo4j, JanusGraph, TinkerPop/Gremlin
- Time-series databases like Prometheus and OpenTSDB
- NoSQL and indexing tools like MongoDB, Cassandra, Solr and Elastic
- Messaging tools including Kafka, Akka
- Big Data tools including HDFS, YARN, Spark, Flink
- AI/ML techniques including Statistical Analysis, Classification, Deep Learning, etc.
- Cloud services: AWS, GCP, and Azure, including their database, networking and ML services
- High-performance user interfaces including AngularJS, Vue, D3.js and local stores
- Authentication and authorization including tools like Okta and Keycloak
Responsibilities
- Research and test novel machine learning approaches for analysing large-scale distributed computing applications
- Develop production-ready implementations of proposed solutions across different AI and ML models and algorithms, including testing on live customer data to improve accuracy, efficacy, and robustness
- Work closely with other functional teams to integrate implemented systems into the SaaS platform
- Suggest innovative and creative concepts and ideas that would improve the overall platform
The ideal candidate must have the following qualifications:
- 5+ years of experience in the practical implementation and deployment of large customer-facing ML-based systems
- MS or M Tech (preferred) in applied mathematics/statistics; CS or Engineering disciplines are acceptable, but candidates must have strong quantitative and applied mathematical skills
- In-depth working familiarity, beyond coursework, with classical and current ML techniques, including both supervised and unsupervised learning algorithms
- Implementation experience with and deep knowledge of Classification, Time Series Analysis, Pattern Recognition, Reinforcement Learning, Deep Learning, Dynamic Programming and Optimization
- Experience in working on modeling graph structures related to spatiotemporal systems
- Programming skills in Python are a must
- Experience in developing and deploying on a cloud platform (AWS, GCP, or Azure)
- Good verbal and written communication skills
- Familiarity with well-known ML frameworks and libraries such as Pandas, Keras, and TensorFlow
Most importantly, you should be someone who is passionate about building new and innovative products that solve tough real-world problems.