Boost your Prometheus with Intelligence and Automation
OpsCruise augments the popular open source (CNCF) Prometheus monitoring framework with contextual application structure and topology information, behavioral analysis, anomaly detection and automated fault isolation.
Created by engineers at SoundCloud and Google, Prometheus is an open source, community-driven project for monitoring modern cloud-native applications and Kubernetes. Key drivers for the rapid adoption of Prometheus include the following:
Community adoption - Prometheus is a graduated CNCF project with a rich community of contributors and users, more than 25,000 stars on GitHub, and the second-highest usage of any CNCF project after Kubernetes.
Growing ecosystem - Application and service developers are adding native Prometheus exporters to their Kubernetes estate. Popular open source visualization tools such as Grafana have built-in support for Prometheus dashboards.
Rich data model and query language - Prometheus incorporates a multi-dimensional data model for its time series metrics data and a flexible query language.
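To make the multi-dimensional data model concrete, here is a minimal Python sketch, not Prometheus code. It assumes a time series is identified by a metric name plus a set of key/value labels, and it mimics an equality selector such as the PromQL expression `http_requests_total{status="500"}`. The `select` helper and the sample series are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# A series is identified by its metric name plus a set of key/value labels.
Series = Tuple[str, Dict[str, str]]


def select(series: List[Series], name: str, **matchers: str) -> List[Series]:
    """Return series with the given metric name whose labels equal every matcher."""
    return [
        (metric, labels)
        for metric, labels in series
        if metric == name
        and all(labels.get(key) == value for key, value in matchers.items())
    ]


# Hypothetical sample data: the same metric name split across label dimensions.
SAMPLE_SERIES: List[Series] = [
    ("http_requests_total", {"job": "api", "status": "200"}),
    ("http_requests_total", {"job": "api", "status": "500"}),
    ("http_requests_total", {"job": "web", "status": "500"}),
    ("node_cpu_seconds_total", {"mode": "idle"}),
]

# Roughly what http_requests_total{status="500"} would select.
errors = select(SAMPLE_SERIES, "http_requests_total", status="500")
```

Each added label narrows the selection along one more dimension, which is what makes the model flexible and also what makes large label sets hard to navigate by hand.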
What’s Needed Beyond Prometheus for Enterprise Observability
More than Metrics
Each component has its own provisioning, configuration, orchestration, execution, resource usage and performance characteristics. While metrics are a key source of real-time insight into a container or service, they are not enough.
An application environment comprises multiple aspects of the application lifecycle. For example, Kubernetes events are needed to map deployments to container instances and the resources allocated to them. Operating on the control plane, OpsCruise non-intrusively and securely ingests such data from the various sources without any impact on the applications.
A Rich Object Model
Metrics are associated with objects such as a container, pod, node or service. OpsCruise understands their relevance by ingesting data from a variety of sources, and uses curated knowledge to build a rich object model and a dynamic object graph for visualization.
The existence of objects is lost in the labels that come with the metrics. The high dimensionality is daunting, and it leads to the sport of metric-spotting rather than an understanding of the many objects involved. For example, a Kafka cluster has consumers, producers, brokers, topics and partitions. OpsCruise recognizes them all and weaves them into one complete environment graph.
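The idea of recovering objects from metric labels can be sketched as follows. This is an illustrative Python sketch, not OpsCruise's implementation. It assumes each metric sample carries `namespace`, `pod`, `container` and `node` labels, which is common for Kubernetes container metrics but depends on the exporter and scrape configuration. The object identifier format is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (parent object id, child object id)


def build_object_graph(
    label_sets: Iterable[Dict[str, str]],
) -> Tuple[Set[str], Set[Edge]]:
    """Derive node/pod/container objects and containment edges from label sets."""
    objects: Set[str] = set()
    edges: Set[Edge] = set()
    for labels in label_sets:
        namespace = labels.get("namespace", "default")
        node = labels.get("node")
        pod = labels.get("pod")
        container = labels.get("container")

        # Hypothetical id scheme: kind/namespace/name.
        node_id = f"node/{node}" if node else None
        pod_id = f"pod/{namespace}/{pod}" if pod else None
        container_id = (
            f"container/{namespace}/{pod}/{container}" if pod and container else None
        )

        for object_id in (node_id, pod_id, container_id):
            if object_id:
                objects.add(object_id)
        if node_id and pod_id:
            edges.add((node_id, pod_id))  # node hosts pod
        if pod_id and container_id:
            edges.add((pod_id, container_id))  # pod contains container
    return objects, edges
```

Many samples collapse into a handful of objects and edges, which is the shift from spotting metrics to reasoning about the objects behind them.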
Around 35% of a microservice application is networking between the various components. OpsCruise provides a non-intrusive mechanism that adds these networking attributes to the object model to create a real-time interaction graph of all the elements that comprise the application.
Microservice applications are complex distributed systems. Therefore, the interactions between the entities have to be identified and tracked in real time by monitoring traffic flows. Knowing who talks to whom is crucial to understanding dependencies. As an example, suppose a container is interacting with an IP address outside the cluster, and that IP address is attached to an AWS RDS MySQL instance. We can then conclude that the container is using that specific database.
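The RDS example above can be sketched in a few lines of Python. This is a hedged illustration of the reasoning only, not how OpsCruise does it. The flow records, the cluster CIDR, and the `inventory` mapping from IP address to cloud resource are all hypothetical inputs; in practice the inventory would have to come from the cloud provider's APIs.

```python
import ipaddress
from typing import Dict, Iterable, List, Tuple

Flow = Tuple[str, str]  # (source container name, destination IP)


def infer_external_dependencies(
    flows: Iterable[Flow],
    cluster_cidr: str,
    inventory: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Map flows that leave the cluster onto known cloud resources."""
    cluster = ipaddress.ip_network(cluster_cidr)
    dependencies: List[Tuple[str, str]] = []
    for source, dest_ip in flows:
        # Traffic that stays inside the cluster is not an external dependency.
        if ipaddress.ip_address(dest_ip) in cluster:
            continue
        resource = inventory.get(dest_ip)
        # Unknown destinations are skipped; repeated flows are recorded once.
        if resource is not None and (source, resource) not in dependencies:
            dependencies.append((source, resource))
    return dependencies
```

Given a flow from a `checkout` container to an address that the inventory attributes to an RDS MySQL instance, the function reports that `checkout` depends on that database.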
Different subsystems emit a large number of metrics. For example, VM metrics alone number close to 640. OpsCruise provides a curated list of the metrics known to matter for performance management, reducing the training and effort required of Operations staff.
While Prometheus collects, stores and queries metrics well, it has no understanding of them. It falls on the SRE staff to write dashboards that have the knowledge embedded in them, creating a fragile, manual and error-prone process. OpsCruise automates all of this because it inherently understands Prometheus and the underlying Kubernetes ecosystem.
Seamless Cloud Services
While the Kubernetes application metrics and objects are gathered via Prometheus, there are other objects related to the cloud services being used. OpsCruise brings those objects and their metrics into the same common real-time object graph and weaves them all together.
An application can include an ELB, RDS and Lambda in AWS. While the ELB will be recognized as an Ingress Kubernetes object, the same is not true for RDS and Lambda. OpsCruise leverages flow tracing and AWS APIs to discover, map and monitor them in one seamless environment.
Modern microservices applications are highly dynamic. Looking back in time at an issue is not easy, because the objects involved may have disappeared. OpsCruise allows a user to go back in time to check what was in the environment at that point.
Objects are ephemeral. Nodes come and go, containers flash by, and IP addresses are created and assigned continuously. OpsCruise maintains snapshots of the environment that capture the complete application state, structure and metrics at those times.
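One way to picture snapshot-based time travel is a store that returns the most recent snapshot taken at or before a requested time. The Python sketch below is an assumption about how such a lookup could work, not a description of OpsCruise internals. The `SnapshotStore` class and its methods are hypothetical.

```python
import bisect
from typing import Dict, List, Optional


class SnapshotStore:
    """Keeps environment snapshots ordered by timestamp."""

    def __init__(self) -> None:
        self._times: List[float] = []
        self._states: List[Dict] = []

    def record(self, timestamp: float, state: Dict) -> None:
        """Insert a snapshot, keeping both lists sorted by timestamp."""
        index = bisect.bisect_right(self._times, timestamp)
        self._times.insert(index, timestamp)
        self._states.insert(index, dict(state))  # shallow copy of the state

    def at(self, timestamp: float) -> Optional[Dict]:
        """Return the latest snapshot taken at or before the timestamp."""
        index = bisect.bisect_right(self._times, timestamp)
        return self._states[index - 1] if index else None
```

Asking for a time between two snapshots returns the earlier one, so a pod that has since disappeared is still visible when investigating an incident that happened while it was running.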
OpsCruise enables governance by providing a historical view of all participants and connections of the application. It also supports InfoSec goals by ensuring that requirements for PII and other sensitive data are met.
OpsCruise ensures that the application is not touched. Only secure outbound connections are used. Integration with OAuth2 allows single sign-on with the organization’s preferred authentication methods. Role-Based Access Control (RBAC) helps isolate data access by function. Additionally, OpsCruise is working toward compliance with several standards, starting with SOC 2, GDPR and HIPAA. Finally, OpsCruise provides enterprise-class phone and online technical support.