Prometheus is a powerful tool for collecting metrics from your applications and keeping an eye on your microservices. In Kubernetes clusters and other environments, it supports fine-grained monitoring down to the level of each individual pod.

Grafana is a visualization tool that queries metrics from Prometheus and plots them on dashboards so that you can monitor how your microservices are performing. You can also use it to create alerts on other dimensions such as uptime or error rate.

Installation and Configuration

Step 1: Configuring the Prometheus Monitor

To set up Prometheus on Linux, follow these steps. Visit the official Prometheus website to find the most recent stable release. Then run the following commands, replacing <version> with the actual release version. Note that the release tag in the URL carries a v prefix, while the archive name does not. On macOS, substitute the matching darwin archive from the releases page.

# Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v<version>/prometheus-<version>.linux-amd64.tar.gz

# Extract the downloaded archive
tar -xvf prometheus-<version>.linux-amd64.tar.gz

# Navigate to the Prometheus directory
cd prometheus-<version>.linux-amd64/

Next, edit the prometheus.yml file:

global:
  scrape_interval: 15s # How frequently to scrape metrics

scrape_configs:
  - job_name: 'microservices' # Name of the monitoring job
    static_configs:
      - targets: ['localhost:9090'] # Endpoint to scrape (localhost:9090 is Prometheus itself; replace with your service's address)

Start Prometheus:

./prometheus --config.file=prometheus.yml

Prometheus should be up and running at http://localhost:9090.
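
Each address listed under targets has to serve metrics over HTTP at /metrics in the Prometheus text exposition format. As a rough illustration of what such an endpoint returns, here is a minimal sketch using only the Python standard library. The counter values, the service names, and port 8000 are assumptions made up for this example; a real service would normally use an official Prometheus client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory request counters, keyed by service name.
REQUEST_COUNTS = {"checkout": 3, "cart": 5}


def render_metrics(counts):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for service, value in sorted(counts.items()):
        lines.append(f'http_requests_total{{service="{service}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes the /metrics path by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve metrics until interrupted; port 8000 is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() and listing 'localhost:8000' under targets in prometheus.yml would let Prometheus scrape these counters.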

Step 2: Using Prometheus Pod Monitor

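Create a PodMonitor resource to monitor the pods. PodMonitor is a custom resource provided by the Prometheus Operator, so the Operator must be installed in the cluster: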

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: microservices-pod-monitor
spec:
  selector:
    matchLabels:
      app: my-microservice
  namespaceSelector:
    matchNames:
    - my-namespace
  podMetricsEndpoints:
  - port: http
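
To make the selection rules in this manifest concrete, the sketch below models them in plain Python: a pod is picked up when it lives in one of the listed namespaces, carries every label in matchLabels, and exposes a container port named http. This is a simplified model for illustration, not the Operator's actual code, and the pod dictionary layout is a hypothetical structure invented for the example.

```python
def labels_match(match_labels, pod_labels):
    """True when every matchLabels pair is present on the pod (extra pod labels are fine)."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


def is_selected(pod, match_labels, match_names, port_name):
    """Simplified model of which pods the PodMonitor above would scrape."""
    return (
        pod["namespace"] in match_names
        and labels_match(match_labels, pod["labels"])
        and port_name in pod["port_names"]
    )


# Hypothetical pod description; the field names are made up for this sketch.
example_pod = {
    "namespace": "my-namespace",
    "labels": {"app": "my-microservice", "tier": "backend"},
    "port_names": ["http"],
}

print(is_selected(example_pod, {"app": "my-microservice"}, ["my-namespace"], "http"))  # prints True
```

A pod with the right labels but without a port named http would not be scraped, which is a common reason for a PodMonitor that appears to match nothing.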

Step 3: Install Grafana for Visualization and Uptime Monitoring

To install Grafana, run the following commands, replacing <version> with the latest stable version listed on the official Grafana download page:

# Download Grafana
wget https://dl.grafana.com/oss/release/grafana-<version>.linux-amd64.tar.gz

# Extract the downloaded file
tar -zxvf grafana-<version>.linux-amd64.tar.gz

# Move into the extracted Grafana directory (depending on the release, it may be named grafana-v<version> instead)
cd grafana-<version>/

Start Grafana by running the following command:

./bin/grafana-server

Grafana should be up and running at http://localhost:3000. To log in, enter “admin” as both the username and the password; Grafana then prompts you to set a new password.

Step 4: Connecting Grafana to Prometheus

  1. Navigate to Configuration > Data Sources in Grafana.
  2. Select Add Data Source > Prometheus.
  3. Enter the Prometheus server URL (http://localhost:9090).
  4. Click Save & Test to confirm the connection.
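
The same data source can also be registered without the UI, through Grafana's HTTP API endpoint POST /api/datasources. The sketch below only builds the request so that it can be inspected before sending. The data source name "Prometheus", the use of basic authentication, and the function name are assumptions for this example.

```python
import base64
import json
from urllib.request import Request


def build_datasource_request(grafana_url, prometheus_url, user, password):
    """Build (but do not send) a request that registers Prometheus as a data source."""
    payload = {
        "name": "Prometheus",      # display name, chosen for this example
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",         # Grafana's backend proxies the queries
        "isDefault": True,
    }
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url=f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


request = build_datasource_request(
    "http://localhost:3000", "http://localhost:9090", "admin", "admin"
)
```

Passing the request to urllib.request.urlopen(request) would send it to a running Grafana instance.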

Step 5: Creating Dashboards and Setting Up a Grafana Uptime Monitor

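  1. Create a New Dashboard:
    • Go to Create > Dashboard.
    • Click Add new panel.
    • Enter a query for the panel, for example the per-service request rate over the last 5 minutes:

sum(rate(http_requests_total[5m])) by (service)

  2. Set Up a Grafana Uptime Monitor: To create an uptime monitor, define a panel that tracks the status of your microservices. The up metric is 1 when the last scrape of a target succeeded and 0 when it failed:

up{job="microservices"}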
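
Queries like these can also be run outside Grafana against the Prometheus HTTP API at /api/v1/query, which is handy for quick checks. The following sketch builds the query URL and extracts the instances that report 0 from an instant-query response. The function names are made up for this example, and the response is assumed to be an instant vector.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def down_instances(payload):
    """Return the instance labels whose sample value is 0 in an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error", "unknown error")))
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        if float(value) == 0:
            down.append(series["metric"].get("instance", "<unknown>"))
    return down


url = instant_query_url("http://localhost:9090", 'up{job="microservices"}')
```

Fetching url with urllib.request.urlopen and decoding the JSON body yields a payload that can be passed to down_instances.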

Step 6: Building Alerts in Grafana

  1. Open the panel you wish to set alerts on.
  2. Go to Alert > Create Alert Rule.
  3. Define the alert conditions (e.g., alert if CPU usage exceeds 70% for more than 5 minutes).
  4. Set up Notification Channels to receive alerts through your preferred channels, such as email or Slack.
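
A condition such as "exceeds 70% for more than 5 minutes" means the threshold has to be breached continuously over that window; a single dip below it resets the timer. The sketch below illustrates that logic on a list of samples. It is a simplified model for building intuition, not Grafana's evaluation engine, and the function name and sample layout are assumptions.

```python
def alert_fires(samples, threshold, for_seconds):
    """Decide whether an alert with a "for" duration should fire.

    samples is a list of (timestamp_seconds, value) pairs in ascending time order.
    The alert fires when every sample since some point has been above the
    threshold and that streak spans at least for_seconds up to the latest sample.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # start of the current streak
        else:
            breach_start = None  # any dip resets the streak
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= for_seconds


# CPU usage sampled once a minute, staying above 70% for five minutes.
steady = [(t, 80.0) for t in range(0, 301, 60)]
print(alert_fires(steady, threshold=70.0, for_seconds=300))  # prints True
```

This is why short spikes do not trigger such an alert, while sustained load does.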

Important Metrics to Monitor

To keep your microservices performant and reliable, monitor the following metrics:

  • Request Rate: Graph the number of requests over time to identify unusual spikes or dips.
  • Error Rate: Track the share of failed requests so you can detect problems before they create a bad experience for your users.
  • Latency: Measure each service’s request/response time and look for bottlenecks when latency is high.
  • Resource Usage: Ensure that the essential resources (CPU, memory, and disk) are not overburdened.
  • Uptime: Regularly check service uptime to ensure your services are consistently available.
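
The arithmetic behind the first three metrics can be sketched in a few lines of Python. The helpers below are simplified illustrations with hypothetical names: the rate calculation ignores the counter resets and extrapolation that the PromQL rate() function handles, and the percentile uses the nearest-rank method on raw samples rather than histogram buckets.

```python
import math


def per_second_rate(first, last):
    """Average per-second increase between two (timestamp, counter_value) samples."""
    (t0, v0), (t1, v1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in ascending time order")
    return (v1 - v0) / (t1 - t0)


def error_percentage(errors, total):
    """Share of failed requests, in percent."""
    return 0.0 if total == 0 else 100.0 * errors / total


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of observations."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


print(per_second_rate((0, 100), (300, 400)))  # prints 1.0
print(error_percentage(5, 200))               # prints 2.5
```

For example, a counter that grows from 100 to 400 over 300 seconds corresponds to one request per second, and 5 failures out of 200 requests is an error rate of 2.5%.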

Troubleshooting

When working with Prometheus and Grafana, you may run into issues with data collection, dashboards, or alerts. Here are the most frequent ones and how to resolve them.

1. Prometheus Troubleshooting

Issue: Prometheus Not Starting

Solutions:

  • Run promtool check config /path/to/prometheus.yml to validate the configuration file. Look for any syntax errors or misconfigurations.
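  • Ensure that the default port (9090) used by Prometheus is not being used by another application. If it is, start Prometheus on a different port with the --web.listen-address flag (for example, --web.listen-address=:9091); the listen port is not set in prometheus.yml.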
  • Make sure Prometheus has the required permissions to access the necessary files and directories.
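
A quick way to test whether something is already listening on port 9090 is to attempt a TCP connection to it. The helper below is a minimal sketch using the standard library; the function name and the one-second timeout are arbitrary choices for this example.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if port_in_use(9090):
    print("Port 9090 is taken; stop the other process or pick another port.")
else:
    print("Port 9090 is free.")
```

If the port turns out to be taken while Prometheus is not running, another application is holding it.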

Issue: Prometheus Not Scraping Targets

Solutions:

  • Check the prometheus.yml file to confirm that the target endpoints are correctly defined, including the right IP addresses and port numbers.
  • Check network connectivity. Use tools like curl or ping to test connectivity to your target endpoints from the Prometheus server. Ensure firewalls or network policies are not blocking the connections.
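
Prometheus also reports the state of every target itself, through the Status > Targets page and the /api/v1/targets HTTP API endpoint. The sketch below filters such a response for targets that are not healthy and returns their last scrape error. The function name is made up for this example, and the payload is assumed to be the decoded JSON body of that endpoint.

```python
def unhealthy_targets(payload):
    """List (scrape_url, last_error) for active targets whose health is not "up"."""
    if payload.get("status") != "success":
        raise ValueError("targets request failed")
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            problems.append((target.get("scrapeUrl", ""), target.get("lastError", "")))
    return problems
```

The lastError text usually points to the cause directly, for example a refused connection or a timeout.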

Issue: High Memory Usage

Solutions:

  • Adjust data retention. Consider reducing the --storage.tsdb.retention.time setting to lower the amount of data Prometheus retains.
  • Optimize queries. Review your queries for inefficiencies, and use functions like rate() or increase() to limit the volume of data processed.

2. Grafana Troubleshooting

Issue: Grafana dashboard not loading

Solutions:

  • Clear the browser cache or use a different browser to see if the problem persists.
  • Check installed plugins (Configuration > Plugins) in Grafana and look for any plugins that are outdated or causing conflicts. Update or disable plugins if needed.

Issue: Data Sources Not Connecting

Solutions:

  • Verify data source settings. Go to Configuration > Data Sources in Grafana and make sure all the settings, including the URL, authentication credentials, and access method, are correct.
  • Test connectivity. Use the Save & Test button in the Data Source settings to check if Grafana can connect to the data source.

Issue: Alerting Not Working

Solutions:

  • Review the alert rules. Go to Alerting > Alert Rules and verify that all alert conditions and thresholds are correctly configured.
  • Check notification channels. Make sure the notification channels are properly configured and that Grafana has access to external services.

Conclusion

Prometheus and Grafana together form a powerful duo for monitoring and managing modern applications, especially in environments where reliability, performance, and real-time insights are critical. Prometheus excels in collecting and storing detailed metrics across your entire system, providing a robust foundation for understanding what’s happening under the hood of your applications. With Grafana, on the other hand, it's possible to convert the complex data into visually intuitive dashboards that make it easier for your team to keep an eye on things in real time.

By following the steps described in this article, you will have a comprehensive solution that lets you monitor the performance of your apps, identify problems early, and make sure everything keeps running as planned.

If you need more features in monitoring, check out ITSyndicate Monitoring Solutions.
