
[[TOC]]

# Filebeat for application logging on legacy VMs

These are brief installation notes for running Filebeat on legacy VMs. Filebeat was requested by the Backoffice Team, who want to adopt Elastic as their default logging platform and move away from Google logging. It remains to be seen how much they will actually use it.

Please note that in this setup Filebeat only collects application logs from Nginx and PHP-FPM; it does not monitor the whole VM. For that, a better approach would be to install Elastic Agent and configure it as part of a monitoring fleet.

# Install and configure Filebeat

Mostly follow the steps from https://www.elastic.co/downloads/beats/filebeat, e.g. (as root):

```shell
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
apt-get update
apt-get install -y filebeat
systemctl enable filebeat
```
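Before starting the service, Filebeat's built-in self-checks can confirm that the configuration parses and, once the output section further below is filled in, that the cluster is reachable. A quick sanity check might look like:

```shell
# Confirm the installed version and validate the configuration
filebeat version
filebeat test config

# After output.elasticsearch is configured (see below),
# check connectivity and authentication against the cluster
filebeat test output

systemctl start filebeat
systemctl status filebeat --no-pager
```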

## Enable the Nginx module

```shell
filebeat modules enable nginx

# enable access and error log parsing + edit default log locations
# if different from OS defaults
vi /etc/filebeat/modules.d/nginx.yml
```
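After editing, a minimal `nginx.yml` might look like the following (the paths are examples; adjust them to the VM's layout):

```yaml
- module: nginx
  # Access logs
  access:
    enabled: true
    # Override only if logs are not in the OS default location
    var.paths: ["/var/log/nginx/access.log*"]

  # Error logs
  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]
```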

## Configure access to Elasticsearch

First create an API key plus metadata in Elastic, as in the example below. Please note that all logs from legacy infra should report to indices starting with `legacy-*`.

Role descriptors:

```json
{
  "dev-nbo-filebeat": {
    "cluster": [
      "monitor",
      "manage_ingest_pipelines",
      "manage_ilm"
    ],
    "indices": [
      {
        "names": [
          "legacy-dev-nbo"
        ],
        "privileges": [
          "read",
          "write",
          "create_index",
          "view_index_metadata"
        ]
      }
    ]
  }
}
```

Metadata:

```json
{
  "created_by": "gnd",
  "responsible": "gnd",
  "created_on": "13.06.2023",
  "cluster": "legacy-infra",
  "desc": "API key for dev nbo legacy VM"
}
```
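Instead of the Kibana UI, the same key can also be created from the command line via the Elasticsearch security API. A sketch (the admin credentials prompted for by `-u elastic` are an assumption; the `role_descriptors` and `metadata` objects are the two JSON documents above):

```shell
curl -s -X POST \
  "https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243/_security/api_key" \
  -u elastic \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "dev-nbo-filebeat",
    "role_descriptors": { "dev-nbo-filebeat": { "...": "as above" } },
    "metadata": { "...": "as above" }
  }'
```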

You will obtain an API key in the "Beats" format. Use it to configure Filebeat on the VM. Edit `/etc/filebeat/filebeat.yml` like this:

NOTE 10/2024: This might be slightly obsolete configuration since we started using

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/php8.2-fpm.log

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

### OLD Filebeat settings
#setup.template.name: "filebeat"
#setup.template.pattern: "filebeat"
#setup.template.settings:
#  index.number_of_shards: 1
#  #index.codec: best_compression
#  #_source.enabled: false
#
#output.elasticsearch:
#  index: "legacy-nbo"
#  hosts: ["https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243"]
#  api_key: "some-api-key"
#
#setup.ilm:
#  enabled: true
#  ilm_check_exists: false

### NEW Filebeat settings (27/08/2024 and again 02.09.2024)
setup.ilm.enabled: false
setup.template.enabled: false
output.elasticsearch:
  index: "legacy-prod-nbo-v4"
  hosts: ["https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243"]
  api_key: "some-api-key"

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
```

# Verify that Filebeat works

See if the index (in this case: `legacy-dev-nbo`) was created and check that there are no errors in `/var/log/syslog`.
Then check in Kibana whether you can see the data from the VM.
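A command-line spot check, using the same API key (host and index name are taken from the configuration above):

```shell
# Any Filebeat errors in syslog?
grep -i 'filebeat.*error' /var/log/syslog | tail

# Does the index exist and is it receiving documents?
curl -s -H "Authorization: ApiKey some-api-key" \
  "https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243/legacy-dev-nbo/_count"
```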

# Upgrading Filebeat Grok pattern

## Introduction

On NBO and Dev NBO we are using a custom Nginx access log format, because the NBO Team relies on an extra field called `request_id`. This value is generated by Nginx and passed as a header along with the request, providing a simple form of Nginx-based request tracing.
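For illustration, a line in this custom format ends with an extra quoted field holding the request id (all values below are made up); a quick shell sketch of pulling it out:

```shell
# Hypothetical access-log line in the custom format;
# note the trailing quoted request id field
line='1.2.3.4 example.com - [13/Jun/2023:10:00:00 +0200] "GET / HTTP/1.1" 200 512 "-" "curl/7.88.1" "a1b2c3d4e5"'

# The request id is the last double-quoted field on the line
req_id=$(printf '%s\n' "$line" | grep -oE '"[^"]*"$' | tr -d '"')
echo "$req_id"
```

It is this extra trailing field that the stock pattern does not expect.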

However, **this breaks** the default Grok pattern in the ingest pipeline for Nginx access logs. It has to be fixed manually by extending the Grok pattern in the pipeline. The resulting Grok pattern looks like this:

```yaml
- grok:
    field: event.original
    patterns:
    - (?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) %{HOSTNAME:destination.domain}
      (-|%{DATA:user.name}) \[%{HTTPDATE:nginx.access.time}\] "%{DATA:nginx.access.info}"
      %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long}
      "(-|%{DATA:http.request.referrer})" "(-|%{DATA:user_agent.original})" "(-|%{DATA:request.id})"
    pattern_definitions:
      NGINX_HOST: (?:%{IP:destination.ip}|%{NGINX_NOTSEPARATOR:destination.domain})(:%{NUMBER:destination.port})?
      NGINX_NOTSEPARATOR: "[^\t ,:]+"
      NGINX_ADDRESS_LIST: (?:%{IP}|%{WORD})("?,?\s*(?:%{IP}|%{WORD}))*
    ignore_missing: true

```

You can change it in the file: `/usr/share/filebeat/module/nginx/access/ingest/pipeline.yml`

## How to fix the Grok pattern after a Filebeat upgrade

As soon as a new Filebeat version is installed on the system as a result of a package upgrade, the extended Grok pattern is lost and replaced by the default one. This breaks log ingestion and display in Kibana.

To fix this, use a file with the extended Grok pattern:

`/usr/share/filebeat/module/nginx/access/ingest/pipeline_ftmo.yml`

These are the steps to be taken on the VM:

1. `systemctl stop filebeat`
2. `dpkg -l | grep filebeat` (note the installed version, here 8.15.2)
3. In Kibana, Stack Management > Ingest Pipelines: delete `filebeat-8.15.2-nginx-access-pipeline`
4. `cp /usr/share/filebeat/module/nginx/access/ingest/pipeline_ftmo.yml /usr/share/filebeat/module/nginx/access/ingest/pipeline.yml`
5. `systemctl start filebeat`
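The steps above can be sketched as a small script. Assumptions: the pipeline is deleted via the Elasticsearch API instead of Kibana, the host is the one from the config above, and `some-admin-api-key` stands in for a key allowed to manage ingest pipelines:

```shell
#!/bin/sh
set -eu

ES_HOST="https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243"
ES_AUTH="ApiKey some-admin-api-key"
MODULE_DIR=/usr/share/filebeat/module/nginx/access/ingest

systemctl stop filebeat

# Determine the installed Filebeat version, e.g. 8.15.2
VERSION=$(dpkg-query -W -f='${Version}' filebeat | cut -d- -f1)

# Delete the default pipeline that the upgrade re-created
curl -s -X DELETE -H "Authorization: $ES_AUTH" \
  "$ES_HOST/_ingest/pipeline/filebeat-$VERSION-nginx-access-pipeline"

# Restore the extended Grok pattern
cp "$MODULE_DIR/pipeline_ftmo.yml" "$MODULE_DIR/pipeline.yml"

systemctl start filebeat
```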

Afterwards, verify that new lines from the access log are correctly ingested.
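Ingestion can also be checked without waiting for live traffic by simulating the pipeline (the pipeline name assumes Filebeat 8.15.2; the log line is made up):

```shell
curl -s -X POST -H "Authorization: ApiKey some-api-key" \
  -H 'Content-Type: application/json' \
  "https://ftmo-observability.es.europe-west3.gcp.cloud.es.io:9243/_ingest/pipeline/filebeat-8.15.2-nginx-access-pipeline/_simulate" \
  -d '{
    "docs": [
      { "_source": { "event": { "original": "1.2.3.4 example.com - [13/Jun/2023:10:00:00 +0200] \"GET / HTTP/1.1\" 200 512 \"-\" \"curl/7.88.1\" \"a1b2c3d4e5\"" } } }
    ]
  }'
```

The response should show the parsed fields, including `request.id`, with no `error` object.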

# Nginx dashboard in Kibana

TBD