
Installing Elastiflow flow monitoring solution

Elastiflow is built upon the ELK stack, so let's get that installed first. You can install everything on a single host (virtual or physical) for lab use, but for production use in high-FPS environments you will want to scale the ELK stack horizontally so it can process and search all of your flow data. I am using RHEL for this guide, so yum will be the package manager, but this will work just as well on Ubuntu or other distributions with your package manager of choice. Elastiflow files and documentation can be found here:

https://github.com/robcowart/elastiflow

I am going to show you how to install Logstash on its own server and Elasticsearch/Kibana together on another. This allows horizontal scaling of Logstash. You can just as easily put Kibana on its own server as well if you plan on building Elasticsearch as a multi-node cluster. If this is just going into a lab, or you have low flows per second (<1000 or so), then you can install all components on a single server. Below is a rough text sketch of this deployment:
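
flow exporters (NetFlow / sFlow / IPFIX, UDP port 2055)
                    |
                    v
Logstash + Elastiflow pipeline  --  10.0.0.1
                    |
                    v
Elasticsearch + Kibana (port 5601)  --  10.0.0.2

(These are the example addresses used throughout this guide; substitute your own.)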

Installing prerequisites

We will run the following commands on all of our servers so that Java is installed and the ELK repo is added to the package manager. First we need to install Java.

sudo yum -y install java-openjdk-devel java-openjdk

Next add the ELK repo and key.

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

Installing Elasticsearch

We will start by installing Elasticsearch and setting it to start on boot.

sudo yum install -y elasticsearch
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service

Lets modify some configs and start Elasticsearch. We will first modify the elasticsearch.yml file:

sudo nano /etc/elasticsearch/elasticsearch.yml

Modify the below variables as shown.

cluster.name: elastiflow
node.name: ${HOSTNAME}
bootstrap.memory_lock: true
network.host: 10.0.0.2 (use the IP of your server; set to "localhost" if running the entire ELK stack on a single server)
http.host: 10.0.0.2 (use the IP of your server; not needed if running the entire ELK stack on a single server)

Add the following additional variables as well to the bottom of the file:

indices.query.bool.max_clause_count: 8192
search.max_buckets: 100000

Since we enabled bootstrap.memory_lock so that Elasticsearch memory is never swapped to disk, we also need to modify the service file, because that is where systemd requires system limits to be specified:

sudo nano /usr/lib/systemd/system/elasticsearch.service

[Service]
LimitMEMLOCK=infinity

Let's also set some Java memory options. By default the heap is set to a min/max of 1 GB, which is fine for a lab, but if you are going into production you want Elasticsearch to have a large heap so it can store its internal data structures. A good rule of thumb is 50% of total system RAM, but no more than 32 GB, which is the threshold above which the JVM can no longer use compressed object pointers. Also set the min and max values the same so the JVM doesn't resize the heap, which is a costly operation.

My Elasticsearch server has 8 GB of RAM so I will set JVM to use 4 GB of it.

sudo nano /etc/elasticsearch/jvm.options

-Xms4G
-Xmx4G

We are now ready to start Elasticsearch:

sudo systemctl daemon-reload
sudo systemctl start elasticsearch.service

And we can verify Elasticsearch is running by sending a request to it from the command line (if you bound Elasticsearch to a specific IP with http.host, use that IP in place of localhost):

[ed48@localhost ~]$ curl http://localhost:9200/_cluster/health?pretty
{
  "cluster_name" : "elastiflow",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
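
Since we set bootstrap.memory_lock, it is also worth confirming the memory lock actually took effect. The nodes API can report this (use the same host and port as above):

curl 'http://localhost:9200/_nodes?filter_path=**.mlockall&pretty'

If memory locking is working, the output will show "mlockall" : true; if it shows false, check the LimitMEMLOCK setting and restart Elasticsearch.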

Installing Kibana

We will be installing Kibana, which is the web front end, on the same server as Elasticsearch, but you can install it on a separate server as well and just point it at Elasticsearch.

sudo yum install -y kibana
sudo systemctl daemon-reload
sudo systemctl enable kibana.service

Next we will edit the kibana.yml config file to tell Kibana which IP to listen on for web requests, as well as where the Elasticsearch server is.

sudo nano /etc/kibana/kibana.yml

server.host: "10.0.0.2"
elasticsearch.hosts: ["http://10.0.0.2:9200"]

Since we bound Elasticsearch to 10.0.0.2 earlier, point elasticsearch.hosts at that IP; http://localhost:9200 only works if everything is running on a single server with the default network settings.

Now we can start Kibana. After a minute or so you should be able to point your browser to the IP of your Kibana server on port 5601 and see the Kibana home page.

sudo systemctl start kibana.service
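
If you prefer to check from the command line first, Kibana also exposes a status API; the IP below assumes the server.host value set above:

curl http://10.0.0.2:5601/api/status

A JSON response reporting an overall state of green means Kibana is up and can talk to Elasticsearch.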

Installing Logstash

Now let's move over to the other server at 10.0.0.1 to install Logstash. Remember, this server will be collecting flows as well as running that data through a custom pipeline to normalize it, add and merge fields, etc., so that it can be easily viewed in Kibana. Logstash is also the component most likely to run out of resources first, which is why we are installing it on its own server so that we can scale it horizontally. The sweet spot is around 4 CPUs, as adding more does not scale linearly. I am running 8 CPUs with 8 GB of RAM (with 6 GB dedicated to the JVM) and currently get around 3000-3500 FPS without dropping packets due to the UDP input buffers filling up. The UDP buffer in Linux is another reason why adding more CPUs gives diminishing returns: Linux assigns only a single CPU core to pull UDP traffic for a given socket off the network card. The remaining CPUs can run Logstash workers to process the data, which helps, but there is a finite limit to how fast that single core can drain the receive buffer.
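
One related tweak worth making on the Logstash server is enlarging the kernel's UDP receive buffers so bursts of flows are not dropped before Logstash can read them. This is only a sketch; the file name and values below are examples (the Elastiflow install notes suggest similar settings), so adjust them to your own flow rate and available memory:

# /etc/sysctl.d/90-elastiflow.conf (example file name and values)
net.core.rmem_default=33554432
net.core.rmem_max=33554432

Apply the change with:

sudo sysctl --system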

If you haven’t yet, make sure Java is installed and the ELK key and repo are added to the package manager.

sudo yum -y install java-openjdk-devel java-openjdk

sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

cat <<EOF | sudo tee /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

Now we can install and enable Logstash to run on boot.

sudo yum install -y logstash

sudo systemctl enable logstash.service

Finally we will install or update the required plugins for Logstash.

sudo /usr/share/logstash/bin/logstash-plugin install logstash-codec-sflow
sudo /usr/share/logstash/bin/logstash-plugin update logstash-codec-netflow
sudo /usr/share/logstash/bin/logstash-plugin update logstash-input-udp
sudo /usr/share/logstash/bin/logstash-plugin update logstash-input-tcp
sudo /usr/share/logstash/bin/logstash-plugin update logstash-filter-dns
sudo /usr/share/logstash/bin/logstash-plugin update logstash-filter-geoip
sudo /usr/share/logstash/bin/logstash-plugin update logstash-filter-translate
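
If you want a quick sanity check that the plugins are present before moving on, you can list what is installed (optional):

sudo /usr/share/logstash/bin/logstash-plugin list | grep -E 'netflow|sflow|udp|tcp|dns|geoip|translate'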

Installing Elastiflow

Ok, now let's download all the files we will need for Elastiflow onto the Logstash server and get it configured. First let's create a temp directory to download the Elastiflow files into. We will be installing release 4.0.0-beta1. You can see all official releases here:

https://github.com/robcowart/elastiflow/releases

sudo mkdir temp
cd temp
sudo wget https://github.com/robcowart/elastiflow/archive/v4.0.0-beta1.tar.gz

You should now have a file called v4.0.0-beta1.tar.gz in the temp directory. Go ahead and extract it into the same directory.

sudo tar xvf v4.0.0-beta1.tar.gz

You should now have a directory named "elastiflow-4.0.0-beta1".

Before we go any further, I recommend you read over the install instructions for the release you are installing (INSTALL.md in the extracted directory) just to familiarize yourself with the process.

Now that we have most of the files we need, let's first copy the global variable file for Elastiflow to the logstash.service.d directory under systemd.

sudo cp -a elastiflow-4.0.0-beta1/logstash.service.d/. /etc/systemd/system/logstash.service.d/

Now let's copy the Elastiflow config files to the Logstash directory.

sudo cp -a elastiflow-4.0.0-beta1/logstash/elastiflow/. /etc/logstash/elastiflow/

Now we need to add the Elastiflow pipeline to the pipelines.yml file. This tells Logstash where all of the config files for the custom Elastiflow processing pipeline live. Make sure to comment out the default pipeline, as it is not needed.

sudo nano /etc/logstash/pipelines.yml

- pipeline.id: elastiflow
  path.config: "/etc/logstash/elastiflow/conf.d/*.conf"

That is what the file should end up looking like (with the default pipeline commented out). The formatting matters in this YAML file, so include the spacing exactly as shown above.

Now let's set some of the global variables in the elastiflow.conf file.

sudo nano /etc/systemd/system/logstash.service.d/elastiflow.conf

First we will turn on DNS name resolution processing by setting the DNS resolution variable to true and entering the IP of your DNS resolver in the nameserver variable.

Next we will set ELASTIFLOW_ES_HOST to the IP of our Elasticsearch instance. If we were running the ELK stack on a single host we would accept the default of 127.0.0.1:9200 (localhost).
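
For reference, the relevant lines in my elastiflow.conf end up looking roughly like the ones below. This is only a sketch: the exact variable names and defaults come from the drop-in file you copied from the release, so edit the existing entries rather than pasting these in, and 10.0.0.53 is just a placeholder for your own DNS resolver.

Environment="ELASTIFLOW_RESOLVE_IP2HOST=true"
Environment="ELASTIFLOW_NAMESERVER=10.0.0.53"
Environment="ELASTIFLOW_ES_HOST=10.0.0.2:9200"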

That is all that needs to be modified, so save the file. We are now ready to create the Logstash startup script, which will take the variables we just edited and make them available to the Elastiflow pipeline configs. If you make more changes to elastiflow.conf later, make sure to rerun this script and restart Logstash.

sudo /usr/share/logstash/bin/system-install

One last thing is to modify the jvm.options config file to give Logstash enough RAM to operate efficiently. Since we are doing DNS lookups, 4 GB is a good value to use. Remember to keep this value at around 50% of your system RAM.

sudo nano /etc/logstash/jvm.options
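
For example, following the 4 GB suggestion above (and matching min and max as we did for Elasticsearch):

-Xms4G
-Xmx4G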

And now we can start Logstash:

sudo systemctl daemon-reload
sudo systemctl start logstash.service
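
Logstash can take a minute or two to build and start the Elastiflow pipeline. You can watch its progress in the log file (the same log used for troubleshooting later in this guide):

sudo tail -f /var/log/logstash/logstash-plain.log

Once you see a message indicating the elastiflow pipeline has started and the inputs are listening, it is ready to receive flows.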

Lastly, we need to import the index pattern and the dashboards Kibana will use to display the flow data. These are all located in a single file within the elastiflow-4.0.0-beta1 directory and can be imported through the Kibana GUI. In Kibana, go to the Management -> Saved Objects page and import the elastiflow.kibana.7.5.x.ndjson file located under elastiflow-4.0.0-beta1/kibana/.

You should now be able to start sending NetFlow/sFlow/IPFIX data on port 2055 to the IP of your Logstash server, and in a few minutes you should see data when you click on the Dashboards icon on the left side of the page.

If you don't see any data after a while, check the logs under /var/log/logstash/logstash-plain.log to see if there are any issues. If you are receiving NetFlow v9, it may take a few minutes (depending on your exporter's template update rate) for Logstash to receive a template so it can decode the flows.
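
It is also worth checking whether the kernel is dropping flow packets before Logstash can read them. The UDP statistics from netstat (part of the net-tools package) will show this; a steadily climbing "receive buffer errors" counter while flows are arriving usually means the UDP receive buffer or Logstash worker count needs tuning:

netstat -su | grep -i error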
