Forward syslog to Flume with rsyslog

Introduction

Syslog

In computing, syslog is a standard for message logging. It allows separation of the software that generates messages, the system that stores them, and the software that reports and analyzes them. Each message is labeled with a facility code, indicating the software type generating the message, and assigned a severity label.
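On the wire, the facility and severity are combined into a single priority value, PRI = facility * 8 + severity, which appears in angle brackets at the start of each message. A minimal shell sketch of that arithmetic follows; the function names are illustrative and not part of any syslog tool.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of any syslog tool).

# Combine a facility code and a severity code into a PRI value.
# $1 = facility (0-23), $2 = severity (0-7)
syslog_pri() {
    echo $(( $1 * 8 + $2 ))
}

# Split a PRI value back into "facility severity".
# $1 = PRI value
syslog_decode_pri() {
    echo "$(( $1 / 8 )) $(( $1 % 8 ))"
}

syslog_pri 4 6          # facility auth (4), severity informational (6)
syslog_decode_pri 13    # 13 decodes to facility user (1), severity notice (5)
```

A message that starts with <13>, for example, was logged with facility user and severity notice.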

Computer system designers may use syslog for system management and security auditing as well as general informational, analysis, and debugging messages. A wide variety of devices, such as printers, routers, and message receivers across many platforms use the syslog standard. This permits the consolidation of logging data from different types of systems in a central repository. Implementations of syslog exist for many operating systems.

Benefits of syslog

Helps identify the root cause of problems
Reduces overall downtime by making troubleshooting faster with all the logs at hand
Improves incident management through active detection of issues
Allows incidents to be detected and resolved automatically
Simplifies the architecture with different severity levels such as error, info, and warning

In this post, I'll use HDFS as the central repository for syslogs and Hive as the analytical platform.

Rsyslog

Rsyslog is an open-source software utility used on UNIX and Unix-like computer systems for forwarding log messages in an IP network. It implements the basic syslog protocol and extends it with content-based filtering, flexible configuration options, and features such as TCP transport.

Note:
Please review the post Streaming Twitter Data using Apache Flume before this one for an introduction to Flume and its architecture.

Flume's syslog TCP source

The Syslog TCP source provides an endpoint for messages over TCP, allowing for a larger payload size and TCP retry semantics that should be used for any reliable inter-server communications.

To create a Syslog TCP source, set the type property to syslogtcp.
vi /usr/hadoopsw/apache-flume-1.7.0-bin/conf/syslog.conf

# Naming the components on the current agent.
agent.sources=SourceSyslog
agent.channels=ChannelMem
agent.sinks=SinkHDFS

# Describing/Configuring the source
#agent.sources.SourceSyslog.type=syslogudp
agent.sources.SourceSyslog.type=syslogtcp
agent.sources.SourceSyslog.host=0.0.0.0
agent.sources.SourceSyslog.port=12345
agent.sources.SourceSyslog.keepFields=true

# Describing/Configuring the channel
agent.channels.ChannelMem.type=memory
agent.channels.ChannelMem.capacity = 10000
agent.channels.ChannelMem.transactionCapacity = 1000

# Describing/Configuring the sink
agent.sinks.SinkHDFS.type=hdfs
agent.sinks.SinkHDFS.hdfs.path = /flume/syslogs/
agent.sinks.SinkHDFS.hdfs.fileType = DataStream
agent.sinks.SinkHDFS.hdfs.writeFormat = Text
agent.sinks.SinkHDFS.hdfs.batchSize = 1000
agent.sinks.SinkHDFS.hdfs.rollSize = 0
agent.sinks.SinkHDFS.hdfs.rollCount = 10000

# Binding the source and sink to the channel
agent.sources.SourceSyslog.channels = ChannelMem
agent.sinks.SinkHDFS.channel = ChannelMem

The keepFields property tells the source to keep the syslog fields (priority, timestamp, and hostname) in the event body. By default they are removed from the body, because they become Flume header values.

The memory channel's capacity property sets the maximum number of events the channel can hold (10000 here). The transactionCapacity property sets the maximum number of events that a source's ChannelProcessor, the component responsible for moving data from the source to the channel, can write in a single transaction (a put). It is also the maximum number of events that the SinkProcessor, the component responsible for moving data from the channel to the sink, can read in a single transaction (a take).
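Because a sink takes events from the channel in transactions, it is generally safest to keep hdfs.batchSize at or below transactionCapacity, and transactionCapacity at or below capacity. The sketch below checks those relations in a configuration file. The helper names get_prop and check_flume_sizes are hypothetical, and the script assumes each property appears as a simple key = value line with a numeric value.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric value of one "key = value" line.
# $1 = properties file, $2 = full property key
get_prop() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | tail -n 1
}

# Check the sizing relations for one channel/sink pair.
# $1 = properties file, $2 = agent name, $3 = channel name, $4 = sink name
check_flume_sizes() {
    cap=$(get_prop "$1" "$2.channels.$3.capacity")
    txn=$(get_prop "$1" "$2.channels.$3.transactionCapacity")
    batch=$(get_prop "$1" "$2.sinks.$4.hdfs.batchSize")
    if [ -z "$cap" ] || [ -z "$txn" ] || [ -z "$batch" ]; then
        echo "missing property (Flume defaults apply; not checked here)"
        return 2
    fi
    if [ "$txn" -gt "$cap" ]; then
        echo "transactionCapacity ($txn) exceeds capacity ($cap)"
        return 1
    fi
    if [ "$batch" -gt "$txn" ]; then
        echo "hdfs.batchSize ($batch) exceeds transactionCapacity ($txn)"
        return 1
    fi
    echo "sizes ok: batchSize=$batch <= transactionCapacity=$txn <= capacity=$cap"
}

# Usage, with the file and component names from this post:
# check_flume_sizes /usr/hadoopsw/apache-flume-1.7.0-bin/conf/syslog.conf agent ChannelMem SinkHDFS
```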

Remember that if you increase capacity property value, you will most likely have to increase your Java heap space using the -Xmx, and optionally -Xms, parameters.
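One common place to set these parameters is conf/flume-env.sh, which the flume-ng script reads from the configuration directory passed with -c. A minimal sketch follows; the heap sizes are placeholders for illustration, not recommendations, and should be tuned to your event size and channel capacity.

```shell
# conf/flume-env.sh (read by flume-ng at startup)
# Assumed values for illustration only: 512 MB initial heap, 2 GB maximum heap.
export JAVA_OPTS="-Xms512m -Xmx2048m"
```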

Run Flume agent

Create the HDFS folders referenced in the Flume configuration.

[hdpsysuser@te1-hdp-rp-nn01 ~]$ hdfs dfs -mkdir -p /flume/syslogs
[hdpsysuser@te1-hdp-rp-nn01 ~]$ hdfs dfs -chmod -R 777 /flume

Now start the Flume agent with the command below.

flume-ng agent -n agent -c conf -f $FLUME_HOME/conf/syslog.conf -Dflume.root.logger=INFO,console

Test with nc

Now test the connection to your Flume agent using nc, the command that runs netcat, a simple Unix utility that reads and writes data across network connections over TCP or UDP.

[hdpclient@en01 ~]$ nc localhost 12345
Event-1

Type the line above and press Enter. The line should be transported to the Flume agent, which writes it to the HDFS location specified in the configuration file.
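A bare string such as Event-1 is not a complete syslog message, so the source may treat it as malformed even though it still arrives. To exercise the agent with something closer to what rsyslog sends, you can build a line with a priority, timestamp, hostname, and tag. The sketch below assumes the traditional BSD syslog layout; build_syslog_line and the flume-test tag are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): build one BSD-syslog style line.
# $1 = PRI value, $2 = tag (program name), $3 = message text
build_syslog_line() {
    # LC_ALL=C keeps the month abbreviation in English, as the format expects.
    ts=$(LC_ALL=C date '+%b %e %H:%M:%S')
    printf '<%d>%s %s %s: %s\n' "$1" "$ts" "$(uname -n)" "$2" "$3"
}

# 13 = facility user (1) * 8 + severity notice (5)
build_syslog_line 13 flume-test "Event-1"

# To send the line to the agent (host and port as used in this post):
# build_syslog_line 13 flume-test "Event-1" | nc localhost 12345
```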

Verify that the data reached its final destination (HDFS)

[hdpclient@en01 ~]$ hdfs dfs -cat /flume/syslogs/FlumeData.1497263656231
Event-1

Use Hive to analyze (Optional)

create database flume;
use flume;
create external table syslog(log_line string) location '/flume/syslogs';

select log_line
from syslog
--where log_line like '%user root%'
--where log_line like '%Invalid user%'
where lower(log_line) like '%authentication failure%'
limit 5;

Configure remote logging with rsyslog (Unix Client)

OS: Redhat 7.x

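Configure rsyslog to send syslog events to another server over TCP. In a forwarding rule, a single @ sends over UDP and a double @@ sends over TCP.

1- Add a line like the following to the RULES section of /etc/rsyslog.conf, replacing the host and port with those of your Flume agent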

# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
#*.* @remote-host:514
*.*         @@10.10.10.1:514

vi /etc/rsyslog.conf

#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages

# The authpriv file has restricted access.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* -/var/log/maillog

# Log cron stuff
cron.* /var/log/cron

# Everybody gets emergency messages
*.emerg :omusrmsg:*

# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler

# Save boot messages also to boot.log
local7.* /var/log/boot.log

# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
#*.* @remote-host:514
*.* @@192.168.44.134:12345

Note: You can add more than one host/agent in rsyslog.conf; each message will be pushed to every server listed in the configuration.

*.* @@192.168.44.138:12346
*.* @@192.168.44.138:12345

2- Restart rsyslog.

[hdpsysuser@te1-hdp-rp-dn04 ~]$ service rsyslog restart
Redirecting to /bin/systemctl restart rsyslog.service
==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===
Authentication is required to manage system services or units.
Authenticating as: hdpsysuser
Password:
==== AUTHENTICATION COMPLETE ===

3- Test the configuration using the logger command, which makes entries in the system log.

[root@te1-hdp-rp-dn04 ~]# logger Test from Data Node - 4

Check your message

[root@te1-hdp-rp-dn04 ~]# tail /var/log/messages
Jun 12 14:18:59 te1-hdp-rp-dn04 fprintd: ** Message: entering main loop
Jun 12 14:19:19 te1-hdp-rp-dn04 su: (to root) hdpsysuser on pts/1
Jun 12 14:19:19 te1-hdp-rp-dn04 dbus[950]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jun 12 14:19:19 te1-hdp-rp-dn04 dbus-daemon: dbus[950]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
Jun 12 14:19:19 te1-hdp-rp-dn04 dbus[950]: [system] Successfully activated service 'org.freedesktop.problems'
Jun 12 14:19:19 te1-hdp-rp-dn04 dbus-daemon: dbus[950]: [system] Successfully activated service 'org.freedesktop.problems'
Jun 12 14:19:29 te1-hdp-rp-dn04 fprintd: ** Message: No devices in use, exit
Jun 12 14:19:39 te1-hdp-rp-dn04 hdpsysuser: Test from Data Node - 4
Jun 12 14:20:01 te1-hdp-rp-dn04 systemd: Started Session 11254 of user root.
Jun 12 14:20:01 te1-hdp-rp-dn04 systemd: Starting Session 11254 of user root.

4- Verify the flume agent's HDFS Location

[hdpclient@en01 ~]$ hdfs dfs -ls /flume/syslogs
Found 8 items
-rw-r--r-- 3 hdpclient supergroup 10 2017-06-12 13:46 /flume/syslogs/FlumeData.1497264415313
-rw-r--r-- 3 hdpclient supergroup 17 2017-06-12 13:50 /flume/syslogs/FlumeData.1497264697909
-rw-r--r-- 3 hdpclient supergroup 9 2017-06-12 13:51 /flume/syslogs/FlumeData.1497264761727
-rw-r--r-- 3 hdpclient supergroup 10 2017-06-12 13:55 /flume/syslogs/FlumeData.1497264970862
-rw-r--r-- 3 hdpclient supergroup 497 2017-06-12 14:14 /flume/syslogs/FlumeData.1497266092851
-rw-r--r-- 3 hdpclient supergroup 36 2017-06-12 14:16 /flume/syslogs/FlumeData.1497266217106
-rw-r--r-- 3 hdpclient supergroup 1272 2017-06-12 14:16 /flume/syslogs/FlumeData.1497266252522
-rw-r--r-- 3 hdpclient supergroup 176 2017-06-12 14:17 /flume/syslogs/FlumeData.1497266292306

Verify using browser

Check using Hive table

Query the Hive table created earlier.

Monitor Flume metrics

You can configure the Flume agent to start an HTTP server that outputs its metrics as JSON, which outside mechanisms can then query.

Start the Flume agent with these properties:
-Dflume.monitoring.type=http
-Dflume.monitoring.port=44444

flume-ng agent -n agent -c conf -f $FLUME_HOME/conf/syslog.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=http -Dflume.monitoring.port=44444

Now, when you go to http://SERVER_OR_IP:44444/metrics

you will see something like below

{"CHANNEL.ChannelMem":{"ChannelCapacity":"1000000","ChannelFillPercentage":"0.0","Type":"CHANNEL","EventTakeSuccessCount":"14","ChannelSize":"0","EventTakeAttemptCount":"47","StartTime":"1497273770141","EventPutAttemptCount":"14","EventPutSuccessCount":"14","StopTime":"0"},"SINK.SinkHDFS":{"ConnectionCreatedCount":"2","ConnectionClosedCount":"2","Type":"SINK","BatchCompleteCount":"0","BatchEmptyCount":"31","EventDrainAttemptCount":"14","StartTime":"1497273770143","EventDrainSuccessCount":"14","BatchUnderflowCount":"2","StopTime":"0","ConnectionFailedCount":"0"},"SOURCE.SourceSyslog":{"EventReceivedCount":"14","AppendBatchAcceptedCount":"0","Type":"SOURCE","EventAcceptedCount":"14","AppendReceivedCount":"0","StartTime":"1497273770202","AppendAcceptedCount":"0","OpenConnectionCount":"0","AppendBatchReceivedCount":"0","StopTime":"0"}}

The channel’s ChannelSize or ChannelFillPercentage metrics will give you a good idea whether data is coming in faster than it is going out. They also tell you whether the channel capacity is large enough to absorb your data volume during maintenance or outages.

Looking at the sink, EventDrainSuccessCount versus EventDrainAttemptCount will tell you how often output succeeds compared to the number of attempts. The ConnectionFailedCount metric is a good indicator of persistent connection problems.

A growing ConnectionCreatedCount metric can indicate that connections are dropping and reopening too often.
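These comparisons can be scripted. The rough sketch below pulls individual counters out of the JSON with grep and cut. It assumes the flat name/value layout shown in the sample output, and flume_metric is a hypothetical helper name. With several sinks or channels the first match wins, so a real JSON parser would be a safer choice there.

```shell
#!/bin/sh
# Hypothetical helper: read Flume metrics JSON on stdin and print the value
# of the first counter with the given name.
# $1 = metric name, e.g. EventDrainSuccessCount
flume_metric() {
    grep -o "\"$1\":\"[0-9.]*\"" | head -n 1 | cut -d'"' -f4
}

# Trimmed sample taken from the output shown in this post. Against a live
# agent you would fetch it instead, for example:
# json=$(curl -s http://SERVER_OR_IP:44444/metrics)
json='{"SINK.SinkHDFS":{"EventDrainAttemptCount":"14","EventDrainSuccessCount":"14","ConnectionFailedCount":"0"}}'

attempts=$(printf '%s' "$json" | flume_metric EventDrainAttemptCount)
drained=$(printf '%s' "$json" | flume_metric EventDrainSuccessCount)
failed=$(printf '%s' "$json" | flume_metric ConnectionFailedCount)

echo "drained $drained of $attempts events, $failed failed connections"
```

Run periodically, a widening gap between the first two numbers would suggest that the sink is struggling to write to HDFS.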

DBMentors - Inam Bukhari's Blog

Friday, June 23, 2017