You have log files from two or more applications, and you need to see them together. Viewing the data together in proper sequence will make it easier to correlate events, and listing them side-by-side in windows or tabs isn’t cutting it.
You need to merge log files by timestamps.
But just merging them by timestamp isn’t the only thing you need. Many log files have entries with more than one line, and not all of those lines have timestamps on them.
Merge Log Files by Timestamp
Let’s take a look at the simple case. We have two files from Linux's syslog daemon. One is the messages file and the other is the crontab log.
Here are four lines from the messages file:
Sep 4 00:00:08 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 120000ms. Sep 4 00:02:08 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 124910ms. Sep 4 00:04:13 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 109850ms. Sep 4 00:06:03 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 112380ms.
And here are five lines from cron:
Sep 4 00:01:01 ip–10–97–55–50 CROND[18843]: (root) CMD (run-parts /etc/cron.hourly) Sep 4 00:01:01 ip–10–97–55–50 run-parts(/etc/cron.hourly)[18843]: starting 0anacron Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Anacron started on 2018–09–04 Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Jobs will be executed sequentially< Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Normal exit (0 jobs run)
When we’re only dealing with ten lines of logs, it’s easy to see where the merge belongs. The five lines in the cron log belong between the first and second lines of the messages log.
But with a bigger dataset, we need a tool that can merge these two files on the date and the time. The good news is that Linux has a tool for this already.
Merge Log Files With Sort
The sort command can, as its name implies, sort input. We can stream both log files into sort and give it a hint on how to sort the two logs.
Let’s give it a try.
cat messages.log cron.log |sort –key=1,2 > merge.log
This creates a new file named merge.log. Here’s what it looks like:
Sep 4 00:00:08 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 120000ms. Sep 4 00:01:01 ip–10–97–55–50 CROND[18843]: (root) CMD (run-parts /etc/cron.hourly) Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Anacron started on 2018–09–04 Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Jobs will be executed sequentially< Sep 4 00:01:01 ip–10–97–55–50 anacron[18853]: Normal exit (0 jobs run) Sep 4 00:01:01 ip–10–97–55–50 run-parts(/etc/cron.hourly)[18843]: starting 0anacron Sep 4 00:02:08 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 124910ms. Sep 4 00:04:13 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 109850ms. Sep 4 00:06:03 ip–10–97–55–50 dhclient[2588]: XMT: Solicit on eth0, interval 112380ms.
It worked!
Let’s dissect that command.
cat messages.log cron.log |
Cat concatenates files. We used it to send both logs to standard output. In this case, it sent messages.log first and then cron.log.
The pipe | is what it sounds like. It’s a pipe between two programs. It sends the contents of the two files to the next part of the command. As we’ll see below, sort can accept a single filename on the command line. When we want to sort more than one file, we use a pipe to send the files on standard input.
sort –key=2,3 > merge.log
Sort receives the contents of two files and sorts them. Its output goes to the > redirect operator, which creates the new file.
The most important part of this command is –key=2,3. We used this to tell sort to sort its input using two fields and three of the files. For some reason, sort starts counting fields at one instead of zero.
So sort was able to merge the two files using the day of the month and the timestamp.
This is our easy case. These log files both had single line entries, and our dataset was for less than thirty days. So we don't have to worry about sorting by months.
Let’s look at something that’s a little more complicated.
Merge Log Files With Multiline Entries
Here are a couple of Java application logs that we would like to merge.
Here’s the first:
2018-09-06 15:20:40,980 [INFO] Heimdall main:26 [main] Fix Engine is starting. 2018-09-06 15:20:45,639 [ERROR] AcceptorFactory createSessionSettings:92 [main] Session settings: [default] SocketAcceptPort=7000 ConnectionType=acceptor ValidateUserDefinedFields=N ValidateLengthAndChecksum=N ValidateFieldsOutOfOrder=N 2018-09-06 15:20:50,645 [ERROR] AcceptorFactory getSessionSettings:123 [main] Second Session settings: [default] SocketAcceptPort=7000 ConnectionType=acceptor ValidateUserDefinedFields=N ValidateLengthAndChecksum=N ValidateFieldsOutOfOrder=N 2018-09-06 15:21:45,653 [INFO] ThreadedSocketAcceptor startSessionTimer:291 [main] SessionTimer started 2018-09-06 15:21:47,711 [INFO] NetworkingOptions logOption:119 [main] Socket option: SocketTcpNoDelay=true 2018-09-06 15:21:59,919 [INFO] SendMessageToSolace addSession:51 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Adding session: FIX.4.2:FOOU->TEST02 2018-09-06 15:22:59,920 [INFO] MessageClient openTopic:422 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Opening FOO/DEV/AMER/FixEngine/Admin/*/TEST02 2018-09-06 15:23:59,937 [ERROR] ConsumerNodeStatusHandler setStateUp:186 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Setting State up: TEST02 2018-09-06 15:24:03,962 [INFO] MessageClient openTopic:422 [stateHeartbeat] Opening FOO/DEV/AMER/State/Admin/Events 2018-09-06 15:25:00,536 [INFO] incoming messageReceived:146 [NioProcessor-2] FIX.4.2:FOOU->TEST02: 8=FIX.4.29=6235=149=TEST0256=FOOU34=252=20180906-15:21:00.528112=TEST10=198
This log has a lot of whitespace and entries that span multiple lines.
Here’s the other:
2018-09-06 15:20:43:031 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-10-5] Adding session: TEST02 at 1536243961031 2018-09-06 15:20:46:031 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-13-4] Adding session: TEST02 at 1536243961031 2018-09-06 15:23:15:032 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-7-5] Adding session: TEST02 at 1536243961032 2018-09-06 15:24:35:257 [INFO] com.foobar.atr.rest.controller.StatusController getSessionStatus():67 [http-nio-8010-exec-4] Received request a fix session, senderCompId:RBSG2 2018-09-06 15:27:30:691 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-10-5] Adding session: PLOP02 at 1536244050691
This log is more uniform, with entries that only span a single line.
When we merge these two files, we want the multiline log message to remain together. So, sort's numeric sorting won’t work. We need a tool that's capable of associating the lines without timestamps with the last line that has one.
Unfortunately, no command line tool does this. We’re going to have to write some code.
A Merging Algorithm
Here’s an algorithm for merging log files that have multiline entries.
First, we need to preprocess the log files.
- Scan the log file line by line until we reach the end.
- If a line has a timestamp, save it and print the last saved line to a new file.
- If a line has no timestamp, append it to the saved line, after replacing the new line with a special character
- Continue with step #1.
We could do this in memory, but what happens when we’re dealing with huge log files? We’ll save the preprocessed log entries to disk so that this tool will work on huge log files.
After we perform this on both files, we have a new one that is full of single line entries. We’ll use the sort command to sort it for us, rather than reinventing the wheel. Then, we’ll replace the special characters with new lines, and we have a merged log file.
And we’re done!
Let's do it.
Merge Log Files With Python
We’ll use python. It’s available on all systems, and it’s easy to write a cross-platform tool that manipulates text files. I wrote the code for this article with version 2.7.14. You can find the entire script here on Github.
First, we need to process our input files.
parser = argparse.ArgumentParser(description="Process input and output file names") parser.add_argument("-f", "--files", help="list of input files", required=True, nargs='+') parser.add_argument("-o", "--output", help="output file", required=True, type=argparse.FileType('w')) args = parser.parse_args() line_regex = re.compile("^[^0-90-90-90-9\-0-90-9\-0-90-9]") with open("tmp.log", "w") as out_file: for filename in args.files: lastline = "" with open(filename, "r") as in_file: for line in in_file: if line_regex.search(line): lastline = lastline.rstrip('\n') lastline += '\1' lastline += line else: out_file.write(lastline) lastline = line
We'll start by processing command line arguments. This script accepts two:
- -f is a comma-separated list of input files
- -o is the name of the file to write the output to
Argparse gives us a list from the arguments passed to -f and opens the output file for us, as we’ll see below.
Python Regular Expressions
Then we'll create a regular expression. Let’s take a close look at it since this is what you’ll need to change if your logs are formatted differently.
Here’s the whole expression:
^[^0-90-90-90-9\-0-90-9\-0-90-9]
The expression starts with a caret ^. This means the beginning of a line.
But then we have this: [^ ] with some characters in the middle. Square brackets with a caret at the beginning mean not.
So the expression means "if this is not at the beginning of the line."
The pattern we're matching is inside the brackets.
0–90–90–90–9\-0–90–9\-0–90–9
Each 0–9 corresponds to a numeral. Each \- is a dash. So it could be read like this: NNNN-NN-NN. It’s a pattern for the date we see at the beginning of each log entry.
So in English, the expression means “if the line does not begin with a date.”
If you need to process logs with a different format, you'll need to change this. There's a guide to python regular expressions here.
Sorting the Results
Now, we'll start the real work.
- Open a temporary file.
- Open the first log file.
- Join lines with no timestamp to their predecessors, as described above.
- Repeat this for each file passed on the command line.
For the third step, we'll chop the newline '\n' from the end of the last line we saved. Then we'll add an SOH ('\1') character and concatenate the lines. (I could've done this in one line, but I spelled it out to make it clear.)
We're replacing newlines '\n' with the SOH character instead of NULLs ('\0') because nulls would confuse python's string processing libraries and we'd lose data.
Finally, the result of this code is a file named tmp.log that contains the log files preprocessed to be one line per entry.
Let’s finish the job.
sorted_logs = check_output(["/usr/bin/sort", "--key=1,2", "tmp.log"]) os.remove("tmp.log") lines = sorted_logs.split('\n') for line in lines: newline = line.replace('\1', '\n') args.output.write(newline + "\n")
Check_output executes an external command and captures the output.
So we'll use it to run sort on our temporary file and return the results to us as a string. Then, we'll remove the temporary file.
We wouldn’t want to capture the result in memory with a large file, but to keep this post short, I cheated. An alternative is to send the output of sort to a file with the -o option and then open that file and remove the special characters.
Next, we'll split the output on the new lines into an array. Then we'll process that array and undo the special characters. We'll write each line to the file opened for us by argparse.
We’re done!
Let's run this script on two files:
./mergelogs.py -f foo.log bar.log -o output.log
And we'll see this.
2018-09-06 15:20:40,980 [INFO] Heimdall main:26 [main] Fix Engine is starting. 2018-09-06 15:20:43:031 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-10-5] Adding session: TEST02 at 1536243961031 2018-09-06 15:20:45,639 [ERROR] AcceptorFactory createSessionSettings:92 [main] Session settings: [default] SocketAcceptPort=7000 ConnectionType=acceptor ValidateUserDefinedFields=N ValidateLengthAndChecksum=N ValidateFieldsOutOfOrder=N 2018-09-06 15:20:46:031 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-13-4] Adding session: TEST02 at 1536243961031 2018-09-06 15:20:50,645 [ERROR] AcceptorFactory getSessionSettings:123 [main] Second Session settings: [default] SocketAcceptPort=7000 ConnectionType=acceptor ValidateUserDefinedFields=N ValidateLengthAndChecksum=N ValidateFieldsOutOfOrder=N 2018-09-06 15:21:45,653 [INFO] ThreadedSocketAcceptor startSessionTimer:291 [main] SessionTimer started 2018-09-06 15:21:47,711 [INFO] NetworkingOptions logOption:119 [main] Socket option: SocketTcpNoDelay=true 2018-09-06 15:21:59,919 [INFO] SendMessageToSolace addSession:51 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Adding session: FIX.4.2:FOOU->TEST02 2018-09-06 15:22:59,920 [INFO] MessageClient openTopic:422 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Opening FOO/DEV/AMER/FixEngine/Admin/*/TEST02 2018-09-06 15:23:15:032 [INFO] com.foobar.atr.rest.controller.SessionStatusCache addSessionStatus():28 [lettuce-nioEventLoop-7-5] Adding session: TEST02 at 1536243961032 2018-09-06 15:23:59,937 [ERROR] ConsumerNodeStatusHandler setStateUp:186 [QF/J Session dispatcher: FIX.4.2:FOOU->TEST02] Setting State up: TEST02 2018-09-06 15:24:03,962 [INFO] MessageClient openTopic:422 [stateHeartbeat] Opening FOO/DEV/AMER/State/Admin/Events 2018-09-06 15:24:35:257 [INFO] com.foobar.atr.rest.controller.StatusController getSessionStatus():67 [http-nio-8010-exec-4] Received request a fix session, senderCompId:RBSG2
Log Files, Merged
In this tutorial, we covered how to merge log files, looking at a straightforward case and then a more complicated situation. The code for this is available on Github, and you're free to download and modify it for your individual needs.