The split command in Linux allows you to split files into multiple files. There are several ways you can customize parameters for your given application. I’ll show you some examples of the split command that will help you understand its usage.
To help you learn about the split command I am using a relatively large text file containing 17170 lines and 1.4 MB in size. You can download a copy of this file from the GitHub link below.
Note that I will not directly display output in these examples because of the large file sizes. I will use the ll and wc commands to highlight file changes.
I advise you to have a quick look at the wc command to understand the output of the split command examples.
Examples of the split command in Linux
This is the syntax of the split command:
split [options] filename [prefix]
Let’s see how to use it to split files in Linux.
1. Split files into multiple files
By default, the split command starts a new file every 1000 lines. If no prefix is specified, it uses ‘x’. The letters that follow enumerate the files, so xaa comes first, then xab, and so on.
Let’s split the sample log file:
split someLogFile.log
If you use the ls command, you can see multiple new files in your directory.
chris@discodingo:~/Documents$ ls
someLogFile.log xab xad xaf xah xaj xal xan xap xar
xaa xac xae xag xai xak xam xao xaq
You can use wc to quickly check the line counts after splitting.
chris@discodingo:~/Documents$ wc -l xaa xaq xar
1000 xaa
1000 xaq
170 xar
Remember from earlier that our initial file had 17,170 lines. So we can see that split has done what we expected by creating 18 new files: 17 of them hold 1000 lines each, and the last one holds the remaining 170 lines.
Another way to demonstrate what is happening is to run the command with the verbose option. If you’re unfamiliar with verbose, you are missing out! It provides more detailed feedback about what your system is doing, and it is available with many commands.
split someLogFile.log --verbose
You can see what’s going on with your command on the display:
creating file 'xaa'
creating file 'xab'
creating file 'xac'
creating file 'xad'
creating file 'xae'
creating file 'xaf'
creating file 'xag'
creating file 'xah'
creating file 'xai'
creating file 'xaj'
creating file 'xak'
creating file 'xal'
creating file 'xam'
creating file 'xan'
creating file 'xao'
creating file 'xap'
creating file 'xaq'
creating file 'xar'
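If you don't have the sample log at hand, the default behavior can be tried on a throwaway file. This is only a sketch: demo.log is a made-up name, the data is generated with seq, and I am assuming GNU coreutils as found on most Linux systems.

```shell
# Work in a scratch directory so nothing in your current directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the log file: 2170 numbered lines.
seq 1 2170 > demo.log

# No options: 1000 lines per piece, prefix 'x', suffixes aa, ab, ac.
split demo.log

# Show the line count of each piece.
wc -l xaa xab xac
```

The first two pieces should hold 1000 lines each and the third the remaining 170, which mirrors the 17 full files plus one partial file seen above.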
2. Split files into multiple files with specific line numbers
I understand that you might not want files of 1000 lines each. You can change this behavior with the -l option.
When this is added, you can specify how many lines you want in each of the new files.
split someLogFile.log -l 500
As you can guess, now the split files have 500 lines each, except the last one.
chris@discodingo:~/Documents$ wc -l xbh xbi
500 xbh
170 xbi
Now you have about twice as many files (35 instead of 18), with half as many lines in each one.
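The number of pieces is simply the line count divided by the -l value, rounded up. Here is a small sketch that checks the arithmetic on a generated file with the same line count as my sample. The name demo.log and the seq data are my own stand-ins, not the real log.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Generated file with 17170 lines, the same count as the sample log.
seq 1 17170 > demo.log

# 500 lines per piece.
split -l 500 demo.log

# 17170 / 500 = 34.34, so 35 pieces are expected (34 full, 1 partial).
pieces=$(ls x?? | wc -l)
last=$(ls x?? | tail -n 1)
echo "$pieces pieces, last one is $last with $(wc -l < "$last") lines"
```

I put the option before the file name here. GNU split accepts either order, but options first is the form shown in the syntax line at the top.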
3. Split the files into n number of files
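The -n option makes it easy to split a file into a designated number of pieces, or chunks. You set how many files you want by adding an integer value after -n. Keep in mind that the chunks are divided by size rather than by line count, so a line can be cut in two at a chunk boundary.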
split someLogFile.log -n 15
Now you can see that there are 15 new files.
chris@discodingo:~/Documents$ ls
someLogFile.log xaa xab xac xad xae xaf xag xah xai xaj xak xal xam xan xao
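If cutting a line in half would be a problem, GNU split also accepts the form -n l/N, which still produces N pieces but only breaks at line ends. The sketch below compares the two forms on a generated file. The prefixes bytes_ and lines_ and the file demo.log are names I made up for the illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 1000 > demo.log

# -n 4 divides by byte count, so a piece may end in the middle of a line.
split -n 4 demo.log bytes_

# -n l/4 (GNU split) also makes 4 pieces but only cuts at line boundaries.
split -n l/4 demo.log lines_

# Byte counts of every piece from both runs.
wc -c bytes_* lines_*
```

Both sets of pieces rejoin into the original file. The difference only matters if you want to read or process each piece on its own.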
4. Split files with custom name prefix
What if you want to use split but keep the original name of your file, or make a new name altogether, instead of using ‘x’?
You may remember seeing the prefix as part of the syntax described in the beginning of the article. You can write your own custom file name after the source file.
split someLogFile.log someSeparatedLogFiles.log_
Here are the split files with names starting with the given prefix.
chris@discodingo:~/Documents$ ls
someLogFile.log someSeparatedLogFiles.log_aj
someSeparatedLogFiles.log_aa someSeparatedLogFiles.log_ak
someSeparatedLogFiles.log_ab someSeparatedLogFiles.log_al
someSeparatedLogFiles.log_ac someSeparatedLogFiles.log_am
someSeparatedLogFiles.log_ad someSeparatedLogFiles.log_an
someSeparatedLogFiles.log_ae someSeparatedLogFiles.log_ao
someSeparatedLogFiles.log_af someSeparatedLogFiles.log_ap
someSeparatedLogFiles.log_ag someSeparatedLogFiles.log_aq
someSeparatedLogFiles.log_ah someSeparatedLogFiles.log_ar
someSeparatedLogFiles.log_ai
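The prefix is just the start of the output path, so it can also point into another directory. A minimal sketch, assuming a hypothetical parts directory and a generated demo.log:

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 2500 > demo.log

# split does not create the target directory for you.
mkdir parts

# The prefix includes the directory, so the pieces land inside parts/.
split demo.log parts/demo.log_

ls parts
```

This keeps the pieces out of the way of the original file, which is handy when you later rejoin them with a wildcard.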
5. Split and Specify Suffix Length
Split features a default suffix length of 2 [aa, ab, etc.]. This will change automatically as the number of files increases, but if you would like to manually change it, that is possible too. So let’s say you want your files to be named something like someSeparatedLogFiles.log_aaaab.
How can you do this? The -a option allows you to specify the length of the suffix.
split someLogFile.log someSeparatedLogFiles.log_ -a 5
And here are the split files:
chris@discodingo:~/Documents$ ls
someLogFile.log someSeparatedLogFiles.log_aaaae someSeparatedLogFiles.log_aaaaj someSeparatedLogFiles.log_aaaao
someSeparatedLogFiles.log_aaaaa someSeparatedLogFiles.log_aaaaf someSeparatedLogFiles.log_aaaak someSeparatedLogFiles.log_aaaap
someSeparatedLogFiles.log_aaaab someSeparatedLogFiles.log_aaaag someSeparatedLogFiles.log_aaaal someSeparatedLogFiles.log_aaaaq
someSeparatedLogFiles.log_aaaac someSeparatedLogFiles.log_aaaah someSeparatedLogFiles.log_aaaam someSeparatedLogFiles.log_aaaar
someSeparatedLogFiles.log_aaaad someSeparatedLogFiles.log_aaaai someSeparatedLogFiles.log_aaaan
6. Split with numeric order suffix
Up to this point, you have seen your files separated using different letter combinations. Personally, I find it much easier to distinguish files using numbers.
Let’s keep the suffix length from the previous example, but change the alphabetical organization to numeric with the option -d.
split someLogFile.log someSeparatedLogFiles.log_ -a 5 -d
So now you will have split files with numerical suffixes.
chris@discodingo:~/Documents$ ls
someLogFile.log someSeparatedLogFiles.log_00004 someSeparatedLogFiles.log_00009 someSeparatedLogFiles.log_00014
someSeparatedLogFiles.log_00000 someSeparatedLogFiles.log_00005 someSeparatedLogFiles.log_00010 someSeparatedLogFiles.log_00015
someSeparatedLogFiles.log_00001 someSeparatedLogFiles.log_00006 someSeparatedLogFiles.log_00011 someSeparatedLogFiles.log_00016
someSeparatedLogFiles.log_00002 someSeparatedLogFiles.log_00007 someSeparatedLogFiles.log_00012 someSeparatedLogFiles.log_00017
someSeparatedLogFiles.log_00003 someSeparatedLogFiles.log_00008 someSeparatedLogFiles.log_00013
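Two related GNU long options can be useful here: --numeric-suffixes=FROM starts the numbering at a value other than 0, and --additional-suffix appends a fixed ending to every piece. The sketch below combines them. The prefix part_ and the file demo.log are placeholders of mine, and these long options may not exist in non-GNU versions of split.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 2500 > demo.log

# 1000 lines per piece, 3-digit numeric suffix starting at 1,
# and a .log extension kept on every piece.
split -l 1000 -a 3 --numeric-suffixes=1 --additional-suffix=.log demo.log part_

ls part_*
```

Keeping the extension means editors and other tools still recognize the pieces as log files.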
7. Append hex suffixes to split files
Another option for suffix creation is the built-in hex suffix, which counts in hexadecimal using the digits 0-9 and the letters a-f.
For this example, I will combine a few things I’ve already shown you. I will split the file using my own prefix. I chose an underscore for readability purposes.
I used the -x option to create a hex suffix. Then I split the file into 50 chunks and gave the suffix a length of 6.
split someLogFile.log _ -x -n50 -a6
And here is the outcome of the above command:
chris@discodingo:~/Documents$ ls
_000000 _000003 _000006 _000009 _00000c _00000f _000012 _000015 _000018 _00001b _00001e _000021 _000024 _000027 _00002a _00002d _000030
_000001 _000004 _000007 _00000a _00000d _000010 _000013 _000016 _000019 _00001c _00001f _000022 _000025 _000028 _00002b _00002e _000031
_000002 _000005 _000008 _00000b _00000e _000011 _000014 _000017 _00001a _00001d _000020 _000023 _000026 _000029 _00002c _00002f someLogFile.log
8. Split files into multiple files of specific size
It’s also possible to use file size to break up files in split. Maybe you need to send a large file over a size-capped network as efficiently as possible. You can specify the exact size for your requirements.
The syntax can get a little tricky as we continue to add options. So, I will explain how the -b option works before showing the example.
When you want to create files of a specific size, use the -b option. You can then write nK[B], nM[B], nG[B], where n is the value of your file size. Without the B, the units are powers of 1024: K is kibi (1024 bytes), M is mebi, G is gibi, and so on. With the B, the units are powers of 1000: KB is kilo (1000 bytes), MB is mega, and so on.
It may look like there is a lot going on, but it’s not that complex when you break it down. The command specifies the source file, the destination filename prefix, a numeric suffix, and separation by file size of 128KB (128,000 bytes).
split someLogFile.log someSeparatedLogFiles.log_ -d -b 128KB
Here are the split files:
chris@discodingo:~/Documents$ ls
someLogFile.log someSeparatedLogFiles.log_02 someSeparatedLogFiles.log_05 someSeparatedLogFiles.log_08
someSeparatedLogFiles.log_00 someSeparatedLogFiles.log_03 someSeparatedLogFiles.log_06 someSeparatedLogFiles.log_09
someSeparatedLogFiles.log_01 someSeparatedLogFiles.log_04 someSeparatedLogFiles.log_07 someSeparatedLogFiles.log_10
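You can verify the result with the wc command. The pattern used here matches the first ten files only; the last file, someSeparatedLogFiles.log_10, holds the bytes that are left over.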
chris@discodingo:~/Documents$ wc someSeparatedLogFiles.log_0*
1605 4959 128000 someSeparatedLogFiles.log_00
1605 4969 128000 someSeparatedLogFiles.log_01
1605 4953 128000 someSeparatedLogFiles.log_02
1605 4976 128000 someSeparatedLogFiles.log_03
1605 4955 128000 someSeparatedLogFiles.log_04
1605 4975 128000 someSeparatedLogFiles.log_05
1605 4966 128000 someSeparatedLogFiles.log_06
1605 4964 128000 someSeparatedLogFiles.log_07
1605 4968 128000 someSeparatedLogFiles.log_08
1605 4959 128000 someSeparatedLogFiles.log_09
16050 49644 1280000 total
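The difference between K and KB is easy to miss, so here is a small sketch that shows both on 3000 bytes of filler data. The file demo.bin and the prefixes kibi_ and kilo_ are names I chose for the illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# 3000 bytes of filler (a stand-in for a real file).
head -c 3000 /dev/zero > demo.bin

split -b 1K  demo.bin kibi_   # 1K  = 1024 bytes per piece
split -b 1KB demo.bin kilo_   # 1KB = 1000 bytes per piece

# Byte counts of every piece from both runs.
wc -c kibi_* kilo_*
```

With 1KB the file divides into three equal pieces, while with 1K the first two pieces take 1024 bytes each and the third gets what is left.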
9. Split files into multiple files of ‘at most’ size n
If you want to split a file into pieces of roughly the same size but preserve the line structure, this might be the best choice for you. With -C, you can specify a maximum size. The program then splits the file at line boundaries, so every piece contains only complete lines.
split someLogFile.log someNewLogFiles.log_ -d -C 1MB
You can see in the output that the first split file is nearly 1MB in size, whereas the rest of the content is in the second file.
chris@discodingo:~/Documents$ ll
total 2772
drwxr-xr-x 2 chris chris 81920 Jul 24 22:01 ./
drwxr-xr-x 19 chris chris 4096 Jul 23 22:23 ../
-rw-r--r-- 1 chris chris 1369273 Jul 20 17:52 someLogFile.log
-rw-r--r-- 1 chris chris 999997 Jul 24 22:01 someNewLogFiles.log_00
-rw-r--r-- 1 chris chris 369276 Jul 24 22:01 someNewLogFiles.log_01
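To see what -C changes compared to -b, you can run both with the same size on a small generated file and look at how the first piece ends. This is a sketch under my own assumptions: demo.log is generated with seq, and 998 is a size I picked on purpose so that -b lands in the middle of a line.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 1000 > demo.log

split -b 998 demo.log cut_     # exactly 998 bytes per piece, lines may be cut
split -C 998 demo.log whole_   # at most 998 bytes per piece, whole lines only

# Compare how the first piece of each run ends.
tail -n 1 cut_aa
echo
tail -n 1 whole_aa

# Sizes of the first pieces: -C gives up a few bytes to keep lines intact.
wc -c cut_aa whole_aa
```

The piece made with -b stops partway through a number, while the piece made with -C is slightly smaller and ends with a complete line.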
Bonus Tip: Rejoining split files
This isn’t part of the split command, but it might be helpful for new users. Suppose your directory contains only the split files:
chris@discodingo:~/Documents$ ls
xaa xab xac xad xae xaf xag xah xai xaj xak xal xam xan xao xap xaq xar
You can use another command to rejoin those files and create a replica of the complete document. The cat command is short for concatenate, which is just a fancy word that means “join items together”. Since all of the files begin with the letter ‘x’, the pattern x* matches every one of them, and the shell hands them to cat in alphabetical order, which is the same order split created them in.
chris@discodingo:~/Documents$ cat x* > recoveredLogFile.log
chris@discodingo:~/Documents$ ls
recoveredLogFile.log xab xad xaf xah xaj xal xan xap xar
xaa xac xae xag xai xak xam xao xaq
As you can see, our recreated file has the same number of lines as our original.
wc -l recoveredLogFile.log
17170 recoveredLogFile.log
Our formatting (including the number of lines) is preserved in the file created.
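A matching line count is a good sign, but it doesn't prove the contents are the same. If you want a stricter check, cmp compares two files byte for byte. Here is a sketch of the full round trip on a generated file. The names original.log, rejoined.log and the prefix piece_ are placeholders of mine.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > original.log

# Split into pieces, then join them again in sorted order.
split original.log piece_
cat piece_* > rejoined.log

# cmp -s prints nothing and succeeds only when the files match byte for byte.
if cmp -s original.log rejoined.log; then
    echo "files are identical"
else
    echo "files differ"
fi
```

Using a distinct prefix such as piece_ also makes sure the wildcard only picks up the split files and nothing else in the directory.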
If you’re new to Linux, I hope this tutorial helped you understand the split command. If you are more experienced, tell us your favorite way to use split in the comments below!