join Command Cheat Sheet

The join command joins lines of two files on a common field. By default it behaves like a relational database INNER JOIN for text files.

Important Prerequisite: Both input files MUST be sorted on the join field for join to work correctly.
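
A common way to trip over this prerequisite is numeric keys: join expects the lexical order produced by a plain sort, not the numeric order of sort -n. The following is a minimal sketch using hypothetical data (nums.txt and its contents are made up for illustration) that checks the order with sort -c and then re-sorts on field 1. It pins LC_ALL=C on the assumption that sort and join should share one collation.

```shell
export LC_ALL=C   # assumption: use the same collation for sort and join

tmp=$(mktemp -d)
# Hypothetical sample data: keys 2 and 10, in numeric order.
printf '2 two\n10 ten\n' > "$tmp/nums.txt"

# sort -c only checks the order; it exits non-zero if the file is not sorted on the key.
if sort -c -k 1,1 "$tmp/nums.txt" 2>/dev/null; then
    check="sorted"
else
    check="not sorted"
fi
echo "$check"
# not sorted

# Re-sort lexically on field 1, which is the order join expects.
resorted=$(sort -k 1,1 "$tmp/nums.txt")
echo "$resorted"
# 10 ten
# 2 two

rm -rf "$tmp"
```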


Synopsis

join [OPTION]... FILE1 FILE2

Basic Usage

The Data

file1.txt (ID Name):

1 Alice
2 Bob
3 Charlie

file2.txt (ID Role):

1 Developer
3 Manager
4 Designer

Default Join (Inner Join)

Joins on the first field (default).

join file1.txt file2.txt
Output:
1 Alice Developer
3 Charlie Manager
(Bob is missing because ID 2 is not in file2. Designer is missing because ID 4 is not in file1.)


Controlling Fields

Specific Join Field (-1, -2)

Use these options when the common ID is not the first column of a file.

  • -1 2: Use field 2 of file 1.
  • -2 1: Use field 1 of file 2.
join -1 2 -2 1 employees.txt salaries.txt
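
The cheat sheet does not show the contents of employees.txt and salaries.txt, so the sketch below invents them: employees.txt is assumed to be "Name ID" and salaries.txt "ID Salary". It illustrates that the join field is printed first, followed by the remaining fields of each file.

```shell
tmp=$(mktemp -d)
# Hypothetical contents: employees.txt is "Name ID" (sorted on field 2),
# salaries.txt is "ID Salary" (sorted on field 1).
printf 'Alice 1\nBob 2\nCharlie 3\n' > "$tmp/employees.txt"
printf '1 5000\n3 7000\n' > "$tmp/salaries.txt"

# Match field 2 of employees.txt against field 1 of salaries.txt.
joined=$(join -1 2 -2 1 "$tmp/employees.txt" "$tmp/salaries.txt")
echo "$joined"
# 1 Alice 5000
# 3 Charlie 7000

rm -rf "$tmp"
```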

Output Format (-o)

Select which columns to print. Each entry has the form file_number.field_number (for example, 1.2 is field 2 of file 1); 0 stands for the join field.

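# Print ID (file 1, field 1), Name (file 1, field 2), Role (file 2, field 2).
# The comma-separated list is a single argument, which is the portable POSIX form.
join -o 1.1,1.2,2.2 file1.txt file2.txt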
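
Because -o lists fields explicitly, it can also reorder or drop columns. A small sketch with the sample files from Basic Usage, recreated here in a temporary directory so the snippet runs on its own:

```shell
tmp=$(mktemp -d)
printf '1 Alice\n2 Bob\n3 Charlie\n' > "$tmp/file1.txt"
printf '1 Developer\n3 Manager\n4 Designer\n' > "$tmp/file2.txt"

# Print the role first, then the name, and leave out the ID column.
picked=$(join -o 2.2,1.2 "$tmp/file1.txt" "$tmp/file2.txt")
echo "$picked"
# Developer Alice
# Manager Charlie

rm -rf "$tmp"
```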

Custom Delimiter (-t)

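Use -t to set the field separator, for example for comma-separated files. join splits on every separator character and does not understand CSV quoting, so quoted fields that contain commas will be split incorrectly.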

join -t ',' file1.csv file2.csv
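
As an illustration, here is the sample data from Basic Usage rewritten as simple CSV (the file contents are an assumption, since the cheat sheet does not show file1.csv or file2.csv). Note that the -t character is also used as the output separator.

```shell
tmp=$(mktemp -d)
# Assumed contents: the Basic Usage data with commas instead of spaces.
printf '1,Alice\n2,Bob\n3,Charlie\n' > "$tmp/file1.csv"
printf '1,Developer\n3,Manager\n4,Designer\n' > "$tmp/file2.csv"

csv=$(join -t ',' "$tmp/file1.csv" "$tmp/file2.csv")
echo "$csv"
# 1,Alice,Developer
# 3,Charlie,Manager

rm -rf "$tmp"
```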

Join Types (Outer, Left)

Left Join (-a 1)

Include unpairable lines from file 1.

join -a 1 file1.txt file2.txt
Output:
1 Alice Developer
2 Bob
3 Charlie Manager

Full Outer Join (-a 1 -a 2)

Include unpairable lines from both files.

join -a 1 -a 2 file1.txt file2.txt
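
For reference, a self-contained sketch of the full outer join on the sample files. Unpaired lines are printed as they are, so they have fewer columns than paired lines and nothing marks which file they came from.

```shell
tmp=$(mktemp -d)
printf '1 Alice\n2 Bob\n3 Charlie\n' > "$tmp/file1.txt"
printf '1 Developer\n3 Manager\n4 Designer\n' > "$tmp/file2.txt"

full=$(join -a 1 -a 2 "$tmp/file1.txt" "$tmp/file2.txt")
echo "$full"
# 1 Alice Developer
# 2 Bob
# 3 Charlie Manager
# 4 Designer

rm -rf "$tmp"
```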

Fill Empty Fields (-e)

When doing outer joins, replace missing fields with a placeholder. Pair -e with -o so that every output line has the same columns.

join -a 1 -a 2 -e "NULL" -o 0,1.2,2.2 file1.txt file2.txt
Output:
1 Alice Developer
2 Bob NULL
3 Charlie Manager
4 NULL Designer
(Note: field 0 in the -o list stands for the join field.)
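
The same pattern works for a left join. The sketch below also runs the command without -o to show why the two options are usually combined: in that case the unpaired line is printed as it is and the placeholder never appears.

```shell
tmp=$(mktemp -d)
printf '1 Alice\n2 Bob\n3 Charlie\n' > "$tmp/file1.txt"
printf '1 Developer\n3 Manager\n4 Designer\n' > "$tmp/file2.txt"

# Left join with a fixed three-column layout.
left=$(join -a 1 -e NULL -o 0,1.2,2.2 "$tmp/file1.txt" "$tmp/file2.txt")
echo "$left"
# 1 Alice Developer
# 2 Bob NULL
# 3 Charlie Manager

# Without -o there is no output list to fill, so "2 Bob" stays short.
bare=$(join -a 1 -e NULL "$tmp/file1.txt" "$tmp/file2.txt")
echo "$bare"
# 1 Alice Developer
# 2 Bob
# 3 Charlie Manager

rm -rf "$tmp"
```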


Advanced: Ignoring Case (-i)

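Ignore upper/lower case differences when comparing join fields. This is a GNU extension (--ignore-case); the inputs must then be sorted case-insensitively, for example with sort -f.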

join -i file1.txt file2.txt
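
A minimal sketch of a case-insensitive join, assuming GNU join (the -i option is not part of POSIX). The files ages.txt and roles.txt and their contents are hypothetical. Both inputs are sorted with sort -f so their order matches the comparison that join -i performs.

```shell
export LC_ALL=C   # assumption: use the same collation for sort and join

tmp=$(mktemp -d)
# Hypothetical data whose keys differ only in letter case.
printf 'alice 30\nbob 25\n' > "$tmp/ages.txt"
printf 'BOB Manager\nAlice Developer\n' > "$tmp/roles.txt"

# Sort both inputs on field 1, ignoring case.
sort -f -k 1,1 "$tmp/ages.txt"  > "$tmp/ages.sorted"
sort -f -k 1,1 "$tmp/roles.txt" > "$tmp/roles.sorted"

# For paired lines, the join field is printed as it appears in the first file.
nocase=$(join -i "$tmp/ages.sorted" "$tmp/roles.sorted")
echo "$nocase"
# alice 30 Developer
# bob 25 Manager

rm -rf "$tmp"
```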

Troubleshooting

"join: file1.txt is not sorted"

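This is the most common error. Sort both files on the join field before passing them to join.

Process substitution (The Pro Way): Don't create temporary sorted files; sort them on the fly. This syntax requires bash, zsh, or ksh and is not available in plain sh.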

join <(sort file1.txt) <(sort file2.txt)
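
Where process substitution is not available, one of the two inputs can be piped in instead: join reads standard input when a file name is given as -. A sketch with a deliberately unsorted copy of file1.txt follows. Only one input can come from the pipe, so the other file must already be sorted.

```shell
export LC_ALL=C   # assumption: use the same collation for sort and join

tmp=$(mktemp -d)
printf '3 Charlie\n1 Alice\n2 Bob\n' > "$tmp/file1.txt"           # deliberately unsorted
printf '1 Developer\n3 Manager\n4 Designer\n' > "$tmp/file2.txt"  # already sorted

# "-" makes join read its first input from the pipe,
# so no sorted copy of file1.txt is written to disk.
piped=$(sort -k 1,1 "$tmp/file1.txt" | join - "$tmp/file2.txt")
echo "$piped"
# 1 Alice Developer
# 3 Charlie Manager

rm -rf "$tmp"
```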

Exit Status

Code  Meaning
0     Success
1     Error
Notes

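  • join makes a single merge pass over both inputs, so on large sorted files it is typically much faster than matching keys with nested loops in awk or bash.
  • By default, fields are separated by runs of blanks (spaces or tabs), leading blanks are ignored, and output fields are separated by a single space.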
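
The default separator handling can be seen in a short sketch. The file names a.txt and b.txt and their padded contents are made up for illustration, and the behavior shown is that of GNU join.

```shell
tmp=$(mktemp -d)
# Hypothetical inputs padded with leading spaces, repeated spaces, and a tab.
printf '  1    Alice\n3\tCharlie\n' > "$tmp/a.txt"
printf '1 Developer\n3   Manager\n' > "$tmp/b.txt"

# Runs of blanks count as one separator; output uses single spaces.
spaced=$(join "$tmp/a.txt" "$tmp/b.txt")
echo "$spaced"
# 1 Alice Developer
# 3 Charlie Manager

rm -rf "$tmp"
```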