Tutorial

Merging CSVs with mismatched columns — a practical guide

When your CSV files come from different sources, the columns rarely line up perfectly. Here's how to handle naming differences, extra columns, and missing fields — and merge everything into one clean file.

April 18, 20267 min read

The real-world problem

You have exported customer records from two different systems. One file has columns named first_name, last_name, and email_address. The other has FirstName, Surname, and Email. Both contain the same kind of data, but the column names are completely different.

A naive merge — just stacking the files on top of each other — will produce a file with six columns instead of three, because the tool treats each unique name as a separate column. Rows from the first file will have empty values in FirstName, Surname, and Email, and rows from the second file will have empty values in first_name, last_name, and email_address. The resulting file is unusable.

This is the mismatched columns problem, and it comes up constantly when you work with data from multiple sources: different CRMs, different regions, different teams, different eras of the same product.

Two approaches to merging

Simple merge (when headers already match)

If your files share the same column names — even if the values are in different orders, or there are some extra columns — a simple merge stacks rows from all files into one output. CsvKit aligns columns by name, so email in file A will land in the same column as email in file B even if their column positions differ.

Use simple merge when:

All files are exports from the same system
You have already standardised column names manually
The files have identical schemas (same columns, same names)

Mapped merge (when headers differ)

When column names differ across files, you need to tell the tool: "the column called Surname in file B is the same thing as last_name in file A." This is called column mapping, and it gives you precise control over the output schema.

With mapped merge you:

Define the target columns — what the output file's headers should be
For each input file, map each source column to one of the target columns (or mark it as skipped)

The result is a single clean file with exactly the columns you defined.

Key insight: Column mapping is a one-time setup. Once you have defined how two schemas relate, the merge runs in seconds regardless of how many rows the files contain.

Step-by-step: mapped merge with CsvKit

1 Upload your files

Go to csvkit.net/merge and upload each file you want to merge. You can add as many files as you need. Each one uploads independently with its own progress bar.

2 Choose Mapped Merge

Once all files are ready, click Configure Merge. On the configuration screen, select Mapped Merge. This unlocks the column mapping interface.

3 Define your target columns

In the Output Column Names section, add the columns you want in the final file. You can type them manually, or click Copy from [filename] to pull the headers directly from one of your uploaded files and use that as the starting point.

4 Map each file's columns

For each file, a mapping table appears. The left column shows the file's source columns. The right column is a dropdown where you choose which target column it maps to — or (skip) to exclude it from the output.

5 Set the output filename and run

Give your merged file a name (e.g. all_customers_merged.csv) and click Run Merge. When complete, a download button appears for the merged file.

Handling common scenarios

One file has extra columns the others don't

Include those columns in your target headers. For the other files, leave those columns unmapped — CsvKit will fill them with empty values for rows from files that do not have that column. This is the correct behaviour: it faithfully represents that the data was not available in those files.

You want to drop certain columns from all files

Simply do not include those columns in your target headers. Any source column that is not mapped to a target is automatically excluded from the output. There is no need to edit the original files.

Two columns from the same file should merge into one

This is not something a merge tool can handle — it is a data transformation problem. You would need to pre-process the file (for example, concatenating first_name and last_name into a full_name column) before uploading it to the merge tool.

Files have different date formats

CsvKit merges files as-is without modifying values. If file A has dates as 2024-01-15 and file B has 15/01/2024, the merged file will contain both formats in the same column. Date normalisation is a separate step best done in your database or with a data wrangling tool after the merge.

Tips for cleaner merges

Standardise before merging where possible. If you control the exports, configure them to use the same column names up front — it eliminates the mapping step entirely.
Use lowercase with underscores. Consistent naming conventions (snake_case) prevent case-sensitivity issues in databases and analytics tools that receive the merged file.
Keep a note of your mappings. If you merge these files regularly, document the source-to-target mapping so future you (or a colleague) can reproduce it quickly.
Check row counts after merging. The merged file should have a row count equal to the sum of all input files minus the header rows. If it is lower, a file may have been partially processed.

Summary

Mismatched columns are normal when merging data from different sources. Simple merge works when headers already align. Mapped merge gives you full control when they do not — you define the output schema and map each file's columns to it. The result is a single, clean file that is ready to import, analyse, or pass to the next stage of your pipeline.

Ready to try it yourself?

Free, no sign-up required. Works with files up to 50 GB.

Try CSV Split Try CSV Merge