Libove Blog

Personal Blog about anything - mostly programming, cooking and random thoughts

How To Convert a List of Objects to CSV in JavaScript

This is a pretty common task in frontend; you already have some data (retrieved from an API) and want to offer the same data as a CSV.

The problem: most of the top answers you find on Google and StackOverflow are wrong. They will break in the presence of , or " in the data. Additionally all I found are vulnerable to CSV injections.

Usage:

  • yourData is the data you want to convert to CSV.
  • In data you have to change two things
    • Change the array to list all headers you want to be included in your CSV
    • Change the value lookups in the body to match the headers.
  • csvStr will be the finished CSV content. You can use this to construct a download for the user.

Snippet:

const yourData = [
	{value_a: 12, value_b: "some data"},
	{value_a: null, value_b: "no data"},
]

const escape = (str) => {
	// handle empty cell
	if(str == null || str == '') return '""'
	// ensure string
	str = `${str}`
	// prevent CSV Injection https://owasp.org/www-community/attacks/CSV_Injection
	const forbidden = new Set(["=", "+", "-", "@", "\t", "\n", "\r"])
	if (forbidden.has(str[0])) {
		str = `'${str}`
	}
	// escape double quotes
	str = str.replace(/"/g, '""')
	return `"${str}"`
};

const data = [
	// header
	[
		"Header A",
		"Header B",
	].map(escape),
	// body
	...yourData.map(
		row => [
			row["value_a"],
			row["value_b"],
		].map(escape)
	)
]
const csvStr = data.map(x => x.join(",")).join("\n")

console.log(csvStr);

What is CSV

The naive assumption about CSV (comma separated values) is just appending data with commas as separators in between. This works in some special cases, but for general data this scheme will quickly break. When you have data which includes commas it will break your CSV.

In  [1, 2, 'hello, world']
CSV 1,2,hello, world
Out [1, 2, 'hello', ' world']

To avoid this quotes are used. Either all or only the values where it is needed are put into quotes. The reader will ignore all commas if they are inside quotes. This way we can represent data that includes commas.

In  [1, 2, 'hello, world']
CSV "1","2","3","hello, world"
Out [1, 2, 'hello, world']

But this only shifts the problem. What happens if our quote character is used in our data?

In  [1, 2, 'hello", John," what?']
CSV "1","2","3","hello", John," what?"
Out [1, 2, 'hello', ' John', ' what?']

To avoid this an escape character is used. The escape character is placed in front of every quote char, which is part of the data. This indicates to the reader that the following quote is part of the data. The most common format of CSV uses the quotechar itself as the escapechar. Every " in the data is simply doubled to "".

In  [1, 2, 'hello", John", what?']
CSV "1","2","3","hello"", John"", what?"
Out [1, 2, 'hello", John", what?']

The combination of "separator", "quotechar" and "escapechar" allows us to encode arbitrary data as CSV, without breaking the format of our data. In principle anything can be chosen for these three characters. One common variation is to use a semicolon (;) as the separator. If a tab (\t) is used, the files are called "TSV" (Tab separated values).