Hi all,
I have written a shell script that collects server usage stats & outputs them in a structured format (CSV-style with commas). Now; I just redirect the output into a .csv file and open it in Excel but the formatting sometimes breaks especially when special characters / line breaks are involved. I would like to know if there’s a better way to generate Excel-compatible files directly from the shell.
Is there a reliable method / utility that can help convert plain text or CSV into a properly formatted .xlsx file using bash / a lightweight script? I am not looking for anything too heavy like Python+Pandas just a simple CLI tool / trick that keeps it readable in Excel without manual adjustments every time. I checked https://stackoverflow.com/questions/20913582/create-a-excel-file-using-shell-script guide related to this and found it quite informative.
While looking into what is Microsoft Excel and its file format structure; I realized I might be missing a better solution here. Any help or links to existing tools would be appreciated!
If special characters and line breaks are an issue, that suggests you are going outside the rules of CSV: or (better stated), you are not making allowance for the peculiarities of MicroSoft's usage and extensions to "their" form of CSV.
(1) Only ASCII characters are permitted. Even control characters (excepting CR and LF) are suspect, and NUL is taboo.
(2) Comma and Newline (LF) are part of the structure of CSV. They can occur in data fields, but the whole field must then be quoted, so " itself becomes part of the problem and must also be quoted.
Field quoting goes like this:
(a) The entire field must be enclosed in double quotes.
(b) Every occurrence of quotes in the field itself must be repeated.
Typically, a field like "Hello!", she said. is quoted like:
"""Hello!"", she said."
The slightly bizarre (b) requirement is to identify whether the field has actually ended. A field only ends when (i) There has been an even number of quotes in the field so far, and (ii) the following character is a comma or Newline or CR/LF pair.
The following character might be yet another quote, which continues the field.
Anything except a quote, comma or line terminator following an even number of quotes is invalid CSV.
There are languages that have CSV libraries (I believe Python will directly write Excel files, for example). Bash, C and Awk have no such assistance.