Apache Commons CSV User Guide

Parsing files

Parsing files with Apache Commons CSV is relatively straightforward. The CSVFormat class provides predefined formats for commonly used CSV variants, such as CSVFormat.DEFAULT, CSVFormat.EXCEL and CSVFormat.RFC4180.
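As a small sketch of how the predefined formats differ, the class below reads the same two values once as comma-separated text with CSVFormat.DEFAULT and once as tab-separated text with CSVFormat.TDF. The class FormatVariantsSketch and its methods are hypothetical names chosen for illustration, and the sketch assumes Commons CSV is on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical helper class, for illustration only.
public class FormatVariantsSketch {

    // Parses the text with the given format and returns the values of the first record.
    static List<String> firstRecord(final CSVFormat format, final String text) {
        final List<String> values = new ArrayList<>();
        try (CSVParser parser = format.parse(new StringReader(text))) {
            for (final CSVRecord record : parser) {
                for (final String value : record) {
                    values.add(value);
                }
                break; // only the first record is needed here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    // Comma-separated input, parsed with the DEFAULT variant.
    static List<String> parseComma(final String text) {
        return firstRecord(CSVFormat.DEFAULT, text);
    }

    // Tab-separated input, parsed with the TDF variant.
    static List<String> parseTab(final String text) {
        return firstRecord(CSVFormat.TDF, text);
    }

    public static void main(final String[] args) {
        System.out.println(parseComma("a,b\n"));
        System.out.println(parseTab("a\tb\n"));
    }
}
```

Only the format constant changes between the two calls; the parsing loop is the same for every variant.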
Example: Parsing an Excel CSV File

To parse an Excel CSV file whose first record contains the column names, write:

    Reader in = new FileReader("path/to/file.csv");
    // Read the header names from the first record; without a header,
    // record.get(String) throws an IllegalStateException.
    Iterable<CSVRecord> records = CSVFormat.EXCEL.builder()
        .setHeader()
        .setSkipHeaderRecord(true)
        .build()
        .parse(in);
    for (CSVRecord record : records) {
        String lastName = record.get("Last Name");
        String firstName = record.get("First Name");
    }

Handling Byte Order Marks

To handle files that start with a Byte Order Mark (BOM), like some Excel CSV files, you need an extra step to deal with these optional bytes. For example, you can use the BOMInputStream class from Apache Commons IO:

    final URL url = ...;
    try (final Reader reader = new InputStreamReader(new BOMInputStream(url.openStream()), "UTF-8");
         final CSVParser parser = CSVFormat.EXCEL.builder()
             .setHeader()
             .build()
             .parse(reader)) {
        for (final CSVRecord record : parser) {
            final String string = record.get("SomeColumn");
            ...
        }
    }

You might find it handy to create something like this:

    /**
     * Creates a reader capable of handling BOMs.
     */
    public InputStreamReader newReader(final InputStream inputStream) {
        return new InputStreamReader(new BOMInputStream(inputStream), StandardCharsets.UTF_8);
    }

Working with headers

Apache Commons CSV provides several ways to access record values. The simplest way is to access values by their index in the record. However, columns in CSV files often have a name, for example: ID, CustomerNo, Birthday, etc. The CSVFormat class provides an API for specifying these header names, and CSVRecord has methods to access values by their corresponding header name.

Accessing column values by index

To access a record value by index, no special configuration of the CSVFormat is necessary:

    Reader in = new FileReader("path/to/file.csv");
    Iterable<CSVRecord> records = CSVFormat.RFC4180.parse(in);
    for (CSVRecord record : records) {
        String columnOne = record.get(0);
        String columnTwo = record.get(1);
    }

Defining a header manually

Indices may not be the most intuitive way to access record values.
For this reason, it is possible to assign a name to each column in the file:

    Reader in = new FileReader("path/to/file.csv");
    Iterable<CSVRecord> records = CSVFormat.RFC4180.builder()
        .setHeader("ID", "CustomerNo", "Name")
        .build()
        .parse(in);
    for (CSVRecord record : records) {
        String id = record.get("ID");
        String customerNo = record.get("CustomerNo");
        String name = record.get("Name");
    }

Using an enum to define a header

Using String values all over the code to reference columns can be error-prone. For this reason, it is possible to define an enum to specify header names. Note that the enum constant names are used to access column values. This may lead to enum constant names that do not follow the Java coding standard of defining constants in upper case with underscores:

    public enum Headers {
        ID, CustomerNo, Name
    }

    Reader in = new FileReader("path/to/file.csv");
    Iterable<CSVRecord> records = CSVFormat.RFC4180.builder()
        .setHeader(Headers.class)
        .build()
        .parse(in);
    for (CSVRecord record : records) {
        String id = record.get(Headers.ID);
        String customerNo = record.get(Headers.CustomerNo);
        String name = record.get(Headers.Name);
    }

Header auto detection

Some CSV files define header names in their first record. If configured, Apache Commons CSV can parse the header names from the first record:

    Reader in = new FileReader("path/to/file.csv");
    Iterable<CSVRecord> records = CSVFormat.RFC4180.builder()
        .setHeader()
        .setSkipHeaderRecord(true)
        .build()
        .parse(in);
    for (CSVRecord record : records) {
        String id = record.get("ID");
        String customerNo = record.get("CustomerNo");
        String name = record.get("Name");
    }

Printing with headers

To print a CSV file with headers, you specify the headers in the format:

    final Appendable out = ...;
    final CSVPrinter printer = CSVFormat.DEFAULT.builder()
        .setHeader("H1", "H2")
        .build()
        .print(out);

To print a CSV file with JDBC column labels, you specify the ResultSet in the format:

    try (final ResultSet resultSet = ...) {
        final CSVPrinter printer = CSVFormat.DEFAULT.builder()
            .setHeader(resultSet)
            .build()
            .print(out);
    }
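To round off the printing examples, here is a minimal self-contained sketch that prints a header and one record into a StringBuilder, so the result can be inspected as a String. The class HeaderPrintingSketch and its method printWithHeader are hypothetical names used for illustration; the sketch assumes a Commons CSV version that offers the builder API used throughout this guide.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

// Hypothetical helper class, for illustration only.
public class HeaderPrintingSketch {

    // Prints the header H1,H2 followed by one record and returns the resulting text.
    static String printWithHeader(final String first, final String second) {
        final StringBuilder out = new StringBuilder();
        final CSVFormat format = CSVFormat.DEFAULT.builder()
            .setHeader("H1", "H2")
            .build();
        // The header line is written when the printer is created.
        try (CSVPrinter printer = format.print(out)) {
            printer.printRecord(first, second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(final String[] args) {
        System.out.print(printWithHeader("a", "b"));
    }
}
```

CSVFormat.DEFAULT terminates each record with CRLF, so the returned text contains two lines: the header and the single record.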