Aleph-w 3.0
A C++ Library for Data Structures and Algorithms
parse-csv.H File Reference

Comprehensive CSV (Comma-Separated Values) parsing and manipulation utilities.

#include <sstream>
#include <istream>
#include <ostream>
#include <fstream>
#include <string>
#include <tpl_array.H>

Classes

class  Aleph::CsvRow
 A CSV row with header-based field access.
 
class  Aleph::CsvReader
 Lazy CSV reader for large files.
 
class  Aleph::CsvReader::Iterator
 

Namespaces

namespace  Aleph
 Main namespace for Aleph-w library functions.
 

Functions

Array< std::string > Aleph::csv_read_row (std::istream &in, char delimiter=',')
 Read a single CSV row from an input stream.
 
Array< std::string > Aleph::csv_read_row (const std::string &line, char delimiter=',')
 Read a single CSV row from a string.
 
Array< Array< std::string > > Aleph::csv_read_all (std::istream &in, char delimiter=',')
 Read all rows from a CSV input stream.
 
Array< Array< std::string > > Aleph::csv_read_file (const std::string &filename, char delimiter=',')
 Read all rows from a CSV file.
 
std::string Aleph::csv_escape (const std::string &field, char delimiter=',')
 Escape a string for CSV output.
 
void Aleph::csv_write_row (std::ostream &out, const Array< std::string > &row, char delimiter=',', const std::string &line_ending="\n")
 Write a CSV row to an output stream.
 
void Aleph::csv_write_all (std::ostream &out, const Array< Array< std::string > > &rows, char delimiter=',', const std::string &line_ending="\n")
 Write multiple CSV rows to an output stream.
 
void Aleph::csv_write_file (const std::string &filename, const Array< Array< std::string > > &rows, char delimiter=',', const std::string &line_ending="\n")
 Write CSV data to a file.
 
size_t Aleph::csv_num_columns (const Array< std::string > &row)
 Get the number of columns in a CSV row.
 
bool Aleph::csv_is_rectangular (const Array< Array< std::string > > &rows)
 Check if all rows have the same number of columns.
 
Array< std::string > Aleph::csv_get_column (const Array< Array< std::string > > &rows, size_t col_index)
 Get a column from CSV data.
 
template<typename T >
T Aleph::csv_to_number (const std::string &field)
 Convert a CSV field to a numeric type.
 
template<>
int Aleph::csv_to_number< int > (const std::string &field)
 
template<>
long Aleph::csv_to_number< long > (const std::string &field)
 
template<>
double Aleph::csv_to_number< double > (const std::string &field)
 
template<>
float Aleph::csv_to_number< float > (const std::string &field)
 
template<typename Pred >
Array< Array< std::string > > Aleph::csv_filter (const Array< Array< std::string > > &rows, Pred predicate)
 Filter CSV rows by a predicate.
 
Array< Array< std::string > > Aleph::csv_filter_by_value (const Array< Array< std::string > > &rows, size_t col_index, const std::string &value)
 Filter CSV rows by column value.
 
Array< Array< std::string > > Aleph::csv_select_columns (const Array< Array< std::string > > &rows, const Array< size_t > &col_indices)
 Select specific columns from CSV data.
 
Array< Array< std::string > > Aleph::csv_skip_rows (const Array< Array< std::string > > &rows, size_t n)
 Skip the first N rows of CSV data.
 
Array< Array< std::string > > Aleph::csv_take_rows (const Array< Array< std::string > > &rows, size_t n)
 Take only the first N rows of CSV data.
 
size_t Aleph::csv_count_rows (const Array< Array< std::string > > &rows)
 Count total number of rows.
 
size_t Aleph::csv_count_empty (const Array< Array< std::string > > &rows)
 Count empty fields in CSV data.
 
template<typename Pred >
size_t Aleph::csv_count_if (const Array< Array< std::string > > &rows, Pred predicate)
 Count rows matching a predicate.
 
template<typename Pred >
size_t Aleph::csv_find_row (const Array< Array< std::string > > &rows, Pred predicate)
 Find first row matching a predicate.
 
size_t Aleph::csv_find_by_value (const Array< Array< std::string > > &rows, size_t col_index, const std::string &value)
 Find row where column equals value.
 
Array< std::string > Aleph::csv_distinct (const Array< Array< std::string > > &rows, size_t col_index)
 Get distinct values in a column.
 
Array< Array< std::string > > Aleph::csv_transpose (const Array< Array< std::string > > &rows)
 Transpose CSV data (swap rows and columns).
 
Array< Array< std::string > > Aleph::csv_sort_by_column (const Array< Array< std::string > > &rows, size_t col_index, bool ascending=true)
 Sort CSV data by a column.
 
template<typename T >
Array< Array< std::string > > Aleph::csv_sort_by_column_numeric (const Array< Array< std::string > > &rows, size_t col_index, bool ascending=true)
 Sort CSV data by a column with numeric comparison.
 
Array< Array< std::string > > Aleph::csv_unique (const Array< Array< std::string > > &rows)
 Remove duplicate rows.
 
template<typename Func >
Array< Array< std::string > > Aleph::csv_transform (const Array< Array< std::string > > &rows, Func func)
 Apply a transformation to each field.
 
bool Aleph::csv_skip_bom (std::istream &in)
 Skip UTF-8 BOM if present.
 
Array< Array< std::string > > Aleph::csv_trim_fields (const Array< Array< std::string > > &rows)
 Trim whitespace from all fields.
 
Array< Array< std::string > > Aleph::csv_fill_empty (const Array< Array< std::string > > &rows, const std::string &default_value)
 Replace empty fields with a default value.
 
Array< Array< std::string > > Aleph::csv_join_horizontal (const Array< Array< std::string > > &left, const Array< Array< std::string > > &right)
 Join two CSV datasets horizontally (add columns).
 
Array< Array< std::string > > Aleph::csv_join_vertical (const Array< Array< std::string > > &top, const Array< Array< std::string > > &bottom)
 Join two CSV datasets vertically (add rows).
 
Array< Array< std::string > > Aleph::csv_inner_join (const Array< Array< std::string > > &left, size_t left_key_col, const Array< Array< std::string > > &right, size_t right_key_col)
 Inner join two CSV datasets by a key column.
 
Array< Array< Array< std::string > > > Aleph::csv_group_by (const Array< Array< std::string > > &rows, size_t col_index)
 Group rows by a column value.
 
template<typename Func >
Array< Array< std::string > > Aleph::csv_add_column (const Array< Array< std::string > > &rows, Func func)
 Add a new column with computed values.
 
Array< Array< std::string > > Aleph::csv_rename_column (const Array< Array< std::string > > &rows, const std::string &old_name, const std::string &new_name)
 Rename a column (in the header row).
 

Detailed Description

Comprehensive CSV (Comma-Separated Values) parsing and manipulation utilities.

This file provides a complete toolkit for parsing, generating, and manipulating CSV data following RFC 4180 conventions:

  • Fields are separated by a delimiter (default: comma).
  • Fields containing delimiters, quotes, or newlines are enclosed in double quotes.
  • Double quotes inside quoted fields are escaped by doubling them ("").
  • Lines end with CRLF or LF.
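
The quoting rules above can be illustrated with a small standalone parser. This is a minimal sketch and not the code in parse-csv.H: it uses std::vector in place of Aleph's Array, the name sketch_parse_row is hypothetical, and it handles a single record held in a string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Aleph-w: splits one CSV record into
// fields using the conventions listed above.
std::vector<std::string> sketch_parse_row(const std::string& line,
                                          char delimiter = ',')
{
  std::vector<std::string> fields;
  std::string current;
  bool in_quotes = false;

  for (std::size_t i = 0; i < line.size(); ++i)
    {
      const char c = line[i];
      if (in_quotes)
        {
          if (c != '"')
            current += c; // delimiters and newlines are literal here
          else if (i + 1 < line.size() && line[i + 1] == '"')
            {
              current += '"'; // doubled quote stands for one quote
              ++i;
            }
          else
            in_quotes = false; // closing quote
        }
      else if (c == '"')
        in_quotes = true; // opening quote
      else if (c == delimiter)
        {
          fields.push_back(current);
          current.clear();
        }
      else if (c == '\r' || c == '\n')
        break; // end of record, CRLF or LF
      else
        current += c;
    }
  fields.push_back(current); // last field has no trailing delimiter
  return fields;
}
```

Under these assumptions, the record a,"b,c","say ""hi""" followed by CRLF splits into three fields: a, then b,c, then say "hi".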

Features

Reading & Writing

  • csv_read_row() - Parse single row from stream or string
  • csv_read_all() - Read all rows into memory
  • csv_read_file() - Read from file
  • csv_write_row() / csv_write_all() / csv_write_file() - Write CSV output
  • CsvReader - Iterator-based lazy reading for large files
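
For the writing side, the following sketch shows the kind of escaping a row writer needs to perform so that its output can be read back. It is an illustration under stated assumptions, not the library's implementation: sketch_escape and sketch_write_row are hypothetical names, std::vector stands in for Aleph's Array, and the parameter lists merely mirror the signatures of csv_escape() and csv_write_row() shown earlier.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream> // only needed to write into a string, as in the note below
#include <string>
#include <vector>

// Hypothetical helper: quotes a field only when it contains the delimiter,
// a double quote, or a line break, and doubles any embedded quotes.
std::string sketch_escape(const std::string& field, char delimiter = ',')
{
  const bool needs_quotes =
    field.find(delimiter) != std::string::npos ||
    field.find_first_of("\"\r\n") != std::string::npos;
  if (!needs_quotes)
    return field;

  std::string out = "\"";
  for (char c : field)
    {
      if (c == '"')
        out += '"'; // emit the extra quote first, then the quote itself
      out += c;
    }
  out += '"';
  return out;
}

// Hypothetical helper: writes the escaped fields separated by the delimiter.
void sketch_write_row(std::ostream& out, const std::vector<std::string>& row,
                      char delimiter = ',',
                      const std::string& line_ending = "\n")
{
  for (std::size_t i = 0; i < row.size(); ++i)
    {
      if (i > 0)
        out << delimiter;
      out << sketch_escape(row[i], delimiter);
    }
  out << line_ending;
}
```

Writing the fields a, b,c and q"x to a std::ostringstream with this sketch yields the line a,"b,c","q""x" followed by the line ending.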

Header-Based Access

  • CsvRow - Access fields by column name: row["name"]
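
Name-based access of this kind is commonly built on a map from header name to column position. The sketch below shows that idea only; SketchRow and sketch_index_header are hypothetical names, and how CsvRow is actually implemented is not stated on this page.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

using HeaderIndex = std::unordered_map<std::string, std::size_t>;

// Hypothetical helper: maps each header name to its column position.
// If a name repeats, the first occurrence wins (an assumption of this sketch).
HeaderIndex sketch_index_header(const std::vector<std::string>& header)
{
  HeaderIndex index;
  for (std::size_t i = 0; i < header.size(); ++i)
    index.emplace(header[i], i);
  return index;
}

// Hypothetical row type: the fields plus a pointer to the shared header index.
struct SketchRow
{
  const HeaderIndex* index;
  std::vector<std::string> fields;

  // Looks a field up by column name; throws for unknown names or short rows.
  const std::string& operator[](const std::string& name) const
  {
    const auto it = index->find(name);
    if (it == index->end() || it->second >= fields.size())
      throw std::out_of_range("unknown column: " + name);
    return fields[it->second];
  }
};
```

Sharing one index across all rows keeps each row small, which matters when rows are produced lazily from a large file.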

Filtering & Selection

  • csv_filter() - Filter rows by predicate
  • csv_filter_by_value() - Filter by column value
  • csv_select_columns() - Select specific columns
  • csv_skip_rows() / csv_take_rows() - Slice rows
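
As an illustration of what filtering by value and column selection amount to, here is a standalone sketch over std::vector. The sketch_ names are hypothetical, and the handling of short rows (skipped when filtering, padded with empty strings when selecting) is an assumption of this sketch; the library may behave differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical helper: keeps the rows whose column `col` equals `value`.
// Rows too short to have that column are skipped.
Table sketch_filter_by_value(const Table& rows, std::size_t col,
                             const std::string& value)
{
  Table out;
  for (const auto& row : rows)
    if (col < row.size() && row[col] == value)
      out.push_back(row);
  return out;
}

// Hypothetical helper: builds new rows from the listed columns, in the
// order given. Missing columns become empty strings.
Table sketch_select_columns(const Table& rows,
                            const std::vector<std::size_t>& cols)
{
  Table out;
  for (const auto& row : rows)
    {
      std::vector<std::string> picked;
      for (std::size_t c : cols)
        picked.push_back(c < row.size() ? row[c] : std::string());
      out.push_back(picked);
    }
  return out;
}
```

Because the column list is applied in order, selecting columns 1 and 0 also swaps them, so the same helper can reorder columns.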

Statistics & Search

  • csv_count_rows() / csv_count_empty() / csv_count_if()
  • csv_find_row() / csv_find_by_value()
  • csv_distinct() - Get unique values in column
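
A sketch of two of these queries follows, written against std::vector rather than Aleph's Array. The sketch_ names are hypothetical, and returning distinct values in first-seen order is an assumption; this page does not say which order csv_distinct() uses.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical helper: distinct values of column `col`, in first-seen order.
std::vector<std::string> sketch_distinct(const Table& rows, std::size_t col)
{
  std::vector<std::string> out;
  std::unordered_set<std::string> seen;
  for (const auto& row : rows)
    if (col < row.size() && seen.insert(row[col]).second)
      out.push_back(row[col]); // insert() reports whether the value was new
  return out;
}

// Hypothetical helper: number of empty fields across the whole table.
std::size_t sketch_count_empty(const Table& rows)
{
  std::size_t count = 0;
  for (const auto& row : rows)
    for (const auto& field : row)
      if (field.empty())
        ++count;
  return count;
}
```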

Transformations

  • csv_transpose() - Swap rows and columns
  • csv_sort_by_column() / csv_sort_by_column_numeric()
  • csv_unique() - Remove duplicates
  • csv_transform() - Apply function to all fields
  • csv_add_column() - Add computed column
  • csv_rename_column() - Rename header column
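
The difference between the two sort functions matters because fields are strings: compared as text, "10" sorts before "9". The sketch below shows a numeric sort and a transpose over std::vector. The sketch_ names are hypothetical, the sort treats every row as data (no header row), and none of this is taken from the library's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical helper: swaps rows and columns of a rectangular table.
Table sketch_transpose(const Table& rows)
{
  if (rows.empty())
    return {};
  const std::size_t width = rows[0].size();
  Table out(width, std::vector<std::string>(rows.size()));
  for (std::size_t r = 0; r < rows.size(); ++r)
    {
      assert(rows[r].size() == width); // this sketch requires equal widths
      for (std::size_t c = 0; c < width; ++c)
        out[c][r] = rows[r][c];
    }
  return out;
}

// Hypothetical helper: stable sort by the integer value of column `col`.
// std::stol throws on non-numeric text and at() throws on short rows.
Table sketch_sort_numeric(Table rows, std::size_t col, bool ascending = true)
{
  std::stable_sort(rows.begin(), rows.end(),
                   [col, ascending](const std::vector<std::string>& a,
                                    const std::vector<std::string>& b)
                   {
                     const long x = std::stol(a.at(col));
                     const long y = std::stol(b.at(col));
                     return ascending ? x < y : x > y;
                   });
  return rows;
}
```

With the values 10, 9 and 100 in the sort column, this sketch orders the rows 9, 10, 100, whereas a plain string comparison would give 10, 100, 9.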

Join & Combine

  • csv_join_horizontal() - Add columns side by side
  • csv_join_vertical() - Concatenate rows
  • csv_inner_join() - SQL-like join on key column
  • csv_group_by() - Group rows by column value
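
An inner join on a key column is often implemented as a hash join: index one side by key, then probe it with each row of the other side. The sketch below shows that technique with std::vector; it is not the library's code. The name sketch_inner_join is hypothetical, and dropping the right-hand key column from the output is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical helper: emits one output row per matching (left, right) pair.
Table sketch_inner_join(const Table& left, std::size_t left_key,
                        const Table& right, std::size_t right_key)
{
  // Build phase: key value -> indices of the right rows carrying it.
  std::unordered_map<std::string, std::vector<std::size_t>> index;
  for (std::size_t i = 0; i < right.size(); ++i)
    if (right_key < right[i].size())
      index[right[i][right_key]].push_back(i);

  // Probe phase: look up each left row's key and emit the combinations.
  Table out;
  for (const auto& lrow : left)
    {
      if (left_key >= lrow.size())
        continue; // row has no key column
      const auto hit = index.find(lrow[left_key]);
      if (hit == index.end())
        continue; // no partner on the right: dropped by an inner join
      for (std::size_t i : hit->second)
        {
          std::vector<std::string> joined = lrow;
          for (std::size_t c = 0; c < right[i].size(); ++c)
            if (c != right_key)
              joined.push_back(right[i][c]); // skip the duplicate key
          out.push_back(joined);
        }
    }
  return out;
}
```

Output rows follow the order of the left table, and for each left row the matches appear in right-table order.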

Utilities

  • csv_skip_bom() - Handle UTF-8 BOM
  • csv_trim_fields() - Remove whitespace
  • csv_fill_empty() - Replace empty fields
  • csv_escape() - Escape field for output
  • csv_to_number<T>() - Parse numeric fields
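
Files saved by some editors begin with the three-byte UTF-8 byte order mark EF BB BF, which would otherwise end up inside the first field. The sketch below shows one way to skip it. It is not the Aleph-w implementation: sketch_skip_bom is a hypothetical name, and the rewind step assumes a seekable stream such as a file or string stream.

```cpp
#include <cstring>
#include <istream>
#include <sstream> // only needed for the string-based usage in the note below
#include <string>

// Hypothetical helper: consumes a leading UTF-8 BOM and reports whether
// one was found. Leaves the stream at its starting position otherwise.
bool sketch_skip_bom(std::istream& in)
{
  static const char bom[3] = {'\xEF', '\xBB', '\xBF'};
  const std::istream::pos_type start = in.tellg();
  char head[3] = {0, 0, 0};
  in.read(head, 3);
  if (in.gcount() == 3 && std::memcmp(head, bom, 3) == 0)
    return true; // BOM consumed; the next read sees the first field
  in.clear();      // a short read sets eofbit and failbit
  in.seekg(start); // rewind over whatever was read
  return false;
}
```

Calling it on a std::istringstream whose content starts with the BOM returns true and leaves the stream at the first real character; on input without a BOM it returns false and the stream is unchanged.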

Usage Examples

#include <iostream>
#include <parse-csv.H>

using namespace Aleph;

int main()
{
  // Simple reading: load every row into memory
  auto data = csv_read_file("data.csv");

  // Lazy reading for large files: one row at a time
  CsvReader reader("large.csv");
  reader.read_header();
  while (reader.has_next())
    {
      CsvRow row = reader.next_row();
      std::cout << row["name"] << ": " << row.get<int>("value") << "\n";
    }

  // Filter and transform: keep rows whose column 2 is "active",
  // sort them numerically by column 1, and write the result
  auto active = csv_filter_by_value(data, 2, "active");
  auto sorted = csv_sort_by_column_numeric<int>(active, 1);
  csv_write_file("output.csv", sorted);
}

Thread Safety

All functions are thread-safe as long as different threads operate on different streams. Concurrent access to the same stream requires external synchronization.

Author
Leandro Rabindranath León

Definition in file parse-csv.H.