icarus::details::KeyedCSVparser Class Reference

Parser to fill a KeyValuesData structure out of a character buffer.

#include <KeyedCSVparser.h>

Classes

struct  InvalidFormat
 
struct  MissingValues
 
struct  ParserError
 

Public Types

using ParsedData_t = icarus::KeyValuesData
 
using Error = icarus::KeyValuesData::Error
 Base of all errors by KeyedCSVparser.
 
using ErrorOnKey = icarus::KeyValuesData::ErrorOnKey
 
using MissingSize = KeyValuesData::MissingSize
 Expected number of values is missing.
 

Public Member Functions

 KeyedCSVparser (char sep= ',')
 Constructor: specifies the separator character.
 
ParsedData_t parse (std::string_view const &s) const
 Parses the buffer s and returns a data structure with the content.
 
ParsedData_t parse (std::string const &s) const
 
template<typename BIter , typename EIter >
ParsedData_t parse (BIter b, EIter e) const
 
ParsedData_t operator() (std::string_view const &s) const
 
ParsedData_t operator() (std::string const &s) const
 
template<typename BIter , typename EIter >
ParsedData_t operator() (BIter b, EIter e) const
 
void parse (std::string_view const &s, ParsedData_t &data) const
 Parses the buffer s and fills data with it.
 
Known patterns

The parser normally treats as a value everything that does not start with a letter. Known patterns may override this behaviour: if a token matches a known pattern, it is considered a key and it is possible to specify the expected number of values.

The number of values can be:

  • a number: exactly that number of values are required; an exception will be thrown if not enough tokens are available;
  • FixedSize: the next token must be a non-negative integer specifying how many other values to add (read this with Item::getSizedVector()); an exception will be thrown if not enough tokens are available;
  • DynamicSize: the standard algorithm is used and values are added as long as they don't look like keys; the token matching the pattern is interpreted as a key though.

Patterns are considered in the order they were added.
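
As a rough illustration of the ordering rule, the sketch below walks a pattern list and stops at the first match. It is standalone and is not the parser's own code: `PatternList`, `expectedValues()`, `makeExamplePatterns()` and the `kFixedSize`/`kDynamicSize` constants are hypothetical names that only mirror `fPatterns`, `FixedSize` and `DynamicSize`.

```cpp
#include <cassert>
#include <limits>
#include <optional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for KeyedCSVparser::FixedSize and DynamicSize.
constexpr unsigned int kFixedSize = std::numeric_limits<unsigned int>::max();
constexpr unsigned int kDynamicSize = kFixedSize - 1U;

using PatternList = std::vector<std::pair<std::regex, unsigned int>>;

// Returns the value count of the first pattern matching the whole token,
// or std::nullopt if no pattern matches (the token would then be
// classified by the default first-letter rule).
std::optional<unsigned int> expectedValues
  (PatternList const& patterns, std::string const& token)
{
  for (auto const& [ pattern, values ]: patterns) {
    if (std::regex_match(token, pattern)) return values;
  }
  return std::nullopt;
}

// Example list: the specific pattern is added before the broader one.
PatternList makeExamplePatterns() {
  PatternList patterns;
  patterns.emplace_back(std::regex{ "TriggerType" }, 1U);
  patterns.emplace_back(std::regex{ "Trigger.*" }, kDynamicSize);
  patterns.emplace_back(std::regex{ "TPChitTimes" }, kFixedSize);
  return patterns;
}
```

With these patterns, `TriggerType` resolves to one value because its pattern was added before the broader `Trigger.*`, which would otherwise have claimed it.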

KeyedCSVparser & addPattern (std::regex pattern, unsigned int values)
 Adds a single known pattern.
 
KeyedCSVparser & addPattern (std::string const &pattern, unsigned int values)
 
KeyedCSVparser & addPatterns (std::initializer_list< std::pair< std::regex, unsigned int >> patterns)
 Adds known patterns.
 
KeyedCSVparser & addPatterns (std::initializer_list< std::pair< std::string, unsigned int >> patterns)
 

Static Public Attributes

static constexpr unsigned int FixedSize = std::numeric_limits<unsigned int>::max()
 Mnemonic size value used in addPattern() calls.
 
static constexpr unsigned int DynamicSize = FixedSize - 1U
 Mnemonic size value used in addPattern() calls.
 

Private Types

using Buffer_t = std::string_view
 
using SubBuffer_t = std::string_view
 

Private Member Functions

std::size_t findTokenLength (Buffer_t const &buffer) const noexcept
 Returns the length of the next token, up to the next separator (excluded).
 
SubBuffer_t peekToken (Buffer_t const &buffer) const noexcept
 Returns the value of the next token, stripped.
 
SubBuffer_t extractToken (Buffer_t &buffer) const noexcept
 Extracts the next token from the buffer and returns its value, stripped.
 
bool isKey (SubBuffer_t const &buffer) const noexcept
 Is the content of buffer a key (as opposed to a value)?
 

Static Private Member Functions

template<typename String >
static Buffer_t makeBuffer (String const &s) noexcept
 
static Buffer_t & moveBufferHead (Buffer_t &buffer, std::size_t size) noexcept
 
static SubBuffer_t strip (SubBuffer_t s) noexcept
 
static SubBuffer_t stripLeft (SubBuffer_t s) noexcept
 
static SubBuffer_t stripRight (SubBuffer_t s) noexcept
 
static SubBuffer_t stripRightChar (SubBuffer_t s, char c) noexcept
 
template<char... Chars>
static SubBuffer_t stripRightChars (SubBuffer_t s) noexcept
 

Private Attributes

char const fSep = ','
 Character used as token separator.
 
std::vector< std::pair< std::regex, unsigned int > > fPatterns
 List of known patterns for matching keys, and how many values they hold.
 

Detailed Description

Parser to fill a KeyValuesData structure out of a character buffer.

It currently supports only single-line buffers.

The parser operates on one "line" at a time and returns a KeyValuesData with the values assigned to each detected key. No data type is implied: every element is treated as a string, either a key or a value. The parser splits the buffer at each separator, strips leading and trailing whitespace from each element, and then decides whether the element is a new key or a value to be assigned to the last key found. By default, an element that starts with a letter is a key and anything else is a value. This simple (and arguable) criterion can be overridden through parser configuration: a pattern can be registered so that any element matching it becomes a key, and the pattern can also set the number of values that key requires.

For example:

icarus::details::KeyedCSVparser parser;
parser.addPatterns({
    { "TriggerType", 1U } // expect one value (even if contains letters)
  , { "TriggerWindows", 1U } // expect one value (even if contains letters)
  , { "TPChitTimes", icarus::details::KeyedCSVparser::FixedSize }
    // the first value is an integer, count of how many other values
  });

icarus::KeyValuesData data = parser(
  "TriggerType, S5, Triggers, TriggerWindows, 0C0B,"
  " TPChits, 12, 130, 0, 0, TPChitTimes, 3, -1.1, -0.3, 0.1, PMThits, 8"
  );

will return data with 6 items.
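
The behaviour in this example can be approximated with a short standalone sketch. It is not the KeyedCSVparser implementation: `Item`, `trim()`, `parseLine()` and `kFixedSize` are hypothetical names, keys are matched by exact name rather than by regular expression, and errors are reported with `std::runtime_error` instead of the parser's own exception types.

```cpp
#include <cassert>
#include <cctype>
#include <charconv>
#include <limits>
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a KeyValuesData item.
struct Item {
  std::string key;
  std::vector<std::string> values;
};

// Stand-in for KeyedCSVparser::FixedSize.
constexpr unsigned int kFixedSize = std::numeric_limits<unsigned int>::max();

// Removes leading and trailing whitespace.
std::string_view trim(std::string_view s) {
  auto const isSpace
    = [](char c){ return std::isspace(static_cast<unsigned char>(c)) != 0; };
  while (!s.empty() && isSpace(s.front())) s.remove_prefix(1);
  while (!s.empty() && isSpace(s.back())) s.remove_suffix(1);
  return s;
}

// Simplified parsing loop; `known` maps exact key names (not regular
// expressions) to the number of values each key requires.
std::vector<Item> parseLine(
  std::string_view line, std::map<std::string, unsigned int> const& known,
  char sep = ','
) {
  std::vector<Item> items;
  int forced = -1; // values still owed to the current key (-1: no constraint)
  while (!line.empty()) {
    // extract the next token and skip the separator after it, if any
    auto const pos = line.find(sep);
    std::string token(trim(line.substr(0, pos)));
    line.remove_prefix(pos == std::string_view::npos? line.size(): pos + 1);

    bool isKey = false;
    if (forced > 0) --forced; // an owed value: never a key
    else {
      isKey = (forced == 0); // all owed values assigned: this must be a key
      forced = -1;
      if (auto const it = known.find(token); it != known.end()) {
        isKey = true;
        if (it->second == kFixedSize) {
          // peek at the size token; it will also be stored as first value
          auto const sizeToken = trim(line.substr(0, line.find(sep)));
          char const* const b = sizeToken.data();
          char const* const e = b + sizeToken.size();
          int n = 0;
          auto const res = std::from_chars(b, e, n);
          if (res.ec != std::errc{} || res.ptr != e || n < 0)
            throw std::runtime_error("missing size for key '" + token + "'");
          forced = n + 1;
        }
        else forced = static_cast<int>(it->second);
      }
      else if (!isKey) {
        // default rule: a token starting with a letter is a key
        isKey = !token.empty()
          && std::isalpha(static_cast<unsigned char>(token.front()));
      }
    }

    if (isKey) items.push_back(Item{ std::move(token), {} });
    else {
      if (items.empty())
        throw std::runtime_error("values started without a key");
      items.back().values.push_back(std::move(token));
    }
  } // while

  if (forced > 0)
    throw std::runtime_error("missing values for '" + items.back().key + "'");
  return items;
}
```

Fed the buffer above with `TriggerType` and `TriggerWindows` mapped to 1 and `TPChitTimes` mapped to `kFixedSize`, this sketch produces six items; `TPChitTimes` keeps its size token `3` as its first value.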

Definition at line 68 of file KeyedCSVparser.h.

Member Typedef Documentation

using icarus::details::KeyedCSVparser::Buffer_t = std::string_view
private

Definition at line 164 of file KeyedCSVparser.h.

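using icarus::details::KeyedCSVparser::Error = icarus::KeyValuesData::Error

Base of all errors by KeyedCSVparser.

Definition at line 75 of file KeyedCSVparser.h.

using icarus::details::KeyedCSVparser::ErrorOnKey = icarus::KeyValuesData::ErrorOnKey

Definition at line 76 of file KeyedCSVparser.h.

using icarus::details::KeyedCSVparser::MissingSize = KeyValuesData::MissingSize

Expected number of values is missing.

Parsing format is not understood.

Definition at line 80 of file KeyedCSVparser.h.

using icarus::details::KeyedCSVparser::ParsedData_t = icarus::KeyValuesData

Definition at line 72 of file KeyedCSVparser.h.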

using icarus::details::KeyedCSVparser::SubBuffer_t = std::string_view
private

Definition at line 165 of file KeyedCSVparser.h.

Constructor & Destructor Documentation

icarus::details::KeyedCSVparser::KeyedCSVparser ( char  sep = ',')
inline

Constructor: specifies the separator character.

Definition at line 91 of file KeyedCSVparser.h.

  : fSep(sep) {}

Member Function Documentation

KeyedCSVparser& icarus::details::KeyedCSVparser::addPattern ( std::regex  pattern,
unsigned int  values 
)
inline

Adds a single known pattern.

Parameters
pattern: the regular expression matching the key for this pattern
values: the number of values for this pattern
Returns
this parser (addPattern() calls may be chained)

Definition at line 141 of file KeyedCSVparser.h.

  { fPatterns.emplace_back(std::move(pattern), values); return *this; }
KeyedCSVparser& icarus::details::KeyedCSVparser::addPattern ( std::string const &  pattern,
unsigned int  values 
)
inline

Definition at line 143 of file KeyedCSVparser.h.

  { return addPattern(std::regex{ pattern }, values); }
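
Both overloads return the parser itself, which is what makes chaining possible. A minimal sketch of the same idiom, using a hypothetical `PatternRegistry` class (not part of the library) with `add()` in place of `addPattern()`:

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class illustrating the "return *this" chaining idiom.
class PatternRegistry {
    public:
  // Stores the pattern and returns the registry to allow chained calls.
  PatternRegistry& add(std::regex pattern, unsigned int values) {
    fPatterns.emplace_back(std::move(pattern), values);
    return *this;
  }

  // Convenience overload: compiles the string into a regular expression.
  PatternRegistry& add(std::string const& pattern, unsigned int values)
    { return add(std::regex{ pattern }, values); }

  std::size_t size() const noexcept { return fPatterns.size(); }

    private:
  std::vector<std::pair<std::regex, unsigned int>> fPatterns;
};
```

A call such as `registry.add("TriggerType", 1U).add("TriggerWindows", 1U)` then registers two patterns in one statement.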
auto icarus::details::KeyedCSVparser::addPatterns ( std::initializer_list< std::pair< std::regex, unsigned int >>  patterns)

Adds known patterns.

Parameters
patterns: sequence of patterns to be added
Returns
this parser (addPatterns() calls may be chained)

Each pattern is a pair (key regular expression, number of values), as in addPattern().

Definition at line 114 of file KeyedCSVparser.cxx.

{
  for (auto& pattern: patterns) fPatterns.emplace_back(std::move(pattern));
  return *this;
} // icarus::details::KeyedCSVparser::addPatterns()
auto icarus::details::KeyedCSVparser::addPatterns ( std::initializer_list< std::pair< std::string, unsigned int >>  patterns)

Definition at line 124 of file KeyedCSVparser.cxx.

{
  for (auto& pattern: patterns)
    fPatterns.emplace_back(std::regex{ pattern.first }, pattern.second);
  return *this;
} // icarus::details::KeyedCSVparser::addPatterns()
auto icarus::details::KeyedCSVparser::extractToken ( Buffer_t & buffer) const
private noexcept

Extracts the next token from the buffer and returns its value, stripped.

Definition at line 165 of file KeyedCSVparser.cxx.

{
#if 1
  auto const start = cbegin(buffer), bend = cend(buffer);
  std::size_t const length = findTokenLength(buffer);
  moveBufferHead(buffer, length + ((start + length == bend)? 0: 1));
  return strip({ start, length });
#else

  auto const start = cbegin(buffer), bend = cend(buffer);
  auto finish = start;
  while (finish != bend) {
    if (*finish == fSep) break;
    ++finish;
  } // while

  // update the start of the buffer
  std::size_t const tokenLength = std::distance(start, finish);
  moveBufferHead(buffer, tokenLength + ((finish == bend)? 0: 1));

  return strip({ start, tokenLength });
#endif
} // icarus::details::KeyedCSVparser::extractToken()
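
The way successive calls consume the buffer can be illustrated with a standalone sketch. `nextToken()` and `splitAll()` are hypothetical helpers; unlike extractToken(), they do not strip whitespace from the token.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical helper: splits off the next token and advances `buffer`
// past it and past the separator that follows it, if there is one.
std::string_view nextToken(std::string_view& buffer, char sep = ',') noexcept {
  auto const pos = buffer.find(sep);
  std::string_view const token = buffer.substr(0, pos);
  buffer.remove_prefix(pos == std::string_view::npos? buffer.size(): pos + 1);
  return token;
}

// Repeats the extraction until the buffer is exhausted.
std::vector<std::string_view> splitAll(std::string_view buffer, char sep = ',') {
  std::vector<std::string_view> tokens;
  while (!buffer.empty()) tokens.push_back(nextToken(buffer, sep));
  return tokens;
}
```

As in the listing above, the separator is skipped only when one follows the token, so a trailing separator does not produce a final empty token.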
std::size_t icarus::details::KeyedCSVparser::findTokenLength ( Buffer_t const &  buffer) const
private noexcept

Returns the length of the next token, up to the next separator (excluded).

Definition at line 141 of file KeyedCSVparser.cxx.

{

  auto const start = cbegin(buffer), bend = cend(buffer);
  auto finish = start;
  while (finish != bend) {
    if (*finish == fSep) break;
    ++finish;
  } // while

  return std::distance(start, finish);
} // icarus::details::KeyedCSVparser::findTokenLength()
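
For comparison, the same length can be obtained with `std::string_view::find`. This is a sketch of an equivalent formulation, assuming a free function with the hypothetical name `tokenLength()`, and is not the code the class uses:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Length of the leading token: the characters before the first separator,
// or the whole buffer if no separator is present.
std::size_t tokenLength(std::string_view buffer, char sep = ',') noexcept {
  auto const pos = buffer.find(sep);
  return (pos == std::string_view::npos)? buffer.size(): pos;
}
```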
bool icarus::details::KeyedCSVparser::isKey ( SubBuffer_t const &  buffer) const
private noexcept

Is content of buffer a key (as opposed to a value)?

Definition at line 192 of file KeyedCSVparser.cxx.

{

  return !buffer.empty() && std::isalpha(buffer.front());

} // icarus::details::KeyedCSVparser::isKey()
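
Note that `std::isalpha` has undefined behaviour when its argument is neither `EOF` nor representable as `unsigned char`, which can happen with a plain `char` holding a non-ASCII byte. A defensive variant, sketched here under the hypothetical name `looksLikeKey()`, casts the character first:

```cpp
#include <cassert>
#include <cctype>
#include <string_view>

// Default key rule: a non-empty token whose first character is a letter.
// The cast keeps the argument of std::isalpha() within its valid domain.
bool looksLikeKey(std::string_view token) noexcept {
  return !token.empty()
    && std::isalpha(static_cast<unsigned char>(token.front())) != 0;
}
```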
template<typename String >
static Buffer_t icarus::details::KeyedCSVparser::makeBuffer ( String const & s)
static private noexcept

Definition at line 202 of file KeyedCSVparser.cxx.

  { return { data(s), size(s) }; } // C++20: use begin/end constructor
auto icarus::details::KeyedCSVparser::moveBufferHead ( Buffer_t & buffer,
std::size_t size
)
static private noexcept

Definition at line 210 of file KeyedCSVparser.cxx.

{

  size = std::min(size, buffer.size());
  return buffer = { buffer.data() + size, buffer.size() - size };

} // icarus::details::KeyedCSVparser::moveBufferHead()
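
`std::string_view::remove_prefix` offers a similar operation, but it requires the count not to exceed the view size, so the clamping step is still needed. A sketch with a hypothetical `dropHead()` helper:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Drops up to `n` characters from the front of `buffer`, in place.
// The count is clamped because remove_prefix() must not be asked to
// remove more characters than the view holds.
std::string_view& dropHead(std::string_view& buffer, std::size_t n) noexcept {
  buffer.remove_prefix(std::min(n, buffer.size()));
  return buffer;
}
```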
ParsedData_t icarus::details::KeyedCSVparser::operator() ( std::string_view const &  s) const
inline

Definition at line 100 of file KeyedCSVparser.h.

  { return parse(s); }
ParsedData_t icarus::details::KeyedCSVparser::operator() ( std::string const &  s) const
inline

Definition at line 101 of file KeyedCSVparser.h.

  { return parse(s); }
template<typename BIter , typename EIter >
ParsedData_t icarus::details::KeyedCSVparser::operator() ( BIter  b,
EIter  e 
) const
inline

Definition at line 103 of file KeyedCSVparser.h.

  { return parse(b, e); }
auto icarus::details::KeyedCSVparser::parse ( std::string_view const &  s) const
inline

Parses the buffer s and returns a data structure with the content.

Definition at line 236 of file KeyedCSVparser.h.

  { ParsedData_t data; parse(s, data); return data; }
auto icarus::details::KeyedCSVparser::parse ( std::string const &  s) const

Definition at line 135 of file KeyedCSVparser.cxx.

  { return parse(std::string_view{ s.data(), s.size() }); }
void icarus::details::KeyedCSVparser::parse ( std::string_view const & s,
ParsedData_t & data
) const

Parses the buffer s and fills data with it.

Definition at line 22 of file KeyedCSVparser.cxx.

{

  auto stream = s;

  ParsedData_t::Item* currentItem = nullptr;

  // this many tokens will be assigned to the current key:
  int forcedValues = -1; // 0 would force the first entry to be a key

  while (!stream.empty()) {

    auto const token = extractToken(stream);

    std::string tokenStr { cbegin(token), cend(token) };

    bool bKey = false;
    do {

      // if there are values pending, this is not a key, period;
      // if all required values have been assigned, the next token is a key.
      if (forcedValues >= 0) {
        bKey = (forcedValues == 0); // if no more forced values, next is key
        --forcedValues;
        // if we know the token is a key, we still need to check for matching
        // patterns to assign required values
        if (!bKey) break;
      }

      // the token may still be a key (if `bKey` is true, it is for sure: we can
      // decide that a non-key (!bKey) is actually a key, but not the opposite)
      for (auto const& [ pattern, values ]: fPatterns) {
        if (!std::regex_match(begin(token), end(token), pattern)) continue;
        bKey = true; // matching a pattern implies this is a key
        std::string const& key = tokenStr;
        // how many values to expect:
        switch (values) {
          case FixedSize: // read the next token immediately as fixed size
            {
              if (stream.empty()) throw MissingSize(key);

              auto const sizeToken = peekToken(stream);
              if (empty(sizeToken)) throw MissingSize(key);

              // the value is loaded in `forcedValues` and already excludes
              // the size token just read
              char const *b = begin(sizeToken), *e = end(sizeToken);
              if (std::from_chars(b, e, forcedValues).ptr != e)
                throw MissingSize(key, std::string{ sizeToken });

              ++forcedValues; // the size will be forced in the values anyway

            } // FixedSize
            break;
          case DynamicSize:
            // nothing to do, the normal algorithm rules will follow
            break;
          default:
            forcedValues = values;
            break;
        } // switch
        break;
      } // for pattern
      if (bKey) break;

      // let the "standard" pattern decide
      bKey = isKey(token);

    } while (false);

    if (bKey) currentItem = &(data.makeItem(std::move(tokenStr)));
    else {
      if (!currentItem) {
        throw InvalidFormat(
          "values started without a key ('" + tokenStr + "' is not a valid key)."
          );
      }
      currentItem->addValue(std::move(tokenStr));
    }

  } // while

  if (forcedValues > 0) {
    assert(currentItem);
    throw MissingValues(currentItem->key(), forcedValues);
  }

} // icarus::KeyedCSVparser::parse()
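
The FixedSize branch relies on `std::from_chars` consuming the whole size token. A standalone sketch of that check follows, with the hypothetical helper `parseCount()`; unlike the listing above it also inspects the error code and rejects negative counts, which is an assumption about the intended behaviour rather than what the code does.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses `token` as a non-negative integer count.
// Returns std::nullopt if the token is empty, has trailing characters,
// is out of range for `int`, or is negative.
std::optional<int> parseCount(std::string_view token) noexcept {
  int value = 0;
  char const* const b = token.data();
  char const* const e = b + token.size();
  auto const res = std::from_chars(b, e, value);
  if (res.ec != std::errc{} || res.ptr != e || value < 0) return std::nullopt;
  return value;
}
```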
template<typename BIter , typename EIter >
auto icarus::details::KeyedCSVparser::parse ( BIter  b,
EIter  e 
) const -> ParsedData_t

Definition at line 244 of file KeyedCSVparser.h.

  { return parse(std::string_view{ &*b, std::distance(b, e) }); }
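
This overload dereferences `b` and takes the signed result of `std::distance` as the view length, so it presumes a non-empty range over contiguous characters. A more defensive formulation might look like the sketch below; `makeView()` is a hypothetical name and the explicit cast is a design choice of the sketch, not of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: builds a view over the range [b, e).
// Assumes the iterators point into contiguous character storage
// (std::string, std::vector<char>, a plain array, ...).
template <typename BIter, typename EIter>
std::string_view makeView(BIter b, EIter e) {
  if (b == e) return {}; // avoid dereferencing the start of an empty range
  return std::string_view(&*b, static_cast<std::size_t>(e - b));
}
```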
auto icarus::details::KeyedCSVparser::peekToken ( Buffer_t const &  buffer) const
private noexcept

Returns the value of the next token, stripped.

Definition at line 157 of file KeyedCSVparser.cxx.

{
  return strip({ cbegin(buffer), findTokenLength(buffer) });
} // icarus::details::KeyedCSVparser::peekToken()
auto icarus::details::KeyedCSVparser::strip ( SubBuffer_t  s)
static private noexcept

Definition at line 220 of file KeyedCSVparser.cxx.

  { return stripRight(stripLeft(stripRightChars<'\n', '\r', '\0'>(s))); }
auto icarus::details::KeyedCSVparser::stripLeft ( SubBuffer_t  s)
static private noexcept

Definition at line 226 of file KeyedCSVparser.cxx.

{

  while (!s.empty()) {
    if (!std::isspace(s.front())) break;
    s.remove_prefix(1);
  }
  return s;

} // icarus::details::KeyedCSVparser::stripLeft()
auto icarus::details::KeyedCSVparser::stripRight ( SubBuffer_t  s)
static private noexcept

Definition at line 240 of file KeyedCSVparser.cxx.

{

  while (!s.empty()) {
    if (!std::isspace(s.back())) break;
    s.remove_suffix(1);
  }
  return s;

} // icarus::details::KeyedCSVparser::stripRight()
auto icarus::details::KeyedCSVparser::stripRightChar ( SubBuffer_t  s,
char  c 
)
static private noexcept

Definition at line 255 of file KeyedCSVparser.cxx.

{

  while (!s.empty()) {
    if (s.back() != c) break;
    s.remove_suffix(1);
  }
  return s;

} // icarus::details::KeyedCSVparser::stripRightChar()
template<char... Chars>
static SubBuffer_t icarus::details::KeyedCSVparser::stripRightChars ( SubBuffer_t s)
static private noexcept

Definition at line 270 of file KeyedCSVparser.cxx.

{
  while (true) {
    auto ns = s;
    for (char c: { Chars... }) ns = stripRightChar(ns, c);
    if (ns == s) return ns;
    s = ns;
  } // while(true)

} // icarus::details::KeyedCSVparser::stripRightChars()
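
The same effect can be expressed with a C++17 fold expression instead of the nested loops. A sketch under the hypothetical name `stripRightAnyOf()`:

```cpp
#include <cassert>
#include <string_view>

// Removes from the end of `s` every trailing character that equals
// any of the template arguments, in any order.
template <char... Chars>
std::string_view stripRightAnyOf(std::string_view s) noexcept {
  // the fold expands to (s.back() == C1) || (s.back() == C2) || ...
  while (!s.empty() && ((s.back() == Chars) || ...)) s.remove_suffix(1);
  return s;
}
```

For example, `stripRightAnyOf<'\n', '\r', '\0'>` removes any mix of trailing line terminators and null characters in a single pass.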

Member Data Documentation

constexpr unsigned int icarus::details::KeyedCSVparser::DynamicSize = FixedSize - 1U
static

Mnemonic size value used in addPattern() calls.

Definition at line 87 of file KeyedCSVparser.h.

constexpr unsigned int icarus::details::KeyedCSVparser::FixedSize = std::numeric_limits<unsigned int>::max()
static

Expected values are missing.

Mnemonic size value used in addPattern() calls.

Definition at line 85 of file KeyedCSVparser.h.

std::vector<std::pair<std::regex, unsigned int> > icarus::details::KeyedCSVparser::fPatterns
private

List of known patterns for matching keys, and how many values they hold.

Definition at line 170 of file KeyedCSVparser.h.

char const icarus::details::KeyedCSVparser::fSep = ','
private

Character used as token separator.

Definition at line 167 of file KeyedCSVparser.h.


The documentation for this class was generated from the following files: