34 Commits

Author SHA1 Message Date
Geoff Leyland
22d84c44ee tidy BOM handling and add a test 2014-07-21 11:15:00 +12:00
Geoff Leyland
54a7bb2221 Whoops! Trying to find a Unicode BOM at the start of a file with an anchored pattern when file_buffer:find doesn't understand anchoring led to reading the whole file and running out of memory right at the start. Use :sub to check the first few characters of the file 2014-06-19 17:52:44 +12:00
Geoff Leyland
1edff5f4ef Document after that last commit making it more useful 2014-06-19 16:24:04 +12:00
Geoff Leyland
3b28558461 if use() is passed a file, it uses that. If it's passed nil as the buffer, it uses stdin 2014-06-19 16:13:38 +12:00
Geoff Leyland
8db3740230 handle blank lines better: check whether they're blank before the transform runs. Also allow blank lines before the header 2014-06-17 15:22:58 +12:00
Geoff Leyland
8eb59774ee whoops, there was a doubled-up unprotected call to column_map.transform in there 2014-06-17 15:10:37 +12:00
Geoff Leyland
0ebafce1f7 normalise more title characters to spaces in column_map:new - in fact the same ones as in column_map:read_header 2014-06-17 14:17:15 +12:00
Geoff Leyland
ec99b22fae skip a UTF-8 or UTF-16 BOM if you find one. There are other BOMs but this should do for now 2014-06-09 17:58:38 +12:00
Geoff Leyland
70c5dd6f9b csv.lua: move error handling up, again, trying to make the parser a bit shorter. Not really working, but giving it a go all the same 2014-05-27 09:48:01 +12:00
Geoff Leyland
fd9d21cb9c csv.lua: make the column map managing stuff its own class to try to make separated_values_iterator cleaner. hmmm 2014-05-27 09:28:54 +12:00
Geoff Leyland
c4f21c0264 csv.lua: move initialisation code out of separated_values_iterator to try to make it a bit easier to see what's going on 2014-05-27 08:50:51 +12:00
Geoff Leyland
8493881362 update license and make the default buffer block size a megabyte rather than a measly 4k 2014-05-26 21:38:32 +12:00
Geoff Leyland
be4420ae62 csv.lua: now that a buffer looks like a string, reading strings is easier 2014-05-26 21:21:14 +12:00
Geoff Leyland
3d3bbfb6c1 csv.lua: separate the buffering stuff out into a file_buffer class (that looks strangely like a string...) 2014-05-26 21:03:54 +12:00
Geoff Leyland
d9d9e419c7 csv.lua: track the buffer start with a variable called (surprisingly) buffer_start, and now field_start and line_start count from the start of the whole file, not the current state of the buffer 2014-05-26 18:17:54 +12:00
Geoff Leyland
e2ea3d2f1a csv.lua: rename anchor_pos to field_start 2014-05-26 17:51:54 +12:00
Geoff Leyland
0daa2ae5d7 csv.lua: move find's offsetting into field_find 2014-05-26 17:49:23 +12:00
Geoff Leyland
6892667042 csv.lua: move the offset by anchor_pos out of sub and into field_sub 2014-05-26 17:46:17 +12:00
Geoff Leyland
46e65775bf whoops. Fix up find's return values when we've hit the end of file. How did that work? 2014-05-26 17:38:58 +12:00
Geoff Leyland
e18409d73f csv.lua: rename find's second argument 'init' to match the Lua Reference Manual 2014-05-26 17:27:50 +12:00
Geoff Leyland
966ba6722f csv.lua: tidy up truncating the buffer (not really the right word, since we're cutting off the beginning) and advancing anchor_pos (also a bad name) 2014-05-26 17:26:24 +12:00
Geoff Leyland
88e30b6720 Test at all buffer sizes from 1 to 16. Fix the resulting bug when an embedded quote straddles two buffer blocks 2014-05-26 17:20:17 +12:00
Geoff Leyland
13e3b69c74 csv.lua: this all doesn't work with small buffer sizes. Whoops. Made tests at small buffer sizes and fixed the problem 2014-05-26 17:06:28 +12:00
Geoff Leyland
c40e12bb7c test.lua: add an exclamation mark to the end of newlines so that they're explicit 2014-05-26 16:44:57 +12:00
Geoff Leyland
deac119c13 csv.lua: move some variable definitions closer to their use (the aim here is to move find and sub out into a buffer object) 2014-05-26 16:21:45 +12:00
Geoff Leyland
e0123ee133 csv.lua: fix typos 2014-05-26 16:17:38 +12:00
Kevin Martin
f28dfe0720 Added an openstring function, by wrapping the string in an object
that supports read(bytes) and passing the object to the underlying
code. Modified all the tests to run with both open and openstring.

Changed the default buffer size in csv.lua to match the value in the
README, and added the crucial blank line to test-data/blankline.csv
2014-05-18 18:52:16 +01:00
Geoff Leyland
c9988c8b93 handle blank lines more correctly 2014-01-29 13:28:47 +13:00
Geoff Leyland
ac033b3075 ignore blank lines (I'm not sure this is a good thing *in* the file, but ignoring blank lines at the end is a good idea) 2013-12-06 12:56:02 +13:00
Geoff Leyland
279ca0717d added tests for embedded quotes and reading files with headers 2013-12-05 21:37:25 +13:00
Geoff Leyland
50e14b7484 Added one test, for embedded newlines, and fixed all the bugs it found 2013-12-05 21:20:10 +13:00
Geoff Leyland
7294a1bc72 fix up accidental global variables. Thanks to Ashwyn Hirschi 2013-12-05 15:01:41 +13:00
Geoff Leyland
daa18891f3 get the read size right in extend. Thanks to xxopxe@gmail.com 2013-12-05 14:46:38 +13:00
Geoff Leyland
da57f60673 first commit of lua-csv 2013-12-04 22:16:11 +13:00