diff --git a/index.html b/index.html
index e01193a..0238855 100644
--- a/index.html
+++ b/index.html
@@ -39,7 +39,7 @@
-Htmlparser depends on Lua 5.2, and on the ["set"][1] LuaRock, which is installed along automatically. To be able to run the tests, lunitx also comes along as a LuaRock
+Htmlparser depends on Lua 5.2, and on the "set" LuaRock, which is installed along automatically. To be able to run the tests, lunitx also comes along as a LuaRock
local elements = root(selectorstring)
-This wil return a [Set][1] of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:
+This wil return a list of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:
-for e in pairs(elements) do
+for _,e in ipairs(elements) do
print(e.name)
local subs = e(subselectorstring)
- for sub in pairs(subs) do
+ for _,sub in ipairs(subs) do
print("", sub.name)
end
end
@@ -82,7 +82,7 @@ Now, find specific contained elements by selecting:
Selectors
-Supported selectors are a subset of [jQuery's selectors][2]:
+Supported selectors are a subset of jQuery's selectors:
-
@@ -146,6 +146,9 @@ Now, find specific contained elements by selecting:
-
+
.index
sequence number of elements in order of appearance; root index is 0
+
+-
:gettext()
the complete element text, starting with "<tagname"
and ending with "/>"
or "</tagname>"
-
@@ -155,9 +158,9 @@ Now, find specific contained elements by selecting:
.root
the root element of the tree; root.root
is root
-
-
.deepernodes
a [Set][1] containing all elements in the tree beneath this element, including this element's .nodes
; {}
if none
+.deepernodes
a Set containing all elements in the tree beneath this element, including this element's .nodes
; {}
if none
-
-
.deeperelements
a table with a key for each distinct tagname in .deepernodes
, containing a [Set][1] of all deeper element nodes with that name; {}
in none
+.deeperelements
a table with a key for each distinct tagname in .deepernodes
, containing a Set of all deeper element nodes with that name; {}
in none
-
.deeperattributes
as .deeperelements
, but keyed on attribute name
-
diff --git a/params.json b/params.json
index d5e3577..94ae28e 100644
--- a/params.json
+++ b/params.json
@@ -1 +1 @@
-{"name":"LuaRock \"htmlparser\"","tagline":"Parse HTML text into a tree of elements with selectors","body":"##Install\r\nHtmlparser is a listed [LuaRock](http://luarocks.org/repositories/rocks/). Install using [LuaRocks](http://www.luarocks.org/): `luarocks install htmlparser`\r\n\r\n###Dependencies\r\nHtmlparser depends on [Lua 5.2](http://www.lua.org/download.html), and on the [\"set\"][1] LuaRock, which is installed along automatically. To be able to run the tests, [lunitx](https://github.com/dcurrie/lunit) also comes along as a LuaRock\r\n\r\n##Usage\r\nStart off with\r\n```lua\r\nrequire(\"luarocks.loader\")\r\nlocal htmlparser = require(\"htmlparser\")\r\n```\r\nThen, parse some html:\r\n```lua\r\nlocal root = htmlparser.parse(htmlstring)\r\n```\r\nThe input to parse may be the contents of a complete html document, or any valid html snippet, as long as all tags are correctly opened and closed.\r\nNow, find specific contained elements by selecting:\r\n```lua\r\nlocal elements = root:select(selectorstring)\r\n```\r\nOr in shorthand:\r\n```lua\r\nlocal elements = root(selectorstring)\r\n```\r\nThis wil return a [Set][1] of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:\r\n```lua\r\nfor e in pairs(elements) do\r\n\tprint(e.name)\r\n\tlocal subs = e(subselectorstring)\r\n\tfor sub in pairs(subs) do\r\n\t\tprint(\"\", sub.name)\r\n\tend\r\nend\r\n```\r\nThe root element is a container for the top level elements in the parsed text, i.e. 
the `<html>` element in a parsed html document would be a child of the returned root element.\r\n\r\n##Selectors\r\nSupported selectors are a subset of [jQuery's selectors][2]:\r\n\r\n- `\"*\"` all contained elements\r\n- `\"element\"` elements with the given tagname\r\n- `\"#id\"` elements with the given id attribute value\r\n- `\".class\"` elements with the given classname in the class attribute\r\n- `\"[attribute]\"` elements with an attribute of the given name\r\n- `\"[attribute='value']\"` equals: elements with the given value for the given attribute\r\n- `\"[attribute!='value']\"` not equals: elements without the given attribute, or having the attribute, but with a different value\r\n- `\"[attribute|='value']\"` prefix: attribute's value is given value, or starts with given value, followed by a hyphen (`-`)\r\n- `\"[attribute*='value']\"` contains: attribute's value contains given value\r\n- `\"[attribute~='value']\"` word: attribute's value is a space-separated token, where one of the tokens is the given value\r\n- `\"[attribute^='value']\"` starts with: attribute's value starts with given value\r\n- `\"[attribute$='value']\"` ends with: attribute's value ends with given value\r\n- `\":not(selectorstring)\"` elements not selected by given selector string\r\n- `\"ancestor descendant\"` elements selected by the `descendant` selector string, that are a descendant of any element selected by the `ancestor` selector string\r\n- `\"parent > child\"` elements selected by the `child` selector string, that are a child element of any element selected by the `parent` selector string\r\n\r\nSelectors can be combined; e.g. 
`\".class:not([attribute]) element.class\"`\r\n\r\n##Element type\r\nAll tree elements provide, apart from `:select` and `()`, the following accessors:\r\n\r\n###Basic\r\n- `.name` the element's tagname\r\n- `.attributes` a table with keys and values for the element's attributes; `{}` if none\r\n- `.id` the value of the element's id attribute; `nil` if not present\r\n- `.classes` an array with the classes listed in element's class attribute; `{}` if none\r\n- `:getcontent()` the raw text between the opening and closing tags of the element; `\"\"` if none\r\n- `.nodes` an array with the element's child elements, `{}` if none\r\n- `.parent` the elements that contains this element; `root.parent` is `nil`\r\n\r\n###Other\r\n- `:gettext()` the complete element text, starting with `\"<tagname\"` and ending with `\"/>\"` or `\"</tagname>\"`\r\n- `.level` how deep the element is in the tree; root level is `0`\r\n- `.root` the root element of the tree; `root.root` is `root`\r\n- `.deepernodes` a [Set][1] containing all elements in the tree beneath this element, including this element's `.nodes`; `{}` if none\r\n- `.deeperelements` a table with a key for each distinct tagname in `.deepernodes`, containing a [Set][1] of all deeper element nodes with that name; `{}` in none\r\n- `.deeperattributes` as `.deeperelements`, but keyed on attribute name\r\n- `.deeperids` as `.deeperelements`, but keyed on id value\r\n- `.deeperclasses` as `.deeperelements`, but keyed on class name\r\n\r\n##Limitations\r\n- Attribute values in selector strings cannot contain any spaces, nor any of `#`, `.`, `[`, `]`, `:`, `(`, or `)`\r\n- The spaces before and after the `>` in a `parent > child` relation are mandatory \r\n- `line1
line2\")`, `root.nodes[1]:getcontent()` is `\"line1
line2\"`, while `root.nodes[1].nodes[1].name` is `\"br\"`\r\n- No start or end tags are implied when [omitted](http://www.w3.org/TR/html5/syntax.html#optional-tags). Only the [void elements](http://www.w3.org/TR/html5/syntax.html#void-elements) should not have an end tag\r\n- No validation is done for tag or attribute names or nesting of element types. The list of void elements is in fact the only part specific to HTML\r\n\r\n##Examples\r\nSee `./doc/sample.lua`\r\n\r\n##Tests\r\nSee `./tst/init.lua`\r\n\r\n##License\r\nLGPL+; see `./doc/LICENSE`\r\n","google":"","note":"Don't delete this file! It's used internally to help with page regeneration."}
\ No newline at end of file
+{"name":"LuaRock \"htmlparser\"","tagline":"Parse HTML text into a tree of elements with selectors","body":"[1]: http://wscherphof.github.com/lua-set/\r\n[2]: http://api.jquery.com/category/selectors/\r\n\r\n##Install\r\nHtmlparser is a listed [LuaRock](http://luarocks.org/repositories/rocks/). Install using [LuaRocks](http://www.luarocks.org/): `luarocks install htmlparser`\r\n\r\n###Dependencies\r\nHtmlparser depends on [Lua 5.2](http://www.lua.org/download.html), and on the [\"set\"][1] LuaRock, which is installed along automatically. To be able to run the tests, [lunitx](https://github.com/dcurrie/lunit) also comes along as a LuaRock\r\n\r\n##Usage\r\nStart off with\r\n```lua\r\nrequire(\"luarocks.loader\")\r\nlocal htmlparser = require(\"htmlparser\")\r\n```\r\nThen, parse some html:\r\n```lua\r\nlocal root = htmlparser.parse(htmlstring)\r\n```\r\nThe input to parse may be the contents of a complete html document, or any valid html snippet, as long as all tags are correctly opened and closed.\r\nNow, find specific contained elements by selecting:\r\n```lua\r\nlocal elements = root:select(selectorstring)\r\n```\r\nOr in shorthand:\r\n```lua\r\nlocal elements = root(selectorstring)\r\n```\r\nThis wil return a list of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:\r\n```lua\r\nfor _,e in ipairs(elements) do\r\n\tprint(e.name)\r\n\tlocal subs = e(subselectorstring)\r\n\tfor _,sub in ipairs(subs) do\r\n\t\tprint(\"\", sub.name)\r\n\tend\r\nend\r\n```\r\nThe root element is a container for the top level elements in the parsed text, i.e. 
the `<html>` element in a parsed html document would be a child of the returned root element.\r\n\r\n##Selectors\r\nSupported selectors are a subset of [jQuery's selectors][2]:\r\n\r\n- `\"*\"` all contained elements\r\n- `\"element\"` elements with the given tagname\r\n- `\"#id\"` elements with the given id attribute value\r\n- `\".class\"` elements with the given classname in the class attribute\r\n- `\"[attribute]\"` elements with an attribute of the given name\r\n- `\"[attribute='value']\"` equals: elements with the given value for the given attribute\r\n- `\"[attribute!='value']\"` not equals: elements without the given attribute, or having the attribute, but with a different value\r\n- `\"[attribute|='value']\"` prefix: attribute's value is given value, or starts with given value, followed by a hyphen (`-`)\r\n- `\"[attribute*='value']\"` contains: attribute's value contains given value\r\n- `\"[attribute~='value']\"` word: attribute's value is a space-separated token, where one of the tokens is the given value\r\n- `\"[attribute^='value']\"` starts with: attribute's value starts with given value\r\n- `\"[attribute$='value']\"` ends with: attribute's value ends with given value\r\n- `\":not(selectorstring)\"` elements not selected by given selector string\r\n- `\"ancestor descendant\"` elements selected by the `descendant` selector string, that are a descendant of any element selected by the `ancestor` selector string\r\n- `\"parent > child\"` elements selected by the `child` selector string, that are a child element of any element selected by the `parent` selector string\r\n\r\nSelectors can be combined; e.g. 
`\".class:not([attribute]) element.class\"`\r\n\r\n##Element type\r\nAll tree elements provide, apart from `:select` and `()`, the following accessors:\r\n\r\n###Basic\r\n- `.name` the element's tagname\r\n- `.attributes` a table with keys and values for the element's attributes; `{}` if none\r\n- `.id` the value of the element's id attribute; `nil` if not present\r\n- `.classes` an array with the classes listed in element's class attribute; `{}` if none\r\n- `:getcontent()` the raw text between the opening and closing tags of the element; `\"\"` if none\r\n- `.nodes` an array with the element's child elements, `{}` if none\r\n- `.parent` the elements that contains this element; `root.parent` is `nil`\r\n\r\n###Other\r\n- `.index` sequence number of elements in order of appearance; root index is `0`\r\n- `:gettext()` the complete element text, starting with `\"<tagname\"` and ending with `\"/>\"` or `\"</tagname>\"`\r\n- `.level` how deep the element is in the tree; root level is `0`\r\n- `.root` the root element of the tree; `root.root` is `root`\r\n- `.deepernodes` a [Set][1] containing all elements in the tree beneath this element, including this element's `.nodes`; `{}` if none\r\n- `.deeperelements` a table with a key for each distinct tagname in `.deepernodes`, containing a [Set][1] of all deeper element nodes with that name; `{}` in none\r\n- `.deeperattributes` as `.deeperelements`, but keyed on attribute name\r\n- `.deeperids` as `.deeperelements`, but keyed on id value\r\n- `.deeperclasses` as `.deeperelements`, but keyed on class name\r\n\r\n##Limitations\r\n- Attribute values in selector strings cannot contain any spaces, nor any of `#`, `.`, `[`, `]`, `:`, `(`, or `)`\r\n- The spaces before and after the `>` in a `parent > child` relation are mandatory \r\n- `line1
line2\")`, `root.nodes[1]:getcontent()` is `\"line1
line2\"`, while `root.nodes[1].nodes[1].name` is `\"br\"`\r\n- No start or end tags are implied when [omitted](http://www.w3.org/TR/html5/syntax.html#optional-tags). Only the [void elements](http://www.w3.org/TR/html5/syntax.html#void-elements) should not have an end tag\r\n- No validation is done for tag or attribute names or nesting of element types. The list of void elements is in fact the only part specific to HTML\r\n\r\n##Examples\r\nSee `./doc/sample.lua`\r\n\r\n##Tests\r\nSee `./tst/init.lua`\r\n\r\n##License\r\nLGPL+; see `./doc/LICENSE`\r\n","google":"","note":"Don't delete this file! It's used internally to help with page regeneration."}
\ No newline at end of file
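
Note on the loop change in this diff: the switch from `for e in pairs(elements)` to `for _,e in ipairs(elements)` follows from the selection result being documented as a list instead of a Set. The sketch below illustrates the difference with plain Lua tables only. It does not call htmlparser, the element tables and helper names are hypothetical stand-ins, and it assumes a Set keeps its members as table keys, which the old `pairs` loop suggests but this diff does not confirm.

```lua
-- Set-style table: members are the KEYS, so iteration order is unspecified.
local function names_from_set(set)
  local names = {}
  for e in pairs(set) do
    names[#names + 1] = e.name
  end
  table.sort(names) -- sorted only to make the result deterministic
  return names
end

-- List-style table: members are the VALUES at indices 1..n, visited in order.
local function names_from_list(list)
  local names = {}
  for _, e in ipairs(list) do
    names[#names + 1] = e.name
  end
  return names
end

-- Hypothetical stand-ins for parsed elements (not real htmlparser objects).
local p, a, div = { name = "p" }, { name = "a" }, { name = "div" }

local as_set  = { [p] = true, [a] = true, [div] = true }
local as_list = { p, a, div }

print(table.concat(names_from_set(as_set), ","))   -- a,div,p (after sorting)
print(table.concat(names_from_list(as_list), ",")) -- p,a,div (insertion order)

-- ipairs over a key-based set visits nothing, because index 1 is nil.
print(#names_from_list(as_set))                    -- 0
```

Under that assumption, callers that still iterate the result with the old one-variable `pairs` loop would receive numeric indices instead of elements after this change, so existing loops need the same update as the documented example.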