From 8483c2ca02e59757208316d1da392441824b128f Mon Sep 17 00:00:00 2001
From: Wouter Scherphof
Date: Thu, 28 Mar 2013 13:22:57 +0100
Subject: [PATCH] added some links and descriptions
---
README.md | 63 +++++++++++++++++++++++++++++++------------------------
1 file changed, 36 insertions(+), 27 deletions(-)
diff --git a/README.md b/README.md
index 682cc3f..52ceca6 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,9 @@
Parse HTML text into a tree of elements with selectors
+[1]: http://wscherphof.github.com/lua-set/
+[2]: http://api.jquery.com/category/selectors/
+
##License
MIT; see `./doc/LICENSE`
@@ -16,7 +19,7 @@ Then, parse some html:
local root = htmlparser.parse(htmlstring)
```
The input to parse may be the contents of a complete html document, or any valid html snippet, as long as all tags are correctly opened and closed.
-Now, find specific elements by selecting:
+Now, find specific contained elements by selecting:
```lua
local elements = root:select(selectorstring)
```
@@ -24,7 +27,7 @@ Or in shorthand:
```lua
local elements = root(selectorstring)
```
-This wil return a Set of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:
+This will return a [Set][1] of elements, all of which are of the same type as the root element, and thus support selecting as well, if ever needed:
```lua
for e in pairs(elements) do
print(e.name)
@@ -34,19 +37,23 @@ for e in pairs(elements) do
end
end
```
+The root element is a container for the top-level elements in the parsed text, i.e. the `<html>` element in a parsed html document would be a child of the returned root element.
##Selectors
-- `"element"`
-- `"#id"`
-- `".class"`
-- `"[attribute]"`
-- `"[attribute=value]"`
-- `"[attribute!=value]"`
-- `"[attribute|=value]"`
-- `"[attribute*=value]"`
-- `"[attribute~=value]"`
-- `"[attribute^=value]"`
-- `"[attribute$=value]"`
+Supported selectors are a subset of [jQuery's selectors][2]:
+
+- `"*"` all contained elements
+- `"element"` elements with the given tagname
+- `"#id"` elements with the given id attribute value
+- `".class"` elements with the given classname in the class attribute
+- `"[attribute]"` elements with an attribute of the given name
+- `"[attribute='value']"` equals: elements with the given value for the attribute with the given name
+- `"[attribute!='value']"` not equals: elements without an attribute of the given name, or with that attribute, but with a value that is different from the given value
+- `"[attribute|='value']"` prefix: attribute's value is given value, or starts with given value, followed by a hyphen (`-`)
+- `"[attribute*='value']"` contains: attribute's value contains given value
+- `"[attribute~='value']"` word: attribute's value is a space-separated list of tokens, one of which is the given value
+- `"[attribute^='value']"` starts with: attribute's value starts with given value
+- `"[attribute$='value']"` ends with: attribute's value ends with given value
- `":not(selector)"`
- `"ancestor descendant"`
- `"parent > child"`
@@ -56,6 +63,8 @@ Selectors can be combined; e.g. `".class:not([attribute]) element.class"`
###Limitations
- Attribute values in selectors currently cannot contain any spaces, since space is interpreted as a delimiter between the `ancestor` and `descendant`, `parent` and `>`, or `>` and `child` parts of the selector
- Likewise, for the `parent > child` relation, the spaces before and after the `>` are mandatory
+- `line1
line2
` is plainly `"line1
line2"`
##Examples
See `./doc/samples.lua`
@@ -64,20 +73,20 @@ See `./doc/samples.lua`
All tree elements provide, apart from `:select` and `()`, the following accessors:
###Basic
-- `.name` = the element's tagname
-- `.attributes` = a table with keys and values for the element's attributes; `{}` if none
-- `.id` = the value of the element's id attribute; `nil` if not present
-- `.classes` = an array with the classes listed in element's class attribute; `{}` if none
-- `:getcontent()` = the raw text between the opening and closing tags of the element; `""` if none
-- `.nodes` = an array with the element's child elements, `{}` if none
-- `.parent` = the elements that contains this element; `root.parent` is `nil`
+- `.name` the element's tagname
+- `.attributes` a table with keys and values for the element's attributes; `{}` if none
+- `.id` the value of the element's id attribute; `nil` if not present
+- `.classes` an array with the classes listed in the element's class attribute; `{}` if none
+- `:getcontent()` the raw text between the opening and closing tags of the element; `""` if none
+- `.nodes` an array with the element's child elements; `{}` if none
+- `.parent` the element that contains this element; `root.parent` is `nil`
###Other
-- `:gettext()` = the raw text of the complete element, starting with `""`
-- `.level` = how deep the element is in the tree; root level is `0`
+- `:gettext()` the raw text of the complete element, starting with `""`
+- `.level` how deep the element is in the tree; root level is `0`
- `.root` the root element of the tree; `root.root` is `root`
-- `.deepernodes` = a Set containing all elements in the tree beneath this element, including this element's `.nodes`; `{}` if none
-- `.deeperelements` = a table with a key for each distinct tagname in `.deepernodes`, containing a Set of all deeper element nodes with that name; `{}` in none
-- `.deeperattributes` = as `.deeperelements`, but keyed on attribute name
-- `.deeperids` = as `.deeperelements`, but keyed on id value
-- `.deeperclasses` = as `.deeperelements`, but keyed on class name
+- `.deepernodes` a [Set][1] containing all elements in the tree beneath this element, including this element's `.nodes`; `{}` if none
+- `.deeperelements` a table with a key for each distinct tagname in `.deepernodes`, containing a [Set][1] of all deeper element nodes with that name; `{}` if none
+- `.deeperattributes` as `.deeperelements`, but keyed on attribute name
+- `.deeperids` as `.deeperelements`, but keyed on id value
+- `.deeperclasses` as `.deeperelements`, but keyed on class name
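
Note on the attribute operators described in the Selectors hunk: their wording can be read as the plain Lua sketch below. `attribute_matches` is a hypothetical helper written only for illustration, not part of the library and not necessarily how the parser evaluates selectors; `actual` is assumed to be the element's attribute value (`nil` when the attribute is absent), and `expected` the value from the selector.

```lua
-- Hypothetical helper (an assumption, not the library's API): checks one
-- attribute value against a selector operator, following the README wording.
local function attribute_matches(actual, op, expected)
  if op == "!=" then
    -- not equals: attribute absent, or present with a different value
    return actual == nil or actual ~= expected
  end
  -- every other operator needs the attribute to be present
  if actual == nil then return false end
  if op == "=" then
    return actual == expected
  elseif op == "|=" then
    -- prefix: the exact value, or the value followed by a hyphen
    return actual == expected
      or actual:sub(1, #expected + 1) == expected .. "-"
  elseif op == "*=" then
    -- contains: plain-text search (4th argument turns patterns off)
    return actual:find(expected, 1, true) ~= nil
  elseif op == "~=" then
    -- word: one of the whitespace-separated tokens equals the value
    for token in actual:gmatch("%S+") do
      if token == expected then return true end
    end
    return false
  elseif op == "^=" then
    -- starts with
    return actual:sub(1, #expected) == expected
  elseif op == "$=" then
    -- ends with: compare the tail of the same length as expected
    return actual:sub(#actual - #expected + 1) == expected
  end
  error("unsupported operator: " .. tostring(op))
end

print(attribute_matches("en-US", "|=", "en"))        --> true
print(attribute_matches("btn primary", "~=", "btn")) --> true
print(attribute_matches(nil, "!=", "x"))             --> true
```

The `!=` branch is handled first because it is the only operator that matches elements lacking the attribute altogether.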