Mirror of https://github.com/msva/lua-htmlparser.git (synced 2024-11-27 12:44:22 +00:00)
fix format README.md (#46)
parent 9dc0715503
commit b6b01c29e7
24
README.md
@@ -1,17 +1,17 @@
-#LuaRock "htmlparser"
+# LuaRock "htmlparser"
 
 Parse HTML text into a tree of elements with selectors
 
 [1]: http://wscherphof.github.com/lua-set/
 [2]: http://api.jquery.com/category/selectors/
 
-##Install
+## Install
 Htmlparser is a listed [LuaRock](http://luarocks.org/repositories/rocks/). Install using [LuaRocks](http://www.luarocks.org/): `luarocks install htmlparser`
 
-###Dependencies
+### Dependencies
 Htmlparser depends on [Lua 5.2](http://www.lua.org/download.html) (while work with LuaJIT, which provides 5.1-compatible ABI), and on the ["lua-set"][1] package, which is installed along automatically. To be able to run the tests, [lunitx](https://github.com/dcurrie/lunit) also comes along as a LuaRock
 
-##Usage
+## Usage
 Start off with
 ```lua
 pcall(require, "luarocks.loader")
@@ -42,7 +42,7 @@ end
 ```
 The root element is a container for the top level elements in the parsed text, i.e. the `<html>` element in a parsed html document would be a child of the returned root element.
 
-##Selectors
+## Selectors
 Supported selectors are a subset of [jQuery's selectors][2]:
 
 - `"*"` all contained elements
@@ -63,10 +63,10 @@ Supported selectors are a subset of [jQuery's selectors][2]:
 
 Selectors can be combined; e.g. `".class:not([attribute]) element.class"`
 
-##Element type
+## Element type
 All tree elements provide, apart from `:select` and `()`, the following accessors:
 
-###Basic
+### Basic
 - `.name` the element's tagname
 - `.attributes` a table with keys and values for the element's attributes; `{}` if none
 - `.id` the value of the element's id attribute; `nil` if not present
@@ -75,7 +75,7 @@ All tree elements provide, apart from `:select` and `()`, the following accessor
 - `.nodes` an array with the element's child elements, `{}` if none
 - `.parent` the element that contains this element; `root.parent` is `nil`
 
-###Other
+### Other
 - `.index` sequence number of elements in order of appearance; root index is `0`
 - `:gettext()` the complete element text, starting with `"<tagname"` and ending with `"/>"` or `"</tagname>"`
 - `.level` how deep the element is in the tree; root level is `0`
@@ -86,7 +86,7 @@ All tree elements provide, apart from `:select` and `()`, the following accessor
 - `.deeperids` as `.deeperelements`, but keyed on id value
 - `.deeperclasses` as `.deeperelements`, but keyed on class name
 
-##Limitations
+## Limitations
 - Attribute values in selector strings cannot contain any spaces
 - The spaces before and after the `>` in a `parent > child` relation are mandatory
 - `<!` elements (including doctype, comments, and CDATA) are not parsed; markup within CDATA is *not* escaped
@@ -94,11 +94,11 @@ All tree elements provide, apart from `:select` and `()`, the following accessor
 - No start or end tags are implied when [omitted](http://www.w3.org/TR/html5/syntax.html#optional-tags). Only the [void elements](http://www.w3.org/TR/html5/syntax.html#void-elements) should not have an end tag
 - No validation is done for tag or attribute names or nesting of element types. The list of void elements is in fact the only part specific to HTML
 
-##Examples
+## Examples
 See `./doc/sample.lua`
 
-##Tests
+## Tests
 See `./tst/init.lua`
 
-##License
+## License
 LGPL+; see `./doc/LICENSE`