6 Commits

Author SHA1 Message Date
amery ff03ee922d lexer: introduce Error{}
Signed-off-by: Alejandro Mery <amery@jpi.io>
2023-08-29 02:41:48 +00:00
amery 868786cb9f lexer: introduce a Position (Line, Column) handler
Signed-off-by: Alejandro Mery <amery@jpi.io>
2023-08-29 02:10:46 +00:00
amery 3e964d1455 lexer: introduce StateFn and the basic state machine loop
Signed-off-by: Alejandro Mery <amery@jpi.io>
2023-08-29 02:10:03 +00:00
amery 530eff87e9 lexer: introduce Reader.Accept()/AcceptAll()
Signed-off-by: Alejandro Mery <amery@jpi.io>
2023-08-29 02:10:03 +00:00
amery c3339a2cdb build-sys: import build system from darvaza.org/core
Signed-off-by: Alejandro Mery <amery@jpi.io>
2023-08-29 02:03:53 +00:00
amery 5e3171d891 Merge branch 'pr-amery-reader' 2023-08-29 02:03:36 +00:00
5 changed files with 16 additions and 161 deletions
-77
@@ -1,78 +1 @@
# asciigoat's core library
[![Go Reference][godoc-badge]][godoc]
[![Go Report Card][goreport-badge]][goreport]
This package contains the basics for writing simple parsers of
text languages. It is heavily inspired by
[Rob Pike](https://en.wikipedia.org/wiki/Rob_Pike)'s 2011 talk on
[Lexical Scanning in Go](https://go.dev/talks/2011/lex.slide#1), which
you can [watch online](https://www.youtube.com/watch?v=HxaD_trXwRE) to get
a better understanding of the ideas behind **asciigoat**.
**asciigoat** is [MIT](https://opensource.org/license/mit/) licensed.
[godoc]: https://pkg.go.dev/asciigoat.org/core
[godoc-badge]: https://pkg.go.dev/badge/asciigoat.org/core.svg
[goreport]: https://goreportcard.com/report/asciigoat.org/core
[goreport-badge]: https://goreportcard.com/badge/asciigoat.org/core
[godoc-lexer-reader]: https://pkg.go.dev/asciigoat.org/core/lexer#Reader
[godoc-readcloser]: https://pkg.go.dev/asciigoat.org/core#ReadCloser
## Lexer
### lexer.Reader
The lexer package provides [`lexer.Reader`][godoc-lexer-reader] which is
actually an [`io.RuneScanner`](https://pkg.go.dev/io#RuneScanner)
that buffers accepted runes until you are ready to
[emit](https://pkg.go.dev/asciigoat.org/core/lexer#Reader.Emit) or
[discard](https://pkg.go.dev/asciigoat.org/core/lexer#Reader.Discard).
### lexer.Position
[`lexer.Position`](https://pkg.go.dev/asciigoat.org/core/lexer#Position)
is a `(Line, Column)` pair with methods to facilitate tracking
your position on the source [Reader](https://pkg.go.dev/io#Reader).
### lexer.Error
[`lexer.Error`](https://pkg.go.dev/asciigoat.org/core/lexer#Error)
is an [unwrappable](https://pkg.go.dev/errors#Unwrap) error with a
token position and hint attached.
### lexer.StateFn
At the heart of **asciigoat** are _state functions_, as proposed in [Rob Pike's famous talk](https://www.youtube.com/watch?v=HxaD_trXwRE), each of which returns the next _state function_ until parsing is done.
Additionally, there is a [`Run()`](https://pkg.go.dev/asciigoat.org/core/lexer#Run) helper that implements the loop.
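As a rough illustration of the state-function idea, the sketch below declares local stand-ins for `StateFn` and `Run()` with the same shape as the ones in this package, so it runs without importing anything from **asciigoat**. The `wordCounter` type and its `lexSpace`/`lexWord` states are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"unicode"
)

// StateFn mirrors the shape of lexer.StateFn: each state returns
// the next state, or nil when there is nothing left to do.
type StateFn func() (StateFn, error)

// Run is a local stand-in for lexer.Run: it keeps calling the
// current state until a state returns nil or an error.
func Run(fn StateFn) error {
	var err error
	for fn != nil && err == nil {
		fn, err = fn()
	}
	return err
}

// wordCounter is a hypothetical toy lexer that only counts words.
type wordCounter struct {
	input []rune
	pos   int
	words int
}

// lexSpace skips whitespace and hands over to lexWord.
func (w *wordCounter) lexSpace() (StateFn, error) {
	for w.pos < len(w.input) && unicode.IsSpace(w.input[w.pos]) {
		w.pos++
	}
	if w.pos == len(w.input) {
		return nil, nil // end of input, stop the machine
	}
	return w.lexWord, nil
}

// lexWord consumes one word and hands back to lexSpace.
func (w *wordCounter) lexWord() (StateFn, error) {
	for w.pos < len(w.input) && !unicode.IsSpace(w.input[w.pos]) {
		w.pos++
	}
	w.words++
	return w.lexSpace, nil
}

// countWords runs the two-state machine over s.
func countWords(s string) (int, error) {
	w := &wordCounter{input: []rune(s)}
	err := Run(w.lexSpace)
	return w.words, err
}

func main() {
	n, err := countWords("lexical scanning in go")
	fmt.Println(n, err) // prints "4 <nil>"
}
```

Each state only knows which state comes next, so the loop in `Run()` stays trivial and the grammar lives entirely in the state functions.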
### rune checkers
_Rune checkers_ are simple functions that report whether a rune belongs to a given class.
Fundamental checkers are provided by the [`unicode` package](https://pkg.go.dev/unicode).
Our [`lexer.Reader`][godoc-lexer-reader] uses them on its `Accept()` and `AcceptAll()` methods to
make it easier to consume the _source_ document.
To facilitate the declaration of _rune classes_ in **asciigoat**-powered parsers, we include
a series of rune checker factories:
* `NewIsIn(string)`
* `NewIsInRunes(...rune)`
* `NewIsNot(checker)`
* `NewIsOneOf(...checker)`
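The factories compose, which is easiest to see in a small sketch. The three factory definitions below are copied from the `runes.go` shown in this diff so the example is self-contained. `isIdentRune`, `isNotIdent`, and `acceptAll` are hypothetical names for illustration; in particular, `acceptAll` is only a string-based stand-in and does not reflect the real signature of `Reader.AcceptAll()`.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// NewIsIn accepts runes contained in the provided string.
func NewIsIn(s string) func(rune) bool {
	return func(r rune) bool {
		return strings.ContainsRune(s, r)
	}
}

// NewIsNot reverses the decision of the given checker.
func NewIsNot(cond func(rune) bool) func(rune) bool {
	return func(r rune) bool {
		return !cond(r)
	}
}

// NewIsOneOf accepts runes accepted by any of the given checkers.
func NewIsOneOf(s ...func(rune) bool) func(rune) bool {
	return func(r rune) bool {
		for _, cond := range s {
			if cond(r) {
				return true
			}
		}
		return false
	}
}

// isIdentRune is a hypothetical rune class: letters, digits, '_' and '-'.
var isIdentRune = NewIsOneOf(unicode.IsLetter, unicode.IsDigit, NewIsIn("_-"))

// isNotIdent is the complement of isIdentRune.
var isNotIdent = NewIsNot(isIdentRune)

// acceptAll is a hypothetical stand-in for a reader: it returns the
// longest prefix of s whose runes all satisfy cond.
func acceptAll(s string, cond func(rune) bool) string {
	for i, r := range s {
		if !cond(r) {
			return s[:i]
		}
	}
	return s
}

func main() {
	fmt.Println(acceptAll("max_size = 10", isIdentRune)) // prints "max_size"
	fmt.Println(isNotIdent('='))                         // prints "true"
}
```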
## Others
### ReadCloser
[ReadCloser][godoc-readcloser] assists in providing an
[io.Closer](https://pkg.go.dev/io#Closer) to Readers or buffers without one,
or unearthing one if available, so
[io.ReadCloser](https://pkg.go.dev/io#ReadCloser) can be fulfilled.
## See also
* [asciigoat.org/ini](https://asciigoat.org/ini)
* [oss.jpi.io](https://oss.jpi.io)
-9
@@ -1,7 +1,6 @@
 package lexer
 import (
-	"errors"
 	"fmt"
 	"strings"
 )
@@ -10,14 +9,6 @@ var (
 	_ error = (*Error)(nil)
 )
-var (
-	// ErrUnacceptableRune indicates the read rune isn't acceptable in the context
-	ErrUnacceptableRune = errors.New("rune not acceptable in context")
-	// ErrNotImplemented indicates something hasn't been implemented yet
-	ErrNotImplemented = errors.New("not implemented")
-)
 // Error represents a generic parsing error
 type Error struct {
 	Filename string
+3 -17
@@ -1,31 +1,17 @@
 // Package lexer provides basic helpers to implement parsers
 package lexer
-import (
-	"errors"
-	"io"
-)
 // StateFn is a State Function of the parser
 type StateFn func() (StateFn, error)
 // Run runs a state machine until the state function either
 // returns nil or an error
 func Run(fn StateFn) error {
-	for fn != nil {
-		var err error
+	var err error
+	for fn != nil && err == nil {
 		fn, err = fn()
-		switch {
-		case errors.Is(err, io.EOF):
-			// EOF
-			return nil
-		case err != nil:
-			// failed
-			return err
-		}
 	}
-	// ended
-	return nil
+	return err
 }
+13 -11
@@ -41,26 +41,28 @@ func (p *Position) Step() {
 	p.Column++
 }
-// StepN moves the column N places forward
-func (p *Position) StepN(n int) {
+// Next returns a new Position one rune forward
+// on the line
+func (p Position) Next() Position {
 	if p.Line == 0 {
 		p.Reset()
 	}
-	switch {
-	case n > 0:
-		p.Column += n
-	default:
-		panic(fmt.Errorf("invalid %v increment", n))
+	return Position{
+		Line:   p.Line,
+		Column: p.Column + 1,
 	}
 }
-// StepLine moves position to the start of the next line
-func (p *Position) StepLine() {
+// NextLine returns a new Position at the beginning of the next
+// line.
+func (p Position) NextLine() Position {
 	if p.Line == 0 {
 		p.Reset()
 	}
-	p.Line++
-	p.Column = 1
+	return Position{
+		Line:   p.Line + 1,
+		Column: 1,
+	}
 }
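This hunk contains both pointer-receiver methods that move a `Position` in place (`StepLine`) and value-receiver methods that return a moved copy (`Next`, `NextLine`). The sketch below reproduces both styles on a local `Position` type. The body of `Reset()` is not visible in the diff, so resetting to line 1, column 1 is an assumption made for this example.

```go
package main

import "fmt"

// Position is a local stand-in for lexer.Position.
type Position struct {
	Line   int
	Column int
}

// Reset is assumed to place the position at line 1, column 1;
// the real body is not shown in this diff.
func (p *Position) Reset() {
	p.Line, p.Column = 1, 1
}

// StepLine moves the receiver to the start of the next line.
func (p *Position) StepLine() {
	if p.Line == 0 {
		p.Reset()
	}
	p.Line++
	p.Column = 1
}

// Next returns a new Position one rune forward on the line,
// leaving the receiver untouched.
func (p Position) Next() Position {
	if p.Line == 0 {
		p.Reset() // only resets the local copy
	}
	return Position{Line: p.Line, Column: p.Column + 1}
}

// NextLine returns a new Position at the beginning of the next line.
func (p Position) NextLine() Position {
	if p.Line == 0 {
		p.Reset() // only resets the local copy
	}
	return Position{Line: p.Line + 1, Column: 1}
}

func main() {
	start := Position{Line: 3, Column: 7}

	next := start.Next()
	fmt.Println(start, next) // prints "{3 7} {3 8}"

	fmt.Println(start.NextLine()) // prints "{4 1}"

	start.StepLine() // mutates start in place
	fmt.Println(start) // prints "{4 1}"
}
```

The value-returning style lets a caller compute where a token would end without disturbing the position it is currently tracking.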
-47
@@ -1,47 +0,0 @@
package lexer
import (
"strings"
"unicode"
)
// NewIsNot generates a rune condition checker that reverses the
// decision of the given checker.
func NewIsNot(cond func(rune) bool) func(rune) bool {
return func(r rune) bool {
return !cond(r)
}
}
// NewIsIn generates a rune condition checker that accepts runes
// contained on the provided string
func NewIsIn(s string) func(rune) bool {
return func(r rune) bool {
return strings.ContainsRune(s, r)
}
}
// NewIsInRunes generates a rune condition checker that accepts
// the runes specified
func NewIsInRunes(s ...rune) func(rune) bool {
return NewIsIn(string(s))
}
// NewIsOneOf generates a rune condition checker that accepts runes
// accepted by any of the given checkers
func NewIsOneOf(s ...func(rune) bool) func(rune) bool {
return func(r rune) bool {
for _, cond := range s {
if cond(r) {
return true
}
}
return false
}
}
// IsSpace reports whether the rune is a space character as
// defined by Unicode's White Space property
func IsSpace(r rune) bool {
return unicode.IsSpace(r)
}