Module ScopedSearch::QueryLanguage::Tokenizer
In: lib/scoped_search/query_language/tokenizer.rb

The Tokenizer module adds methods to the query language compiler that transform a query string into a stream of tokens, a form that is better suited for parsing.

Methods

current_char, each, each_token, next_char, peek_char, tokenize, tokenize_keyword, tokenize_operator, tokenize_quoted_keyword

Constants

KEYWORDS = { 'and' => :and, 'or' => :or, 'not' => :not, 'set?' => :notnull, 'has' => :notnull, 'null?' => :null, 'before' => :lt, 'after' => :gt, 'at' => :eq }   All keywords that the language supports.
OPERATORS = { '&' => :and, '|' => :or, '&&' => :and, '||' => :or, '-' => :not, '!' => :not, '~' => :like, '!~' => :unlike, '=' => :eq, '==' => :eq, '!=' => :ne, '<>' => :ne, '>' => :gt, '<' => :lt, '>=' => :gte, '<=' => :lte }   Every operator that the language supports.
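Note that several spellings map to the same token symbol: 'set?' and 'has' both yield :notnull, and '=' and '==' both yield :eq, so the parser only ever sees one canonical symbol per concept. A quick check, assuming the hashes exactly as listed above:

```ruby
KEYWORDS  = { 'and' => :and, 'or' => :or, 'not' => :not, 'set?' => :notnull,
              'has' => :notnull, 'null?' => :null, 'before' => :lt,
              'after' => :gt, 'at' => :eq }
OPERATORS = { '&' => :and, '|' => :or, '&&' => :and, '||' => :or,
              '-' => :not, '!' => :not, '~' => :like, '!~' => :unlike,
              '=' => :eq, '==' => :eq, '!=' => :ne, '<>' => :ne,
              '>' => :gt, '<' => :lt, '>=' => :gte, '<=' => :lte }

KEYWORDS['set?']   # => :notnull
KEYWORDS['has']    # => :notnull (synonym, same token)
OPERATORS['==']    # => :eq
OPERATORS['>=']    # => :gte
```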

Public Instance methods

current_char()

Returns the current character of the string.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 19
    def current_char
      @current_char
    end
each(&block)

Alias for each_token

each_token(&block)

Tokenizes the string by iterating over its characters, yielding one token at a time to the block.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 37
    def each_token(&block)
      while next_char
        case current_char
        when /^\s?$/; # ignore
        when '(';  yield(:lparen)
        when ')';  yield(:rparen)
        when ',';  yield(:comma)
        when /\&|\||=|<|>|!|~|-/;  tokenize_operator(&block)
        when '"';                  tokenize_quoted_keyword(&block)
        else;                      tokenize_keyword(&block)
        end
      end
    end
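A detail worth noting in the first case branch: /^\s?$/ matches both a single whitespace character and the empty string, so inter-token spaces and the "" that String#[] returns at the very end of the string both fall into the ignore branch.

```ruby
# /^\s?$/ means: start, an optional whitespace character, end.
/^\s?$/ =~ " "    # => 0   a single space matches
/^\s?$/ =~ ""     # => 0   the empty string matches too
/^\s?$/ =~ "ab"   # => nil non-whitespace falls through to the other branches
```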

next_char()

Returns the next character of the string and moves the position pointer one step forward.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 31
    def next_char
      @current_char_pos += 1
      @current_char = @str[@current_char_pos, 1]
    end
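next_char relies on the boundary behavior of Ruby's String#[](index, length): indexing exactly one past the last character returns an empty string (still truthy, and ignored by the whitespace branch of each_token), while any later index returns nil, which is what terminates the `while next_char` loop.

```ruby
str = "abc"
str[2, 1]  # => "c"   last character
str[3, 1]  # => ""    index == length: empty string, still truthy
str[4, 1]  # => nil   past the end: nil ends the tokenizer loop
```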

peek_char(amount = 1)

Returns a following character of the string (by default, the next character) without updating the position pointer.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 25
    def peek_char(amount = 1)
      @str[@current_char_pos + amount, 1]
    end

tokenize()

Tokenizes the string and returns the result as an array of tokens.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 13
    def tokenize
      @current_char_pos = -1
      to_a
    end

tokenize_keyword(&block)

Tokenizes a keyword and converts it to a Symbol if it is recognized as a reserved language keyword (see the KEYWORDS hash).

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 60
    def tokenize_keyword(&block)
      keyword = current_char
      keyword << next_char while /[^=~<>\s\&\|\)\(,]/ =~ peek_char
      KEYWORDS.has_key?(keyword.downcase) ? yield(KEYWORDS[keyword.downcase]) : yield(keyword)
    end
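Because the lookup goes through downcase, reserved words are matched case-insensitively, while anything unrecognized passes through unchanged as a plain string (a field name or value). A minimal sketch of that final line, assuming the KEYWORDS hash as listed above; the classify helper name is illustrative only:

```ruby
KEYWORDS = { 'and' => :and, 'or' => :or, 'not' => :not, 'set?' => :notnull,
             'has' => :notnull, 'null?' => :null, 'before' => :lt,
             'after' => :gt, 'at' => :eq }

# Hypothetical helper mirroring the lookup in tokenize_keyword:
# reserved words become symbols, anything else stays a string.
def classify(keyword)
  KEYWORDS.has_key?(keyword.downcase) ? KEYWORDS[keyword.downcase] : keyword
end

classify('AND')       # => :and  (case-insensitive via downcase)
classify('before')    # => :lt
classify('username')  # => "username"
```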

tokenize_operator(&block)

Tokenizes an operator that occurs in the OPERATORS hash.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 52
    def tokenize_operator(&block)
      operator = current_char
      operator << next_char if OPERATORS.has_key?(operator + peek_char)
      yield(OPERATORS[operator])
    end
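The `operator + peek_char` check makes the match greedy: a '!' followed by '~' is consumed as the two-character :unlike token rather than :not followed by :like. A sketch of that decision, assuming the OPERATORS hash as listed above; the resolve helper name is illustrative only:

```ruby
OPERATORS = { '&' => :and, '|' => :or, '&&' => :and, '||' => :or,
              '-' => :not, '!' => :not, '~' => :like, '!~' => :unlike,
              '=' => :eq, '==' => :eq, '!=' => :ne, '<>' => :ne,
              '>' => :gt, '<' => :lt, '>=' => :gte, '<=' => :lte }

# Hypothetical helper mirroring the greedy lookahead in tokenize_operator.
def resolve(current, peek)
  operator = current.dup
  operator << peek if OPERATORS.has_key?(operator + peek)  # prefer 2-char form
  OPERATORS[operator]
end

resolve('!', '~')  # => :unlike  (two-character operator wins)
resolve('!', ' ')  # => :not     (no two-character match, fall back)
resolve('>', '=')  # => :gte
```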

tokenize_quoted_keyword(&block)

Tokenizes a keyword that is quoted with double quotes. Double quote characters inside the keyword can be escaped with backslashes.

[Source]

    # File lib/scoped_search/query_language/tokenizer.rb, line 68
    def tokenize_quoted_keyword(&block)
      keyword = ""
      until next_char.nil? || current_char == '"'
        keyword << (current_char == "\\" ? next_char : current_char)
      end
      yield(keyword)
    end
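Taken together, the methods on this page can be assembled into a small standalone class to inspect the token stream without loading the gem. This is a sketch for illustration only: the MiniTokenizer class name and its constructor are hypothetical; in the gem these methods are mixed into the query language compiler.

```ruby
class MiniTokenizer
  include Enumerable  # provides to_a, built on each

  KEYWORDS  = { 'and' => :and, 'or' => :or, 'not' => :not, 'set?' => :notnull,
                'has' => :notnull, 'null?' => :null, 'before' => :lt,
                'after' => :gt, 'at' => :eq }
  OPERATORS = { '&' => :and, '|' => :or, '&&' => :and, '||' => :or,
                '-' => :not, '!' => :not, '~' => :like, '!~' => :unlike,
                '=' => :eq, '==' => :eq, '!=' => :ne, '<>' => :ne,
                '>' => :gt, '<' => :lt, '>=' => :gte, '<=' => :lte }

  def initialize(str)  # hypothetical constructor for this sketch
    @str = str
  end

  def tokenize
    @current_char_pos = -1
    to_a
  end

  def current_char
    @current_char
  end

  def peek_char(amount = 1)
    @str[@current_char_pos + amount, 1]
  end

  def next_char
    @current_char_pos += 1
    @current_char = @str[@current_char_pos, 1]
  end

  def each_token(&block)
    while next_char
      case current_char
      when /^\s?$/; # ignore whitespace and the trailing ""
      when '(';  yield(:lparen)
      when ')';  yield(:rparen)
      when ',';  yield(:comma)
      when /\&|\||=|<|>|!|~|-/;  tokenize_operator(&block)
      when '"';                  tokenize_quoted_keyword(&block)
      else;                      tokenize_keyword(&block)
      end
    end
  end
  alias_method :each, :each_token

  def tokenize_operator(&block)
    operator = current_char
    operator << next_char if OPERATORS.has_key?(operator + peek_char)
    yield(OPERATORS[operator])
  end

  def tokenize_keyword(&block)
    keyword = current_char
    keyword << next_char while /[^=~<>\s\&\|\)\(,]/ =~ peek_char
    KEYWORDS.has_key?(keyword.downcase) ? yield(KEYWORDS[keyword.downcase]) : yield(keyword)
  end

  def tokenize_quoted_keyword(&block)
    keyword = ""
    until next_char.nil? || current_char == '"'
      keyword << (current_char == "\\" ? next_char : current_char)
    end
    yield(keyword)
  end
end

MiniTokenizer.new('username = "John Doe" | age >= 18').tokenize
# => ["username", :eq, "John Doe", :or, "age", :gte, "18"]
```

Note how the quoted string survives with its embedded space, '|' is resolved to :or, and '>=' is consumed greedily as a single :gte token.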
