8.53 Euphoria Source Tokenizer
8.53.1 tokenize return sequence key
8.53.1.1 ET_TOKENS
include euphoria/tokenize.e namespace tokenize public enum ET_TOKENS
8.53.1.2 ET_ERROR
include euphoria/tokenize.e namespace tokenize public enum ET_ERROR
8.53.1.3 ET_ERR_LINE
include euphoria/tokenize.e namespace tokenize public enum ET_ERR_LINE
8.53.1.4 ET_ERR_COLUMN
include euphoria/tokenize.e namespace tokenize public enum ET_ERR_COLUMN
8.53.2 Tokens
8.53.2.1 T_EOF
include euphoria/tokenize.e namespace tokenize public enum T_EOF
8.53.2.2 T_NULL
include euphoria/tokenize.e namespace tokenize public enum T_NULL
8.53.2.3 T_SHBANG
include euphoria/tokenize.e namespace tokenize public enum T_SHBANG
8.53.2.4 T_NEWLINE
include euphoria/tokenize.e namespace tokenize public enum T_NEWLINE
8.53.2.5 T_COMMENT
include euphoria/tokenize.e namespace tokenize public enum T_COMMENT
8.53.2.6 T_NUMBER
include euphoria/tokenize.e namespace tokenize public enum T_NUMBER
8.53.2.7 T_CHAR
include euphoria/tokenize.e namespace tokenize public enum T_CHAR
quoted character
8.53.2.8 T_STRING
include euphoria/tokenize.e namespace tokenize public enum T_STRING
string
8.53.2.9 T_IDENTIFIER
include euphoria/tokenize.e namespace tokenize public enum T_IDENTIFIER
8.53.2.10 T_KEYWORD
include euphoria/tokenize.e namespace tokenize public enum T_KEYWORD
8.53.2.11 T_DOUBLE_OPS
include euphoria/tokenize.e namespace tokenize public enum T_DOUBLE_OPS
8.53.2.12 T_PLUSEQ
include euphoria/tokenize.e namespace tokenize public enum T_PLUSEQ
8.53.2.13 T_MINUSEQ
include euphoria/tokenize.e namespace tokenize public enum T_MINUSEQ
8.53.2.14 T_MULTIPLYEQ
include euphoria/tokenize.e namespace tokenize public enum T_MULTIPLYEQ
8.53.2.15 T_DIVIDEEQ
include euphoria/tokenize.e namespace tokenize public enum T_DIVIDEEQ
8.53.2.16 T_LTEQ
include euphoria/tokenize.e namespace tokenize public enum T_LTEQ
8.53.2.17 T_GTEQ
include euphoria/tokenize.e namespace tokenize public enum T_GTEQ
8.53.2.18 T_NOTEQ
include euphoria/tokenize.e namespace tokenize public enum T_NOTEQ
8.53.2.19 T_CONCATEQ
include euphoria/tokenize.e namespace tokenize public enum T_CONCATEQ
8.53.2.20 T_DELIMITER
include euphoria/tokenize.e namespace tokenize public enum T_DELIMITER
8.53.2.21 T_PLUS
include euphoria/tokenize.e namespace tokenize public enum T_PLUS
8.53.2.22 T_MINUS
include euphoria/tokenize.e namespace tokenize public enum T_MINUS
8.53.2.23 T_MULTIPLY
include euphoria/tokenize.e namespace tokenize public enum T_MULTIPLY
8.53.2.24 T_DIVIDE
include euphoria/tokenize.e namespace tokenize public enum T_DIVIDE
8.53.2.25 T_LT
include euphoria/tokenize.e namespace tokenize public enum T_LT
8.53.2.26 T_GT
include euphoria/tokenize.e namespace tokenize public enum T_GT
8.53.2.27 T_NOT
include euphoria/tokenize.e namespace tokenize public enum T_NOT
8.53.2.28 T_CONCAT
include euphoria/tokenize.e namespace tokenize public enum T_CONCAT
8.53.2.29 T_SINGLE_OPS
include euphoria/tokenize.e namespace tokenize public enum T_SINGLE_OPS
8.53.2.30 T_EQ
include euphoria/tokenize.e namespace tokenize public enum T_EQ
8.53.2.31 T_LPAREN
include euphoria/tokenize.e namespace tokenize public enum T_LPAREN
8.53.2.32 T_RPAREN
include euphoria/tokenize.e namespace tokenize public enum T_RPAREN
8.53.2.33 T_LBRACE
include euphoria/tokenize.e namespace tokenize public enum T_LBRACE
8.53.2.34 T_RBRACE
include euphoria/tokenize.e namespace tokenize public enum T_RBRACE
8.53.2.35 T_LBRACKET
include euphoria/tokenize.e namespace tokenize public enum T_LBRACKET
8.53.2.36 T_RBRACKET
include euphoria/tokenize.e namespace tokenize public enum T_RBRACKET
8.53.2.37 T_QPRINT
include euphoria/tokenize.e namespace tokenize public enum T_QPRINT
8.53.2.38 T_COMMA
include euphoria/tokenize.e namespace tokenize public enum T_COMMA
8.53.2.39 T_PERIOD
include euphoria/tokenize.e namespace tokenize public enum T_PERIOD
8.53.2.40 T_COLON
include euphoria/tokenize.e namespace tokenize public enum T_COLON
8.53.2.41 T_DOLLAR
include euphoria/tokenize.e namespace tokenize public enum T_DOLLAR
8.53.2.42 T_SLICE
include euphoria/tokenize.e namespace tokenize public enum T_SLICE
8.53.2.43 T_WHITE
include euphoria/tokenize.e namespace tokenize public enum T_WHITE
8.53.2.44 T_BUILTIN
include euphoria/tokenize.e namespace tokenize public enum T_BUILTIN
8.53.2.45 T_TEXT
include euphoria/tokenize.e namespace tokenize public enum T_TEXT
8.53.2.46 TF_HEX
include euphoria/tokenize.e namespace tokenize public enum TF_HEX
8.53.3 T_NUMBER formats and T_types
8.53.3.1 TF_INT
include euphoria/tokenize.e namespace tokenize public enum TF_INT
8.53.3.2 TF_ATOM
include euphoria/tokenize.e namespace tokenize public enum TF_ATOM
8.53.3.3 TF_STRING_SINGLE
include euphoria/tokenize.e namespace tokenize public enum TF_STRING_SINGLE
8.53.3.4 TF_STRING_TRIPLE
include euphoria/tokenize.e namespace tokenize public enum TF_STRING_TRIPLE
8.53.3.5 TF_STRING_BACKTICK
include euphoria/tokenize.e namespace tokenize public enum TF_STRING_BACKTICK
8.53.3.6 TF_STRING_HEX
include euphoria/tokenize.e namespace tokenize public enum TF_STRING_HEX
8.53.3.7 TF_COMMENT_SINGLE
include euphoria/tokenize.e namespace tokenize public enum TF_COMMENT_SINGLE
8.53.3.8 TF_COMMENT_MULTIPLE
include euphoria/tokenize.e namespace tokenize public enum TF_COMMENT_MULTIPLE
8.53.4 Token accessors
8.53.4.1 TTYPE
include euphoria/tokenize.e namespace tokenize public enum TTYPE
8.53.4.2 TDATA
include euphoria/tokenize.e namespace tokenize public enum TDATA
8.53.4.3 TLNUM
include euphoria/tokenize.e namespace tokenize public enum TLNUM
8.53.4.4 TLPOS
include euphoria/tokenize.e namespace tokenize public enum TLPOS
8.53.4.5 TFORM
include euphoria/tokenize.e namespace tokenize public enum TFORM
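The accessors above are used as indexes into an individual token. The following is a minimal sketch under some assumptions: the sample input is hypothetical, and reading TLNUM as a line number and TLPOS as a position within the line is inferred from the names, since this page does not describe them.

```euphoria
include euphoria/tokenize.e

-- Hypothetical sample input; any Euphoria source text would do.
sequence result = tokenize_string("x += 1\n")
sequence tokens = result[ET_TOKENS]
sequence tok

for i = 1 to length(tokens) do
    tok = tokens[i]
    -- TTYPE holds one of the T_ constants.
    -- TLNUM and TLPOS are assumed to locate the token in the source.
    printf(1, "type %d at line %d, position %d\n",
        {tok[TTYPE], tok[TLNUM], tok[TLPOS]})
end for
```

TDATA is left out of the printout because its type depends on the token (see string_numbers).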
8.53.5 ET error codes
8.53.5.1 ERR_NONE
include euphoria/tokenize.e namespace tokenize public enum ERR_NONE
8.53.5.2 ERR_OPEN
include euphoria/tokenize.e namespace tokenize public enum ERR_OPEN
8.53.5.3 ERR_ESCAPE
include euphoria/tokenize.e namespace tokenize public enum ERR_ESCAPE
8.53.5.4 ERR_EOL_CHAR
include euphoria/tokenize.e namespace tokenize public enum ERR_EOL_CHAR
8.53.5.5 ERR_CLOSE_CHAR
include euphoria/tokenize.e namespace tokenize public enum ERR_CLOSE_CHAR
8.53.5.6 ERR_EOL_STRING
include euphoria/tokenize.e namespace tokenize public enum ERR_EOL_STRING
8.53.5.7 ERR_HEX
include euphoria/tokenize.e namespace tokenize public enum ERR_HEX
8.53.5.8 ERR_DECIMAL
include euphoria/tokenize.e namespace tokenize public enum ERR_DECIMAL
8.53.5.9 ERR_UNKNOWN
include euphoria/tokenize.e namespace tokenize public enum ERR_UNKNOWN
8.53.5.10 ERR_EOF
include euphoria/tokenize.e namespace tokenize public enum ERR_EOF
8.53.5.11 ERR_EOF_STRING
include euphoria/tokenize.e namespace tokenize public enum ERR_EOF_STRING
8.53.5.12 ERR_HEX_STRING
include euphoria/tokenize.e namespace tokenize public enum ERR_HEX_STRING
8.53.5.13 error_string
include euphoria/tokenize.e namespace tokenize public function error_string(integer err)
Get an error message string for a given error code.
8.53.5.14 new
include euphoria/tokenize.e namespace tokenize public function new()
Create a new tokenizer state.
See Also:
reset, tokenize_string, tokenize_file
8.53.5.15 reset
include euphoria/tokenize.e namespace tokenize public procedure reset(atom state = g_state)
Reset the state to begin parsing a new file.
See Also:
new, tokenize_string, tokenize_file
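As a rough illustration of new and reset together, the sketch below creates a private state instead of relying on the default g_state and reuses it for two unrelated inputs. The sample strings and variable names are hypothetical, and whether an explicit reset is strictly required between calls is not stated on this page, so treat the call as a precaution.

```euphoria
include euphoria/tokenize.e

-- A private tokenizer state, separate from the default g_state.
atom st = tokenize:new()

sequence first_result = tokenize_string("puts(1, \"a\")\n", st)

-- Reset the state before starting on unrelated source text.
tokenize:reset(st)

sequence second_result = tokenize_string("? 1 + 2\n", st)
```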
8.53.6 get/set options
8.53.6.1 keep_builtins
include euphoria/tokenize.e namespace tokenize public procedure keep_builtins(integer val = 1, atom state = g_state)
Specify whether builtins are identified specially.
Default is FALSE.
8.53.6.2 keep_keywords
include euphoria/tokenize.e namespace tokenize public procedure keep_keywords(integer val = 1, atom state = g_state)
Specify whether keywords are identified specially.
Default is TRUE.
8.53.6.3 keep_whitespace
include euphoria/tokenize.e namespace tokenize public procedure keep_whitespace(integer val = 1, atom state = g_state)
Return whitespace (other than newlines) as tokens.
Default is FALSE.
8.53.6.4 keep_newlines
include euphoria/tokenize.e namespace tokenize public procedure keep_newlines(integer val = 1, atom state = g_state)
Return newlines as tokens.
Default is FALSE.
8.53.6.5 keep_comments
include euphoria/tokenize.e namespace tokenize public procedure keep_comments(integer val = 1, atom state = g_state)
Return comments as tokens.
Default is FALSE.
8.53.6.6 return_literal_string
include euphoria/tokenize.e namespace tokenize public procedure return_literal_string(integer val = 1, atom state = g_state)
String tokens can be returned either as their processed value or as the literal text that made up the original string.
At present, this option only affects the processing of hex strings.
Default is FALSE: process the string and return its value.
8.53.6.7 string_strip_quotes
include euphoria/tokenize.e namespace tokenize public procedure string_strip_quotes(integer val = 1, atom state = g_state)
Specify whether the quotes are stripped from returned string tokens.
Default is TRUE.
8.53.6.8 string_numbers
include euphoria/tokenize.e namespace tokenize public procedure string_numbers(integer val = 1, atom state = g_state)
Return TDATA for all T_NUMBER tokens in "string" format.
Defaults:
- T_NUMBER tokens return atoms
- T_CHAR tokens return single integer chars
- T_EOF tokens return undefined data
- Other tokens return strings
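The option procedures above all take a value and a tokenizer state. A minimal sketch, assuming a private state created with new, a hypothetical sample line, and that kept comments are reported with the T_COMMENT token type (inferred from the token list, not stated here):

```euphoria
include euphoria/tokenize.e

atom st = tokenize:new()
keep_comments(1, st)    -- return comments as tokens
keep_newlines(1, st)    -- return newlines as tokens
string_numbers(1, st)   -- T_NUMBER data in "string" format

-- Hypothetical sample input.
sequence result = tokenize_string("x = 10 -- ten\n", st)
sequence tokens = result[ET_TOKENS]

-- Count the comment tokens (assumes they carry the T_COMMENT type).
integer comments = 0
for i = 1 to length(tokens) do
    if tokens[i][TTYPE] = T_COMMENT then
        comments += 1
    end if
end for
printf(1, "%d comment token(s)\n", {comments})
```

Setting the options on a private state leaves the default g_state untouched for other callers.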
8.53.6.9 multiline_token
include euphoria/tokenize.e namespace tokenize public type multiline_token(object mlt)
8.53.6.10 last_multiline_token
include euphoria/tokenize.e namespace tokenize public function last_multiline_token()
Returns:
One of 0, TF_COMMENT_MULTIPLE, TF_STRING_BACKTICK, TF_STRING_TRIPLE.
Comments:
After a call to tokenize_string, this function returns 0 if the line did not end in the middle of a multiline construct, or the value for the respective token if it did. This is meant to facilitate proper tokenizing of individual lines of code.
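Line-by-line tokenizing could be sketched as follows. The two input lines are hypothetical, and passing 0 for stop_on_error is an assumption made so that a construct left open at the end of a line is not treated as a fatal error; this page does not say how such a line is reported.

```euphoria
include euphoria/tokenize.e

-- Hypothetical input: a block comment that spans two lines.
sequence lines = {
    "/* start of a comment",
    "end of the comment */ integer x"
}

atom st = tokenize:new()
integer carry = 0    -- 0 means no multiline construct is open
sequence result

for i = 1 to length(lines) do
    -- Pass in whatever construct the previous line left open.
    result = tokenize_string(lines[i], st, 0, carry)
    -- Remember the construct (if any) left open by this line.
    carry = last_multiline_token()
end for
```

After the loop, a nonzero carry would indicate that the final line ended inside a comment or multiline string.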
8.53.7 Routines
8.53.7.1 tokenize_string
include euphoria/tokenize.e namespace tokenize public function tokenize_string(sequence code, atom state = g_state, integer stop_on_error = TRUE, multiline_token multi = 0)
Tokenize Euphoria source code held in a string.
Parameters:
- code The code to be tokenized
- state (default g_state) the tokenizer returned by new
- stop_on_error (default TRUE)
- multi one of 0, TF_COMMENT_MULTIPLE, TF_STRING_BACKTICK, TF_STRING_TRIPLE
Returns:
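A sequence indexed by ET_TOKENS, ET_ERROR, ET_ERR_LINE and ET_ERR_COLUMN. The token stream itself is the first element, ET_TOKENS.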
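A minimal sketch of calling tokenize_string with the default state and checking the result, using the return sequence keys and error_string documented above. The sample source line is hypothetical.

```euphoria
include euphoria/tokenize.e

-- Hypothetical sample input; any Euphoria source text would do.
sequence code = "integer x = 1 + 2\n"

-- Uses the default state and default options.
sequence result = tokenize_string(code)

if result[ET_ERROR] != ERR_NONE then
    -- Report the error message and where it occurred.
    printf(2, "tokenize error: %s (line %d, column %d)\n",
        {error_string(result[ET_ERROR]), result[ET_ERR_LINE], result[ET_ERR_COLUMN]})
else
    printf(1, "%d tokens\n", {length(result[ET_TOKENS])})
end if
```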
8.53.7.2 tokenize_file
include euphoria/tokenize.e namespace tokenize public function tokenize_file(sequence fname, atom state = g_state, integer mode = io:BINARY_MODE)
Tokenize Euphoria source code read from a file.
Parameters:
- fname the file to be read and tokenized
- state (default g_state) the tokenizer returned by new
- mode the mode in which to open the file. One of: io:BINARY_MODE (default) or io:TEXT_MODE. Note that for large files with Windows line endings, text mode may be much slower. See io:read_file for more information.
Returns:
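A sequence indexed by ET_TOKENS, ET_ERROR, ET_ERR_LINE and ET_ERR_COLUMN. The token stream itself is the first element, ET_TOKENS.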
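A short sketch of tokenize_file in text mode. The file name "myprog.ex" is hypothetical, and std/io.e is included here only to make the io:TEXT_MODE constant visible.

```euphoria
include std/io.e
include euphoria/tokenize.e

atom st = tokenize:new()

-- "myprog.ex" is a hypothetical file name.
sequence result = tokenize_file("myprog.ex", st, io:TEXT_MODE)

if result[ET_ERROR] != ERR_NONE then
    puts(2, error_string(result[ET_ERROR]) & "\n")
else
    printf(1, "%d tokens read\n", {length(result[ET_TOKENS])})
end if
```

Leaving out the third argument keeps the default io:BINARY_MODE, which the note above suggests may be faster for large files with Windows line endings.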
8.53.8 Debugging
8.53.8.1 token_names
include euphoria/tokenize.e namespace tokenize public constant token_names
Sequence containing token names for debugging.
8.53.8.2 token_forms
include euphoria/tokenize.e namespace tokenize public constant token_forms
8.53.8.3 show_tokens
include euphoria/tokenize.e namespace tokenize public procedure show_tokens(integer fh, sequence tokens)
Print token names and data for each token in `tokens` to the file handle `fh`.
Parameters:
- fh - file handle to print information to
- tokens - token sequence to print
Comments:
This procedure does not take the direct output of tokenize_string or tokenize_file. Instead, it takes the first element of their return value, which is the token stream only.
See Also:
tokenize_string, tokenize_file
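For instance, a debugging dump to standard output might look like the sketch below. The sample source line is hypothetical, and the exact output format of show_tokens is not described here, so none is shown.

```euphoria
include euphoria/tokenize.e

-- Hypothetical sample input.
sequence result = tokenize_string("puts(1, \"hello\")\n")

if result[ET_ERROR] = ERR_NONE then
    -- Pass only the token stream, not the whole return value.
    show_tokens(1, result[ET_TOKENS])
else
    puts(2, error_string(result[ET_ERROR]) & "\n")
end if
```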