
8.53 Euphoria Source Tokenizer

8.53.1 tokenize return sequence key

8.53.1.1 ET_TOKENS

include euphoria/tokenize.e
namespace tokenize
public enum ET_TOKENS

8.53.1.2 ET_ERROR

include euphoria/tokenize.e
namespace tokenize
public enum ET_ERROR

8.53.1.3 ET_ERR_LINE

include euphoria/tokenize.e
namespace tokenize
public enum ET_ERR_LINE

8.53.1.4 ET_ERR_COLUMN

include euphoria/tokenize.e
namespace tokenize
public enum ET_ERR_COLUMN

8.53.2 Tokens

8.53.2.1 T_EOF

include euphoria/tokenize.e
namespace tokenize
public enum T_EOF

8.53.2.2 T_NULL

include euphoria/tokenize.e
namespace tokenize
public enum T_NULL

8.53.2.3 T_SHBANG

include euphoria/tokenize.e
namespace tokenize
public enum T_SHBANG

8.53.2.4 T_NEWLINE

include euphoria/tokenize.e
namespace tokenize
public enum T_NEWLINE

8.53.2.5 T_COMMENT

include euphoria/tokenize.e
namespace tokenize
public enum T_COMMENT

8.53.2.6 T_NUMBER

include euphoria/tokenize.e
namespace tokenize
public enum T_NUMBER

8.53.2.7 T_CHAR

include euphoria/tokenize.e
namespace tokenize
public enum T_CHAR

quoted character

8.53.2.8 T_STRING

include euphoria/tokenize.e
namespace tokenize
public enum T_STRING

string

8.53.2.9 T_IDENTIFIER

include euphoria/tokenize.e
namespace tokenize
public enum T_IDENTIFIER

8.53.2.10 T_KEYWORD

include euphoria/tokenize.e
namespace tokenize
public enum T_KEYWORD

8.53.2.11 T_DOUBLE_OPS

include euphoria/tokenize.e
namespace tokenize
public enum T_DOUBLE_OPS

8.53.2.12 T_PLUSEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_PLUSEQ

8.53.2.13 T_MINUSEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_MINUSEQ

8.53.2.14 T_MULTIPLYEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_MULTIPLYEQ

8.53.2.15 T_DIVIDEEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_DIVIDEEQ

8.53.2.16 T_LTEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_LTEQ

8.53.2.17 T_GTEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_GTEQ

8.53.2.18 T_NOTEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_NOTEQ

8.53.2.19 T_CONCATEQ

include euphoria/tokenize.e
namespace tokenize
public enum T_CONCATEQ

8.53.2.20 T_DELIMITER

include euphoria/tokenize.e
namespace tokenize
public enum T_DELIMITER

8.53.2.21 T_PLUS

include euphoria/tokenize.e
namespace tokenize
public enum T_PLUS

8.53.2.22 T_MINUS

include euphoria/tokenize.e
namespace tokenize
public enum T_MINUS

8.53.2.23 T_MULTIPLY

include euphoria/tokenize.e
namespace tokenize
public enum T_MULTIPLY

8.53.2.24 T_DIVIDE

include euphoria/tokenize.e
namespace tokenize
public enum T_DIVIDE

8.53.2.25 T_LT

include euphoria/tokenize.e
namespace tokenize
public enum T_LT

8.53.2.26 T_GT

include euphoria/tokenize.e
namespace tokenize
public enum T_GT

8.53.2.27 T_NOT

include euphoria/tokenize.e
namespace tokenize
public enum T_NOT

8.53.2.28 T_CONCAT

include euphoria/tokenize.e
namespace tokenize
public enum T_CONCAT

8.53.2.29 T_SINGLE_OPS

include euphoria/tokenize.e
namespace tokenize
public enum T_SINGLE_OPS

8.53.2.30 T_EQ

include euphoria/tokenize.e
namespace tokenize
public enum T_EQ

8.53.2.31 T_LPAREN

include euphoria/tokenize.e
namespace tokenize
public enum T_LPAREN

8.53.2.32 T_RPAREN

include euphoria/tokenize.e
namespace tokenize
public enum T_RPAREN

8.53.2.33 T_LBRACE

include euphoria/tokenize.e
namespace tokenize
public enum T_LBRACE

8.53.2.34 T_RBRACE

include euphoria/tokenize.e
namespace tokenize
public enum T_RBRACE

8.53.2.35 T_LBRACKET

include euphoria/tokenize.e
namespace tokenize
public enum T_LBRACKET

8.53.2.36 T_RBRACKET

include euphoria/tokenize.e
namespace tokenize
public enum T_RBRACKET

8.53.2.37 T_QPRINT

include euphoria/tokenize.e
namespace tokenize
public enum T_QPRINT

8.53.2.38 T_COMMA

include euphoria/tokenize.e
namespace tokenize
public enum T_COMMA

8.53.2.39 T_PERIOD

include euphoria/tokenize.e
namespace tokenize
public enum T_PERIOD

8.53.2.40 T_COLON

include euphoria/tokenize.e
namespace tokenize
public enum T_COLON

8.53.2.41 T_DOLLAR

include euphoria/tokenize.e
namespace tokenize
public enum T_DOLLAR

8.53.2.42 T_SLICE

include euphoria/tokenize.e
namespace tokenize
public enum T_SLICE

8.53.2.43 T_WHITE

include euphoria/tokenize.e
namespace tokenize
public enum T_WHITE

8.53.2.44 T_BUILTIN

include euphoria/tokenize.e
namespace tokenize
public enum T_BUILTIN

8.53.2.45 T_TEXT

include euphoria/tokenize.e
namespace tokenize
public enum T_TEXT

8.53.2.46 TF_HEX

include euphoria/tokenize.e
namespace tokenize
public enum TF_HEX

8.53.3 T_NUMBER formats and T_types

8.53.3.1 TF_INT

include euphoria/tokenize.e
namespace tokenize
public enum TF_INT

8.53.3.2 TF_ATOM

include euphoria/tokenize.e
namespace tokenize
public enum TF_ATOM

8.53.3.3 TF_STRING_SINGLE

include euphoria/tokenize.e
namespace tokenize
public enum TF_STRING_SINGLE

8.53.3.4 TF_STRING_TRIPLE

include euphoria/tokenize.e
namespace tokenize
public enum TF_STRING_TRIPLE

8.53.3.5 TF_STRING_BACKTICK

include euphoria/tokenize.e
namespace tokenize
public enum TF_STRING_BACKTICK

8.53.3.6 TF_STRING_HEX

include euphoria/tokenize.e
namespace tokenize
public enum TF_STRING_HEX

8.53.3.7 TF_COMMENT_SINGLE

include euphoria/tokenize.e
namespace tokenize
public enum TF_COMMENT_SINGLE

8.53.3.8 TF_COMMENT_MULTIPLE

include euphoria/tokenize.e
namespace tokenize
public enum TF_COMMENT_MULTIPLE

8.53.4 Token accessors

8.53.4.1 TTYPE

include euphoria/tokenize.e
namespace tokenize
public enum TTYPE

8.53.4.2 TDATA

include euphoria/tokenize.e
namespace tokenize
public enum TDATA

8.53.4.3 TLNUM

include euphoria/tokenize.e
namespace tokenize
public enum TLNUM

8.53.4.4 TLPOS

include euphoria/tokenize.e
namespace tokenize
public enum TLPOS

8.53.4.5 TFORM

include euphoria/tokenize.e
namespace tokenize
public enum TFORM
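
As a rough sketch of how the accessors might be used, the loop below reads the type, line, and position of each token. It assumes that each token is a sequence indexed by these constants and that token_names can be indexed by a token's TTYPE value; neither is stated explicitly in this section.

```euphoria
include euphoria/tokenize.e

-- Tokenize a short snippet using the default tokenizer state.
sequence result = tokenize:tokenize_string("x = y + 1\n")
sequence tokens = result[ET_TOKENS]
sequence tok

for i = 1 to length(tokens) do
    tok = tokens[i]
    -- Assumption: token_names is indexed by the token type.
    printf(1, "%s at line %d, position %d\n",
        {token_names[tok[TTYPE]], tok[TLNUM], tok[TLPOS]})
end for
```

TDATA is left out of the printf call on purpose: its type depends on the token (atom for T_NUMBER by default, string for most others), so a single format specifier would not fit every token.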

8.53.5 ET error codes

8.53.5.1 ERR_NONE

include euphoria/tokenize.e
namespace tokenize
public enum ERR_NONE

8.53.5.2 ERR_OPEN

include euphoria/tokenize.e
namespace tokenize
public enum ERR_OPEN

8.53.5.3 ERR_ESCAPE

include euphoria/tokenize.e
namespace tokenize
public enum ERR_ESCAPE

8.53.5.4 ERR_EOL_CHAR

include euphoria/tokenize.e
namespace tokenize
public enum ERR_EOL_CHAR

8.53.5.5 ERR_CLOSE_CHAR

include euphoria/tokenize.e
namespace tokenize
public enum ERR_CLOSE_CHAR

8.53.5.6 ERR_EOL_STRING

include euphoria/tokenize.e
namespace tokenize
public enum ERR_EOL_STRING

8.53.5.7 ERR_HEX

include euphoria/tokenize.e
namespace tokenize
public enum ERR_HEX

8.53.5.8 ERR_DECIMAL

include euphoria/tokenize.e
namespace tokenize
public enum ERR_DECIMAL

8.53.5.9 ERR_UNKNOWN

include euphoria/tokenize.e
namespace tokenize
public enum ERR_UNKNOWN

8.53.5.10 ERR_EOF

include euphoria/tokenize.e
namespace tokenize
public enum ERR_EOF

8.53.5.11 ERR_EOF_STRING

include euphoria/tokenize.e
namespace tokenize
public enum ERR_EOF_STRING

8.53.5.12 ERR_HEX_STRING

include euphoria/tokenize.e
namespace tokenize
public enum ERR_HEX_STRING

8.53.5.13 error_string

include euphoria/tokenize.e
namespace tokenize
public function error_string(integer err)

Get an error message string for a given error code.
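
A minimal sketch of reporting a tokenizer error in readable form. The input is deliberately malformed (an unclosed string literal); which error code it yields is not specified here, so the code only checks for any error at all.

```euphoria
include euphoria/tokenize.e

-- A string literal that is never closed.
sequence result = tokenize:tokenize_string("puts(1, \"unterminated\n")

if result[ET_ERROR] != ERR_NONE then
    -- 2 is the standard error file handle.
    printf(2, "Tokenizer error: %s (line %d, column %d)\n",
        {tokenize:error_string(result[ET_ERROR]),
         result[ET_ERR_LINE], result[ET_ERR_COLUMN]})
end if
```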

8.53.5.14 new

include euphoria/tokenize.e
namespace tokenize
public function new()

Create a new tokenizer state.

See Also:

reset, tokenize_string, tokenize_file

8.53.5.15 reset

include euphoria/tokenize.e
namespace tokenize
public procedure reset(atom state = g_state)

Reset the state to begin parsing a new file.

See Also:

new, tokenize_string, tokenize_file
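
The following sketch shows one way new and reset might be combined: a private state is created, used, and then reset before tokenizing unrelated source text. The variable names are illustrative only.

```euphoria
include euphoria/tokenize.e

-- Create a tokenizer state separate from the default one.
atom state = tokenize:new()

sequence first = tokenize:tokenize_string("a = 1\n", state)

-- Reset the state before moving on to unrelated source text.
tokenize:reset(state)
sequence second = tokenize:tokenize_string("b = 2\n", state)
```

Keeping a separate state means options set on it (see the get/set options below) do not affect code that relies on the default state.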

8.53.6 get/set options

8.53.6.1 keep_builtins

include euphoria/tokenize.e
namespace tokenize
public procedure keep_builtins(integer val = 1, atom state = g_state)

Specify whether builtins are identified specially.

Default is FALSE.

8.53.6.2 keep_keywords

include euphoria/tokenize.e
namespace tokenize
public procedure keep_keywords(integer val = 1, atom state = g_state)

Specify whether keywords are identified specially.

Default is TRUE.

8.53.6.3 keep_whitespace

include euphoria/tokenize.e
namespace tokenize
public procedure keep_whitespace(integer val = 1, atom state = g_state)

Return whitespace (other than newlines) as tokens.

Default is FALSE.

8.53.6.4 keep_newlines

include euphoria/tokenize.e
namespace tokenize
public procedure keep_newlines(integer val = 1, atom state = g_state)

Return newlines as tokens.

Default is FALSE.

8.53.6.5 keep_comments

include euphoria/tokenize.e
namespace tokenize
public procedure keep_comments(integer val = 1, atom state = g_state)

Return comments as tokens.

Default is FALSE.

8.53.6.6 return_literal_string

include euphoria/tokenize.e
namespace tokenize
public procedure return_literal_string(integer val = 1, atom state = g_state)

String tokens can be returned either as their processed value or as the literal text that made up the original string.

Currently, this option only affects the processing of hex strings.

Default is FALSE: the string is processed and its value is returned.

8.53.6.7 string_strip_quotes

include euphoria/tokenize.e
namespace tokenize
public procedure string_strip_quotes(integer val = 1, atom state = g_state)

Specify whether the quotes are stripped from string tokens when they are returned.

Default is TRUE.

8.53.6.8 string_numbers

include euphoria/tokenize.e
namespace tokenize
public procedure string_numbers(integer val = 1, atom state = g_state)

Return TDATA for all T_NUMBER tokens in "string" format.

Defaults:
  • T_NUMBER tokens return atoms
  • T_CHAR tokens return single integer chars
  • T_EOF tokens return undefined data
  • Other tokens return strings
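
The option procedures above could be applied to a private state as sketched below. The choice of options is arbitrary and only meant to show the calling pattern.

```euphoria
include euphoria/tokenize.e

atom state = tokenize:new()

-- Keep comments and newlines, which are dropped by default.
tokenize:keep_comments(1, state)
tokenize:keep_newlines(1, state)
-- Ask for T_NUMBER data as strings rather than atoms.
tokenize:string_numbers(1, state)

sequence result = tokenize:tokenize_string("x = 10 -- ten\n", state)
```

Passing 0 as the first argument turns an option off again; omitting the state argument applies the option to the default state.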

8.53.6.9 multiline_token

include euphoria/tokenize.e
namespace tokenize
public type multiline_token(object mlt)

8.53.6.10 last_multiline_token

include euphoria/tokenize.e
namespace tokenize
public function last_multiline_token()

Returns:

One of 0, TF_COMMENT_MULTIPLE, TF_STRING_BACKTICK, TF_STRING_TRIPLE.

Comments:

After calling tokenize_string, this function will return a value of 0 if the line did not end in the middle of a multiline construct, or the value for the respective token. This is meant to facilitate proper tokenizing of individual lines of code.
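
A hedged sketch of line-by-line tokenizing, as described in the comment above: the value from last_multiline_token is fed back into the next call to tokenize_string as its multi argument. The sample lines are made up, and no particular token counts are claimed.

```euphoria
include euphoria/tokenize.e

sequence lines = {
    "/* a comment that starts here",
    "   and ends here */ x = 1"
}

atom state = tokenize:new()
integer multi = 0   -- 0 means "not inside a multiline construct"
sequence result

for i = 1 to length(lines) do
    -- Pass in the construct, if any, left open by the previous line.
    -- The third argument (1) is the default stop_on_error value,
    -- given only so that the fourth parameter can be reached.
    result = tokenize:tokenize_string(lines[i], state, 1, multi)
    multi = tokenize:last_multiline_token()
    printf(1, "line %d: %d tokens, multiline state %d\n",
        {i, length(result[ET_TOKENS]), multi})
end for
```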

8.53.7 Routines

8.53.7.1 tokenize_string

include euphoria/tokenize.e
namespace tokenize
public function tokenize_string(sequence code, atom state = g_state,
        integer stop_on_error = TRUE, multiline_token multi = 0)

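Tokenize Euphoria source code held in a string.

Parameters:
  1. code - the code to be tokenized
  2. state - (default g_state) the tokenizer returned by new
  3. stop_on_error - (default TRUE) whether to stop tokenizing at the first error
  4. multi - (default 0) one of 0, TF_COMMENT_MULTIPLE, TF_STRING_BACKTICK, TF_STRING_TRIPLE
Returns:

A sequence indexed by the tokenize return sequence key: the token stream (ET_TOKENS), the error code (ET_ERROR), and the line and column of the error (ET_ERR_LINE, ET_ERR_COLUMN).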
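
A minimal usage sketch, assuming the return value is indexed with the ET_ constants from the tokenize return sequence key:

```euphoria
include euphoria/tokenize.e

sequence result = tokenize:tokenize_string("integer count = 0\n")

if result[ET_ERROR] = ERR_NONE then
    printf(1, "%d tokens\n", {length(result[ET_TOKENS])})
else
    printf(2, "error %d at line %d, column %d\n",
        {result[ET_ERROR], result[ET_ERR_LINE], result[ET_ERR_COLUMN]})
end if
```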

8.53.7.2 tokenize_file

include euphoria/tokenize.e
namespace tokenize
public function tokenize_file(sequence fname, atom state = g_state,
        integer mode = io:BINARY_MODE)

Tokenize Euphoria source code read from a file.

Parameters:
  1. fname - the file to be read and tokenized
  2. state - (default g_state) the tokenizer returned by new
  3. mode - the mode in which to open the file. One of: io:BINARY_MODE (default) or io:TEXT_MODE. Note that for large files with Windows line endings, text mode may be much slower. See io:read_file for more information.
Returns:

A sequence indexed by the tokenize return sequence key: the token stream (ET_TOKENS), the error code (ET_ERROR), and the line and column of the error (ET_ERR_LINE, ET_ERR_COLUMN).
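
A short sketch of tokenizing a file in text mode. The file name example.ex is hypothetical, and the code reports whatever error comes back rather than assuming a particular code for a missing file.

```euphoria
include std/io.e
include euphoria/tokenize.e

atom state = tokenize:new()

-- "example.ex" is a hypothetical file name used for illustration.
sequence result = tokenize:tokenize_file("example.ex", state, io:TEXT_MODE)

if result[ET_ERROR] != ERR_NONE then
    printf(2, "example.ex: %s\n", {tokenize:error_string(result[ET_ERROR])})
else
    printf(1, "example.ex: %d tokens\n", {length(result[ET_TOKENS])})
end if
```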

8.53.8 Debugging

8.53.8.1 token_names

include euphoria/tokenize.e
namespace tokenize
public constant token_names

Sequence containing token names for debugging

8.53.8.2 token_forms

include euphoria/tokenize.e
namespace tokenize
public constant token_forms

8.53.8.3 show_tokens

include euphoria/tokenize.e
namespace tokenize
public procedure show_tokens(integer fh, sequence tokens)

Print token names and data for each token in `tokens` to the file handle `fh`.

Parameters:
  • fh - file handle to print information to
  • tokens - token sequence to print
Comments:

show_tokens does not accept the direct return value of tokenize_string or tokenize_file. Pass it only the first element of that return value (ET_TOKENS), which is the token stream.

See Also:

tokenize_string, tokenize_file
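
For instance, the token stream can be extracted with ET_TOKENS before it is handed to show_tokens. This is a sketch only; the exact layout of the printed output is not described here.

```euphoria
include euphoria/tokenize.e

sequence result = tokenize:tokenize_string("? 1 + 2\n")

-- Pass only the token stream, not the whole return value.
-- 1 is the standard output file handle.
tokenize:show_tokens(1, result[ET_TOKENS])
```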