Source code to token conversion.

Functions
size_t	tokcnt (const char *const line)
	Counts the number of tokens in a given string (or file content).

void	toknz_segtoset (tokset_t *const set, const size_t token_index, const char *const line, const size_t start, const size_t end, const size_t line_no, const tokcat_e category, const size_t column)
	Tokenizes a segment of a line and stores the resulting token in the token set.

tokset_t *	toknz (const char *const line)
	Tokenizes a line (or multiple lines of code) into a set of tokens.

Detailed Description

Source code to token conversion.

Function Documentation

◆ tokcnt()

size_t tokcnt ( const char *const line )

Counts the number of tokens in a given string (or file content).

This function counts the tokens in the provided text, which may be a single line or the entire content of a file. It recognizes string literals, character literals, operators, punctuation, preprocessor directives, and identifiers, keywords, and other literals. The string is scanned character by character, and escape sequences inside string and character literals are taken into account.

Parameters

line	A constant pointer to a string containing the code to be analyzed. This string can represent a single line of code or the entire content of a file.

Returns: The total number of tokens found in the given string.

Handles:

String/character literals (with escape sequences)
Operators and punctuation
Identifiers/keywords
Preprocessor directives
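
The character-by-character scan described above can be illustrated with a minimal sketch. This is not the library's implementation: `sketch_tokcnt` and `skip_quoted` are hypothetical names, and the rules are deliberately simplified (every operator or punctuation character counts as one token, so `==` counts as two, and a preprocessor directive such as `#include` is counted as `#` plus a word).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: s[i] is an opening quote. Returns the index just past
 * the closing quote. A backslash escapes the next character, so \" inside a
 * string literal does not terminate it. */
static size_t skip_quoted(const char *s, size_t i, char quote)
{
    i++;                                    /* step over the opening quote */
    while (s[i] != '\0' && s[i] != quote) {
        if (s[i] == '\\' && s[i + 1] != '\0')
            i++;                            /* skip the escaped character */
        i++;
    }
    return s[i] == quote ? i + 1 : i;       /* tolerate an unterminated literal */
}

/* Simplified stand-in for tokcnt(): counts tokens in a C-like string. */
size_t sketch_tokcnt(const char *const line)
{
    size_t count = 0;
    size_t i = 0;

    while (line[i] != '\0') {
        unsigned char c = (unsigned char)line[i];

        if (isspace(c)) {                   /* whitespace only separates tokens */
            i++;
        } else if (c == '"' || c == '\'') { /* string or character literal */
            i = skip_quoted(line, i, (char)c);
            count++;
        } else if (isalnum(c) || c == '_') {/* identifier, keyword, or number */
            while (isalnum((unsigned char)line[i]) || line[i] == '_')
                i++;
            count++;
        } else {                            /* any other character: one token */
            i++;
            count++;
        }
    }
    return count;
}
```

Under these simplified rules, `sketch_tokcnt("int x = 42;")` yields 5, and a string literal containing an escaped quote still counts as a single token.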

◆ toknz()

tokset_t * toknz ( const char *const line )

Tokenizes a line (or multiple lines of code) into a set of tokens.

This function processes the given string (or entire code block) and divides it into individual tokens. It supports string literals, character literals, operators, punctuation, separators, preprocessor directives, identifiers, and keywords. The tokens are stored in the token set that the function returns.

The function uses toknz_segtoset to tokenize individual segments of the line and store them in the token set. It ensures that each token is associated with its type, position (line number, column), and appropriate category (e.g., literal, operator, identifier).

If the number of tokens in the resulting set does not match the expected token count, the function reports an error and returns NULL. It processes both single lines and whole files, depending on the input provided.

Parameters

line	A constant pointer to the string (or code block) to be tokenized. This can represent a single line or the entire content of a file.

Returns: A pointer to the token set (tokset_t) containing all identified tokens. Returns NULL if an error occurs during tokenization (e.g., mismatch in expected token count).

Return values

NULL	on tokenization error

Processes:

Multi-line inputs
All token types
Position information (line number and column) for each token
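
The count, fill, and verify flow described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the `sketch_*` types and functions are hypothetical, tokens are simply whitespace-separated words, and line numbers and columns are assumed to be 1-based.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token and token-set layouts, for illustration only. */
typedef struct {
    char  *text;     /* heap-allocated copy of the token text */
    size_t line_no;  /* 1-based line number (assumption) */
    size_t column;   /* 1-based column of the first character (assumption) */
} sketch_tok_t;

typedef struct {
    sketch_tok_t *toks;
    size_t        count;
} sketch_tokset_t;

/* Pass 1: count whitespace-separated words. */
static size_t sketch_count(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        while (isspace((unsigned char)*s)) s++;
        if (*s == '\0') break;
        n++;
        while (*s != '\0' && !isspace((unsigned char)*s)) s++;
    }
    return n;
}

void sketch_free(sketch_tokset_t *set)
{
    if (set == NULL) return;
    for (size_t i = 0; i < set->count; i++) free(set->toks[i].text);
    free(set->toks);
    free(set);
}

/* Pass 2: fill the set, then verify the stored count against pass 1.
 * Returns NULL on allocation failure or on a count mismatch. */
sketch_tokset_t *sketch_toknz(const char *const src)
{
    const size_t expected = sketch_count(src);
    sketch_tokset_t *set = calloc(1, sizeof *set);
    if (set == NULL) return NULL;
    set->toks = calloc(expected ? expected : 1, sizeof *set->toks);
    if (set->toks == NULL) { free(set); return NULL; }

    size_t line_no = 1, column = 1, i = 0;
    while (src[i] != '\0') {
        if (src[i] == '\n') { line_no++; column = 1; i++; continue; }
        if (isspace((unsigned char)src[i])) { column++; i++; continue; }

        const size_t start = i;
        while (src[i] != '\0' && !isspace((unsigned char)src[i])) i++;
        const size_t len = i - start;

        /* Found more tokens than pass 1 counted: report a mismatch. */
        if (set->count >= expected) { sketch_free(set); return NULL; }
        char *text = malloc(len + 1);
        if (text == NULL) { sketch_free(set); return NULL; }
        memcpy(text, src + start, len);
        text[len] = '\0';
        set->toks[set->count++] = (sketch_tok_t){ text, line_no, column };
        column += len;
    }
    /* Found fewer tokens than pass 1 counted: report a mismatch. */
    if (set->count != expected) { sketch_free(set); return NULL; }
    return set;
}
```

A caller would check the result for NULL before reading `count` or `toks`, and release the set with `sketch_free` when done. How a real `tokset_t` is released is not covered by this page.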

◆ toknz_segtoset()

void toknz_segtoset	(	tokset_t *const	set,
		const size_t	token_index,
		const char *const	line,
		const size_t	start,
		const size_t	end,
		const size_t	line_no,
		const tokcat_e	category,
		const size_t	column )

Tokenizes a segment of a line and stores the resulting token in the token set.

This function extracts a substring from the given line, determines its token type based on the specified token category, and stores the token in the provided token set at the specified index. It handles memory allocation for the substring, creates a token using the tok_ptor constructor, and stores the token in the set->toks array. If memory allocation fails, it sets the token to NULL.

Parameters

set	A pointer to the `tokset_t` object in which the token will be stored.
token_index	The index in the token set where the token will be stored.
line	A constant pointer to the string containing the source line of code.
start	The starting index of the token within the line.
end	The ending index of the token within the line.
line_no	The line number where the token appears in the source code.
category	The category of the token, from `tokcat_e` (e.g., literals, symbols).
column	The starting column number of the token within the line.

Note: Allocates memory for the token value.

Warning: Sets the token to NULL if memory allocation fails.
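
The extract-and-store step can be sketched as below. The `seg_*` names and struct layouts are hypothetical, `category` is reduced to a plain `int`, and the sketch assumes that `end` is exclusive (one past the last character), which this page does not state.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
typedef struct {
    char  *value;
    size_t line_no;
    size_t column;
    int    category;  /* stands in for tokcat_e */
} seg_tok_t;

typedef struct {
    seg_tok_t **toks;
    size_t      count;
} seg_tokset_t;

/* Copy line[start, end) into a fresh NUL-terminated buffer.
 * ASSUMPTION: `end` is exclusive. Returns NULL on failure. */
static char *seg_dup(const char *line, size_t start, size_t end)
{
    if (line == NULL || end < start) return NULL;
    const size_t len = end - start;
    char *value = malloc(len + 1);
    if (value == NULL) return NULL;
    memcpy(value, line + start, len);
    value[len] = '\0';
    return value;
}

/* Store a new token at set->toks[token_index]; the slot is left NULL if
 * any allocation fails, mirroring the warning above. */
void seg_store(seg_tokset_t *const set, const size_t token_index,
               const char *const line, const size_t start, const size_t end,
               const size_t line_no, const int category, const size_t column)
{
    set->toks[token_index] = NULL;

    char *value = seg_dup(line, start, end);
    if (value == NULL) return;

    seg_tok_t *tok = malloc(sizeof *tok);
    if (tok == NULL) { free(value); return; }

    *tok = (seg_tok_t){ value, line_no, column, category };
    set->toks[token_index] = tok;
}
```

Because a failed allocation leaves the slot NULL, callers of a function with this contract need to check each entry before dereferencing it.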
