
lexer.h File Reference

Structures and functions for separating a character buffer into lexemes.

Data Structures

struct Lexeme
 Stores a lexeme.
struct LexemeList
 Stores a list of lexemes.

Functions

Lexeme * addLexeme (LexemeList *, Lexeme *)
 Adds a Lexeme structure to a LexemeList structure.
Lexeme * createLexeme (char *, const char *, unsigned int)
 Creates a Lexeme structure.
LexemeList * createLexemeList (void)
 Creates a LexemeList structure.
void deleteLexeme (Lexeme *)
 Deletes a Lexeme structure.
void deleteLexemeList (LexemeList *)
 Deletes a LexemeList structure.
LexemeList * scanBuffer (const char *, unsigned int, const char *)
 Scans through a character buffer, removing unnecessary characters and generating lexemes.

Detailed Description

Structures and functions for separating a character buffer into lexemes.

The lexer reads through a buffer of characters (themselves typically read from standard input), strips whitespace, and breaks them up into logical atoms of character strings which, in turn, may be passed on to later processes (such as a tokenizer).

Author:
Justin J. Meza
Date:
2010

Function Documentation

Lexeme* addLexeme ( LexemeList *  list,
Lexeme *  lexeme 
)

Adds a Lexeme structure to a LexemeList structure.

Precondition:
list was created by createLexemeList(void).
lexeme was created by createLexeme(char *, const char *, unsigned int).
Postcondition:
lexeme will be added on to the end of list and the size of list will be updated accordingly.
Returns:
A pointer to the added Lexeme structure (will be the same as lexeme).
Return values:
NULL realloc was unable to allocate memory.
Parameters:
[in,out] list A pointer to the LexemeList structure to add lexeme to.
[in] lexeme A pointer to the Lexeme structure to add to list.
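
The contract above (append to the end, update the size, return NULL if realloc fails) can be sketched as follows. This is a minimal illustration, not the author's implementation: the field names image, fname, line, num, and lexemes are assumptions, since this page does not list the members of Lexeme or LexemeList.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layouts; the real members are defined in lexer.h. */
typedef struct {
    char *image;         /* characters that make up the lexeme */
    const char *fname;   /* shared file name, not owned by the lexeme */
    unsigned int line;   /* source line the lexeme occurred on */
} Lexeme;

typedef struct {
    unsigned int num;    /* number of lexemes currently stored */
    Lexeme **lexemes;    /* growable array of pointers to lexemes */
} LexemeList;

/* Appends lexeme to the end of list; returns lexeme, or NULL on failure. */
Lexeme *addLexeme(LexemeList *list, Lexeme *lexeme)
{
    assert(list != NULL && lexeme != NULL);

    /* Grow into a temporary so the old array survives a failed realloc. */
    Lexeme **grown = realloc(list->lexemes,
                             sizeof(Lexeme *) * (list->num + 1));
    if (grown == NULL)
        return NULL;     /* list is left unchanged */

    list->lexemes = grown;
    list->lexemes[list->num] = lexeme;
    list->num++;
    return lexeme;
}
```

Assigning the result of realloc to a temporary first means that a failed call leaves the list intact, so the caller can still pass it to deleteLexemeList.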
Lexeme* createLexeme ( char *  image,
const char *  fname,
unsigned int  line 
)

Creates a Lexeme structure.

Returns:
A pointer to a Lexeme structure with the desired properties.
Return values:
NULL malloc was unable to allocate memory.
See also:
deleteLexeme(Lexeme *)

Note:
fname is not copied; only one copy is stored and shared by all Lexeme structures that refer to it.

Parameters:
[in] image An array of characters that describe the lexeme.
[in] fname A pointer to the name of the file containing the lexeme.
[in] line The line number from the source file that the lexeme occurred on.
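
The ownership rule described in the note can be illustrated with a short sketch. It assumes that image is copied into storage owned by the Lexeme (the page only states that fname is not copied) and uses a hypothetical struct layout; treat it as one possible reading of the documentation rather than the actual source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the real members are defined in lexer.h. */
typedef struct {
    char *image;         /* private copy, owned by the lexeme (assumption) */
    const char *fname;   /* shared pointer, owned by the caller */
    unsigned int line;
} Lexeme;

/* Creates a Lexeme; returns NULL if either allocation fails. */
Lexeme *createLexeme(char *image, const char *fname, unsigned int line)
{
    assert(image != NULL);

    Lexeme *lexeme = malloc(sizeof(Lexeme));
    if (lexeme == NULL)
        return NULL;

    lexeme->image = malloc(strlen(image) + 1);
    if (lexeme->image == NULL) {
        free(lexeme);    /* do not leak the partially built structure */
        return NULL;
    }
    strcpy(lexeme->image, image);

    lexeme->fname = fname;   /* stored as-is: every lexeme shares one copy */
    lexeme->line = line;
    return lexeme;
}

/* Frees the lexeme and its image, but never the shared fname. */
void deleteLexeme(Lexeme *lexeme)
{
    if (lexeme == NULL)
        return;
    free(lexeme->image);
    free(lexeme);
}
```

Because fname is only borrowed, whoever allocated the file name must keep it alive for as long as any Lexeme refers to it and free it afterwards.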
LexemeList* createLexemeList ( void   ) 

Creates a LexemeList structure.

Returns:
A pointer to a LexemeList structure with the desired properties.
Return values:
NULL malloc was unable to allocate memory.
See also:
deleteLexemeList(LexemeList *)
void deleteLexeme ( Lexeme *  lexeme  ) 

Deletes a Lexeme structure.

Precondition:
lexeme points to a Lexeme structure created by createLexeme(char *, const char *, unsigned int).
Postcondition:
The memory at lexeme and all of its elements will be freed.
See also:
createLexeme(char *, const char *, unsigned int)

Note:
lexeme->fname is not freed because it is shared between many Lexeme structures and is freed by whoever created them.

void deleteLexemeList ( LexemeList *  list  ) 

Deletes a LexemeList structure.

Precondition:
list was created by createLexemeList(void) and contains items added by addLexeme(LexemeList *, Lexeme *).
Postcondition:
The memory at list and any of its associated members will be freed.
See also:
createLexemeList(void)
Parameters:
[in,out] list A pointer to the LexemeList structure to delete.
LexemeList* scanBuffer ( const char *  buffer,
unsigned int  size,
const char *  fname 
)

Scans through a character buffer, removing unnecessary characters and generating lexemes.

Lexemes are separated by whitespace (but newline characters are kept as their own lexemes). String literals are handled a bit differently: starting at the first quotation character, characters are collected until either an unescaped quotation character is read (that is, any quotation character other than one preceded by a colon that is not itself preceded by a colon) or a newline or carriage return character is read, whichever comes first. This handles the odd case of strings such as "::" which print out a single colon. Also handled are the effects of commas, ellipses, and bangs (!).

Precondition:
size is the number of characters starting at the memory location pointed to by buffer.
Returns:
A pointer to a LexemeList structure.
Parameters:
[in] buffer An array of characters to scan into lexemes.
[in] size The number of characters in buffer.
[in] fname An array of characters representing the name of the file used to read buffer.
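
The string-literal rule described above can be sketched in isolation. The helper below, stringLiteralLength, is a hypothetical name that does not appear in lexer.h; it follows the wording of the description literally (a quotation character is escaped only when the previous character is a colon and the one before that is not) and is meant as an illustration rather than the scanner's actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the length of the string literal that begins at buffer[start],
 * which must be a quotation character. The length includes the closing
 * quotation character when one is found, and stops short of a newline or
 * carriage return when the literal is unterminated.
 */
size_t stringLiteralLength(const char *buffer, size_t size, size_t start)
{
    assert(buffer != NULL && start < size && buffer[start] == '"');

    size_t len = 1;  /* count the opening quotation character */
    while (start + len < size) {
        char c = buffer[start + len];

        if (c == '\n' || c == '\r')
            break;   /* unterminated literal ends before the line break */

        if (c == '"') {
            /*
             * The second lookback is safe: the first test can only succeed
             * at an index past the opening quotation character.
             */
            int escaped = buffer[start + len - 1] == ':'
                       && buffer[start + len - 2] != ':';
            if (!escaped) {
                len++;   /* include the closing quotation character */
                break;
            }
        }
        len++;
    }
    return len;
}
```

For the four-character buffer "::" the function returns 4: the final quotation character follows a colon that is itself preceded by a colon, so it closes the literal instead of being treated as escaped.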