- The Debug mode accepts command line arguments `argv[]`, which are set to default values in the VS project configuration. When running in Debug mode, the program automatically loads code from the file /demos/1.txt
- The Develop mode invokes the function `void debug()`, which is defined in /src/develop.cpp
- The Release mode is the same as the Debug mode, except that the default values are not set
Currently, CFortranTranslator uses /src/grammar/simple_lexer.cpp as the tokenizer and /src/grammar/for90.y as the parser

- `#define USE_LEX` to enable the tokenizer generated by flex from /src/grammar/for90.l
- `#undef USE_LEX` to enable simple_lexer
simple_lexer is a more flexible tokenizer, written to handle some features of Fortran that are awkward for flex, e.g. Fortran's continuation can occur in the middle of a token

    inte
    *ger :: a(10)
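
As a rough illustration of why a split token needs special care, the sketch below splices continuation lines back onto their predecessor before any tokenizing happens. It assumes the fixed-form rule (columns 1 to 5 blank, column 6 neither blank nor `0`) and that the snippet above lost its leading blanks; `is_continuation` and `join_continuations` are hypothetical names, and simple_lexer itself may handle this quite differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Fixed-form rule: a line continues the previous one when columns 1-5 are
// blank and column 6 holds any character other than a blank or '0'.
bool is_continuation(const std::string &line) {
    if (line.size() < 6) return false;
    for (int i = 0; i < 5; ++i)
        if (line[i] != ' ') return false;
    return line[5] != ' ' && line[5] != '0';
}

// Hypothetical helper: splice continuation lines onto their predecessor so
// that a token split across lines ("inte" + "ger") becomes whole again.
std::vector<std::string> join_continuations(const std::vector<std::string> &lines) {
    std::vector<std::string> out;
    for (const std::string &line : lines) {
        if (!out.empty() && is_continuation(line))
            out.back() += line.substr(6);  // keep only columns 7 and later
        else
            out.push_back(line);
    }
    return out;
}
```

Joining lines up front keeps the tokenizer itself simple; a lexer that works on the raw stream would instead have to skip the line break and continuation column in the middle of a token.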
The parser is generated by bison from /src/grammar/for90.y
The parser calls `int yylex(void)` to get one token at a time from the tokenizer
`int yylex(void)` is defined as `pure_yylex` when using flex, and as `simple_yylex` when using simple_lexer
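
One way such a switch can be wired up is with a preprocessor alias. This is only a sketch: the two function bodies are stubs, `active_tokenizer` is a hypothetical helper, and the project's actual definitions may be arranged differently.

```cpp
#include <cassert>
#include <string>

// Stub stand-ins for the two tokenizers; the real ones return token codes.
int pure_yylex() { return 1; }    // pretend: tokenizer generated by flex
int simple_yylex() { return 2; }  // pretend: hand-written simple_lexer

// #define USE_LEX   // uncomment to select the flex-generated tokenizer

#ifdef USE_LEX
#define yylex pure_yylex
#else
#define yylex simple_yylex
#endif

// Hypothetical helper reporting which tokenizer a call to yylex() reaches.
std::string active_tokenizer() {
    return yylex() == 1 ? "flex" : "simple_lexer";
}
```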
In /src/parser/parser.h, several helper macros are defined to help manage nodes. These macros expand to different values according to the memory management strategy in use. Their behaviour is controlled by defining `USE_TRIVIAL` or `USE_POINTER`, or neither of them.
- `YY2ARG` gets a `ParseNode &` from bison's arguments `$n`
  - `$n` can be `ParseNode` or `ParseNode *`
- `RETURN_NT` generates a bison result `$$` (namely `ParseNode *`) from a `ParseNode`; `RETURN_NT` does the opposite work of `YY2ARG`
  - `$$` can be `ParseNode` or `ParseNode *`
- `CLEAN_RIGHT`/`CLEAN_DELETE`/`CLEAN_REUSE` clear all of bison's arguments
  - `CLEAN_RIGHT`: a general rule which is deprecated and replaced by the following two cases
  - `CLEAN_DELETE`: tokens can't be reused in generated NT, usually are NTs
  - `CLEAN_REUSE`: tokens can be reused in generated NT, usually are Ts. `CLEAN_REUSE` tokens may or may not be reused in different situations:
    - `USE_TRIVIAL`: Reuse
    - `USE_POINTER`: Reuse
    - else: Copy
  - e.g. `case_stmt_elem : YY_CASE '(' dimen_slice ')' at_least_one_end_line suite`
    - `CLEAN_DELETE` includes `YY_CASE`, `'('`, `')'`, `at_least_one_end_line`
    - `CLEAN_REUSE` includes `dimen_slice`, `suite`
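
The idea behind `YY2ARG` and `RETURN_NT` can be sketched with overloads instead of macros. Everything below is hypothetical (the reduced `ParseNode`, `yy2arg_sketch`, `return_nt_sketch`) and assumes a pointer-based strategy; the real macros in /src/parser/parser.h are not reproduced here.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily reduced node type; the real ParseNode carries far more.
struct ParseNode {
    std::string what;
};

// Stand-ins for YY2ARG: accept either representation of $n and always
// hand back a ParseNode &.
inline ParseNode &yy2arg_sketch(ParseNode &n) { return n; }
inline ParseNode &yy2arg_sketch(ParseNode *n) { return *n; }

// Stand-in for RETURN_NT under a pointer strategy: turn a ParseNode into
// the ParseNode * that would be stored in $$. The caller owns the result.
inline ParseNode *return_nt_sketch(const ParseNode &n) { return new ParseNode(n); }
```

Hiding the representation behind one accessor is what lets grammar actions stay unchanged when the memory strategy is switched.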
This translator supports a subset of Fortran90's grammar. The grammar can be extended by adding new rules to /src/grammar/for90.y

- Declare the new token with `%token` in /src/grammar/for90.y
- Add the pattern of this token in /src/grammar/for90.l
- Add rules related to the token in /src/grammar/for90.y
- Update bytecodes and grammar tokens in /src/parser/Intent.h
- Register the keyword in /src/parser/tokenizer.cpp (if this token is a keyword)
- If the keyword is made up of more than one word, reduction conflicts may arise between the whole keyword and its prefix, so the keyword must be handled specially:
  - if the keyword is made up of non-word symbols without spaces between them, e.g. `(/` is made up of `(` and `/`, just add the whole word as a rule to for90.l
  - if the keyword is made up of words separated by spaces, like `else if`, add a new item into `forward1` in /src/parser/tokenizer.cpp; the first part `else` doesn't need to be registered as a keyword
- Update translation rules in /src/target/gen_config.h
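
The handling of space-separated keywords such as `else if` can be pictured as a one-word lookahead table. The sketch below is an assumption about the general technique only: `forward_sketch` and `merge_keywords` are hypothetical, and the real `forward1` in /src/parser/tokenizer.cpp may be laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical lookahead table: first word -> words that may follow it to
// form a single keyword.
static const std::map<std::string, std::set<std::string>> forward_sketch = {
    {"else", {"if"}},
};

// Merge "else" "if" into one "else if" token before the parser sees it.
std::vector<std::string> merge_keywords(const std::vector<std::string> &words) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        auto it = forward_sketch.find(words[i]);
        if (it != forward_sketch.end() && i + 1 < words.size() &&
            it->second.count(words[i + 1])) {
            out.push_back(words[i] + " " + words[i + 1]);
            ++i;  // the second word has been consumed
        } else {
            out.push_back(words[i]);
        }
    }
    return out;
}
```

Because the merge happens in the tokenizer, the parser never sees the prefix and the whole keyword as competing tokens.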
The for90std library implements a subset of Fortran90's intrinsic functions.

- Implement the function and include it in for90std/for90std.h
  - if a parameter is optional in Fortran, wrap it with `foroptional`, and log all parameters of this function in /gen_config.cpp
  - if the parameter is the only optional parameter, the `foroptional` wrapper can be omitted
- Update `funcname_map` in /src/target/gen_config.cpp if necessary
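
To show what wrapping an optional parameter means in practice, here is a minimal sketch. `optional_sketch` is a hypothetical stand-in for `foroptional`, whose real interface is not shown in this document, and `scale_sketch` is a made-up function rather than a Fortran intrinsic.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's `foroptional` wrapper; the real
// class may expose a different interface.
template <typename T>
class optional_sketch {
public:
    optional_sketch() : has_(false), value_() {}
    optional_sketch(const T &v) : has_(true), value_(v) {}
    bool inited() const { return has_; }
    const T &value_or(const T &fallback) const { return has_ ? value_ : fallback; }
private:
    bool has_;
    T value_;
};

// A made-up function with one required and one optional argument.
inline int scale_sketch(int x, optional_sketch<int> factor = optional_sketch<int>()) {
    return x * factor.value_or(1);
}
```

With a wrapper like this, `scale_sketch(3)` and `scale_sketch(3, 4)` both compile, and the callee can tell whether the argument was actually supplied.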
In this section, several specific generating rules are discussed

`argtable`, `dimen_slice` and `pure_paramtable` are lists of different items separated by `,`

- `argtable` is a list of `exp` (ref `is_exp()`)
- `dimen_slice` is a list of `slice` (`NT_SLICE`) or `exp`
- `pure_paramtable` is a list of `keyvalue` (`NT_KEYVALUE`/`NT_VARIABLE_ENTITY`) or `slice` or `exp`
  - `pure_paramtable` will be re-generated in `regen_function_array` in /src/target/gen_callable.cpp
- `paramtable` is `argtable` or `dimen_slice` or `pure_paramtable`

The lists are promoted as follows:

- `argtable` + `slice` = `dimen_slice`; all elements in `argtable` will be promoted to `slice` (with one child)
- `argtable` + `keyvalue` = `pure_paramtable`; all elements in `argtable` will be promoted to `keyvalue`
- `dimen_slice` + `keyvalue` or `pure_paramtable` + `slice` is illegal
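
The promotion rules above can be read as a small state transition. The sketch below encodes only those rules; the enum and function names are hypothetical and do not appear in the project.

```cpp
#include <cassert>
#include <stdexcept>

enum class ListKind { ArgTable, DimenSlice, PureParamTable };
enum class ElemKind { Exp, Slice, KeyValue };

// Kind of the list after appending one more element, following the
// promotion rules above. Throws on the combinations described as illegal.
ListKind append_kind(ListKind list, ElemKind elem) {
    if (elem == ElemKind::Exp) return list;  // exp fits into every list kind
    if (elem == ElemKind::Slice) {
        if (list == ListKind::PureParamTable)
            throw std::invalid_argument("pure_paramtable + slice is illegal");
        return ListKind::DimenSlice;  // argtable is promoted, dimen_slice stays
    }
    // Remaining case: elem is a keyvalue.
    if (list == ListKind::DimenSlice)
        throw std::invalid_argument("dimen_slice + keyvalue is illegal");
    return ListKind::PureParamTable;  // argtable is promoted, pure_paramtable stays
}
```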
(3) The suffix "-spec" is used consistently for specifiers, such as keyword actual arguments and input/output statement specifiers. It also is used for type declaration attribute specifications (for example, "array-spec" in R512), and in a few other cases.
(4) When reference is made to a type parameter, including the surrounding parentheses, the term "selector" is used. See, for example, "length-selector" (R507) and "kind-selector" (R505).
You can use `REAL(x)` to get a floating-point copy of `x`; however, you can also use `REAL(kind = 8)` to specify a floating-point type which is the same as long double rather than double, so the two usages may conflict.
To be specific, `type_name` is like `INTEGER` and `type_spec` is like `INTEGER(kind = 4)`; `type_nospec` can be the head of a callable, while `type_spec` cannot.
`NT_FUNCTIONARRAY` and `NT_HIDDENDO` will NOT be promoted to `NT_EXPRESSION`

- `stmt` is a statement ending with ';' or '\n'
- `suite` is a set of `stmt`
| rules | left side | right side |
|---|---|---|
| fortran_program | root | wrappers |
| wrappers | wrapper + | |
| wrapper | / | function_decl / program |
| function_decl | NT_FUNCTIONDECLARE | |
| var_def | NT_VARIABLEDEFINE/NT_DECLAREDVARIABLE | |
| keyvalue | NT_VARIABLE_ENTITY(namely NT_KEYVALUE) | variable, NT_EXPRESSION / NT_VARIABLEINITIALDUMMY |
| NT_VARIABLE_ENTITY | variable, exp | |
| suite | NT_SUITE | stmt |
| stmt | exp / var_def / compound_stmt / output_stmt / input_stmt / dummy_stmt / let_stmt / jump_stmt / interface_decl | |
| NT_ARRAYBUILDER_LIST | (NT_HIDDENDO / NT_FUNCTIONARRAY / exp) | |
| type_spec | type_name / (type_name, type_selector) | |
When using the lazy gen strategy, the node of the non-terminal on the left side can change the nodes of the non-terminals on the right side, which means the AST is not immutable.
- `regen_` function

  `regen_` functions are declared in /src/target/codegen.h

  A `regen_` function will change its input `ParseNode &`

- `gen_` function

  `gen_` functions are declared in /src/target/codegen.h

  A `gen_` function will not change its input `const ParseNode &`. The `gen_` function uses its input to generate and return a new `ParseNode`. The function may copy part of its input as its child nodes.

- `gen_xxx_reused` function

  `gen_xxx_reused` functions have names of the form `gen_xxx_reused` and are declared in /src/target/codegen.h

  Unlike a `gen_` function, a `gen_reused` function takes a `ParseNode &` as input. Instead of copying some of its input like `gen_` functions do, a `gen_reused` function reuses part of its input by adding pointers to it directly.

See parser:macros for more
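
The difference between the three families is easiest to see side by side. The sketch below uses a hypothetical reduced `Node` with shared child pointers; the function names are illustrative and the real signatures in /src/target/codegen.h take more arguments.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical reduced node; children are shared so that reuse is visible.
struct Node {
    std::string what;
    std::vector<std::shared_ptr<Node>> child;
};

// regen_ style: rewrites its input in place.
void regen_sketch(Node &n) { n.what = "(" + n.what + ")"; }

// gen_ style: input is const; the result holds a copy of the first child.
Node gen_sketch(const Node &in) {
    Node out;
    out.what = "wrap";
    out.child.push_back(std::make_shared<Node>(*in.child.at(0)));
    return out;
}

// gen_xxx_reused style: the result points at the very same child object.
Node gen_sketch_reused(Node &in) {
    Node out;
    out.what = "wrap";
    out.child.push_back(in.child.at(0));
    return out;
}
```

Reuse avoids a copy, at the price that later changes to the child are visible through both parents.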
Due to Fortran's implicit declaration feature, code above `stmt` level, including `function_decl` and `program`, can only be re-generated with correct types after the whole AST is built, in the following steps:

- `gen_fortran_program` handles `program` and `function_decl`
- `regen_suite` handles the `suite` rule
- `regen_common` generates `common` statement code
The implicit declaration feature should also be considered when generating variables and functions, see variable definition/generate functions.
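
For readers unfamiliar with implicit declaration, Fortran's default rule types an undeclared variable by the first letter of its name: I through N give INTEGER, anything else gives REAL. The sketch below implements only that default rule; `implicit_type_sketch` is a hypothetical name, and the project's `gen_implicit_type` may return a different representation.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Fortran's default implicit rule: names starting with I..N are INTEGER,
// every other letter gives REAL. The returned strings are illustrative only.
std::string implicit_type_sketch(const std::string &name) {
    if (name.empty()) return "";
    char c = static_cast<char>(std::toupper(static_cast<unsigned char>(name[0])));
    return (c >= 'I' && c <= 'N') ? "INTEGER" : "REAL";
}
```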
Many type names and function names are mapped in order to avoid possible conflicts. The mapping is defined by `pre_map` and `funcname_map` in /src/target/gen_config.h
`pre_map` and `funcname_map` have different usages:

- Mappings defined in `pre_map` are checked during tokenizing in /src/grammar/for90.l. In this stage, keywords and operators are replaced
- Mappings defined in `funcname_map` are checked when calling `regen_function_array` in /src/target/gen_callable.cpp. In this stage, only intrinsic functions' names are replaced, to avoid possible conflicts with other functions.
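
A name mapping of this kind boils down to a table lookup with a pass-through default. The sketch below shows that shape; the table entry and the names `funcname_map_sketch` and `map_name` are hypothetical and are not taken from /src/target/gen_config.h.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical entry; the real funcname_map lists the project's own names.
static const std::map<std::string, std::string> funcname_map_sketch = {
    {"max", "for_max_sketch"},
};

// Return the mapped name if one exists, otherwise leave the name untouched.
std::string map_name(const std::map<std::string, std::string> &table,
                     const std::string &name) {
    auto it = table.find(name);
    return it == table.end() ? name : it->second;
}
```

Names without an entry pass through unchanged, so user-defined functions keep their original spelling.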
Currently, variables are generated lazily, so the whole process happens after the AST is built.
- Step 1:

  - Case 1: when encountering an `UnknownVariant` during parsing (`regen_exp`, after the AST is built):

    **Variables are registered to the symbol table mostly in this case**

    All variables, once reached by the parser, will be registered to the symbol table by calling `check_implicit_variable`. `check_implicit_variable` checks whether this variable has already been registered to `gen_context().variables` by the other cases. If this variable hasn't been registered, `check_implicit_variable` will register it to `gen_context().variables` by:

    - Add a `VariableInfo` node
    - `.type` is deduced from its name in the function `gen_implicit_type`
    - Set `.implicit_defined` = `true`; if an explicit declaration of this variable is found later, `.implicit_defined` will be set back to `false` automatically.
    - `.vardef` is a pointer (`!= nullptr`) to a `ParseNode` node.
    - Set `commonblock_name` = `""` and `commonblock_index` = `0`; if this variable is found to belong to a common block, these two fields will be set.
  - Case 2: when encountering `NT_COMMONBLOCK` (in `regen_stmt`, after the AST is built):

    This is an explicit definition

    Call `regen_common` to register `VariableInfo` into `gen_context().variables` and mark `commonblock_name` and `commonblock_index`
  - Case 3: when encountering `NT_VARIABLEDEFINESET` and `NT_VARIABLEDEFINE` (in `regen_stmt`, after the AST is built):

    This is an explicit definition

    These two nodes are generated as `NT_VARIABLEDEFINESET` and `NT_VARIABLEDEFINE` in /src/grammar/for90.y. Register the corresponding `VariableInfo` to the symbol table (add a new `VariableInfo` only when the variable has never been used; in most cases, modify the existing `VariableInfo`).
  Step 1 works simultaneously with the `regen_` functions below suite level. After Step 1, all variables, whether explicit or implicit, are registered. However, there is one exception: an implicit variable (e.g. `B`) used to initialize a variable (e.g. `A`) will not be registered until `regen_vardef` is called on `A`, in Step 2

  `integer A = B + 1 ! B is not registered`
- Step 2:

  This procedure is defined in the function `regen_all_variables`, which includes a `while`-loop in which `regen_vardef` is called on every variable. This `while`-loop can't be replaced by a 1-pass `for`-loop, because `regen_vardef` may introduce new variables, according to Step 1. After the `while`-loop, all `VariableInfo` of this suite are generated (and their `.generate` fields will all be set to `true`).
- Step 3:

  This procedure is defined in the function `regen_all_variables_decl_str`, according to the function's 2-phase generating strategy. `regen_all_variables_decl_str` is called in `regen_function_2`; it generates code for the variables generated in Step 2. The generated code depends on whether the variable belongs to a common block.
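
The reason Step 2 needs a `while`-loop rather than a single pass can be shown with a toy symbol table. This is a sketch under simplifying assumptions: `VarSketch`, `SymbolTable` and both functions are hypothetical, and the real `regen_vardef` does far more than flip a flag.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced VariableInfo: `uses` lists the names that appear in
// the initializer and may therefore need to be registered implicitly.
struct VarSketch {
    std::vector<std::string> uses;
    bool generated = false;
};

typedef std::map<std::string, VarSketch> SymbolTable;

// Generate one variable; any name in its initializer that is still unknown
// is registered on the spot (the Step 1 exception).
void regen_vardef_sketch(SymbolTable &table, const std::string &name) {
    const std::vector<std::string> uses = table[name].uses;  // snapshot of the names
    for (const std::string &u : uses)
        table.insert(std::make_pair(u, VarSketch()));  // no-op if already present
    table[name].generated = true;
}

// Keep sweeping until a full sweep finds nothing left to generate.
// Returns the number of sweeps performed.
int regen_all_variables_sketch(SymbolTable &table) {
    int sweeps = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        ++sweeps;
        std::vector<std::string> pending;
        for (const auto &kv : table)
            if (!kv.second.generated) pending.push_back(kv.first);
        for (const std::string &name : pending) {
            regen_vardef_sketch(table, name);
            changed = true;
        }
    }
    return sweeps;
}
```

Starting from a table that holds only `A` (initialized from `B`), the first sweep generates `A` and registers `B`, and a later sweep is needed to generate `B`; a single `for`-loop over the initial table would miss it.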
A common block is in the global name space (`finfo = get_function("", "")`, `vinfo = get_commonblock(commonblock_name)->variables[commonblock_index] = get_variable("", "BLOCK_" + commonblock_name, local_varname)`)
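
The lookup key for a common block member can be sketched as a (module, function, variable) triple. `VarKey` and `commonblock_key` are hypothetical; only the `"BLOCK_" + commonblock_name` convention and the empty module name come from the text above.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical symbol-table key: (module, function, variable).
typedef std::tuple<std::string, std::string, std::string> VarKey;

// A common-block member lives in the global name space: empty module name
// and a pseudo function called "BLOCK_" + commonblock_name.
VarKey commonblock_key(const std::string &commonblock_name,
                       const std::string &local_varname) {
    return VarKey("", "BLOCK_" + commonblock_name, local_varname);
}
```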
- All items in an interface are firstly variables, so each item will be
  - registered by `add_variable` under `finfo->local_name`
  - its `local_name` will be exactly the item's name
- All items in an interface can also be considered functions (with no function body), so each item will be
  - registered by `add_function` under `get_context().current_module`
  - its `local_name` will be `finfo->local_name + "@" + the item's name`
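
The double registration of interface items can be sketched as two inserts with different naming schemes. `RegistrySketch` and `register_interface_item` are hypothetical; only the `"@"` naming convention is taken from the list above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical registry holding both views of an interface item.
struct RegistrySketch {
    std::set<std::pair<std::string, std::string>> variables;  // (owner function, name)
    std::set<std::string> functions;                          // names in the current module
};

// Register one interface item as a variable of its owner and as a
// body-less function named owner + "@" + item.
void register_interface_item(RegistrySketch &r, const std::string &owner,
                             const std::string &item) {
    r.variables.insert(std::make_pair(owner, item));
    r.functions.insert(owner + "@" + item);
}
```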
Functions in Fortran are strongly connected:

- Functions share common blocks, and each common block is related to the bodies of the functions it belongs to, so the variable declaration part of a function must be generated after all information about the common blocks has been gathered, which requires more than a 1-pass scan of all functions.
- Calling a function with keyword arguments requires the body of the callee function to have been parsed
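
A two-phase scan over all functions addresses the first point. The sketch below is an assumption about the general shape only: `FunctionSketch`, the member counts and the declaration format are all made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical function record.
struct FunctionSketch {
    std::string name;
    std::map<std::string, int> block_members;  // common block -> members seen here
    std::string decl;                          // produced in phase 2
};

void generate_two_phase(std::vector<FunctionSketch> &funcs) {
    // Phase 1: scan every function and collect the widest view of each block.
    std::map<std::string, int> global_size;
    for (const FunctionSketch &f : funcs)
        for (const auto &kv : f.block_members)
            global_size[kv.first] = std::max(global_size[kv.first], kv.second);

    // Phase 2: only now can each function's declaration part be written.
    for (FunctionSketch &f : funcs) {
        f.decl.clear();
        for (const auto &kv : f.block_members)
            f.decl += kv.first + ":" + std::to_string(global_size[kv.first]) + ";";
    }
}
```

If declarations were emitted during the first scan, a function visited early would be generated before a later function revealed more about the shared block.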
All parse tree nodes are defined in /src/Intent.h with an NT_ prefix
- fs:
  - fs.CurrentTerm.what: immediately generated code, generated from the child's `fs.CurrentTerm.what` or from other information
  - fs.CurrentTerm.token: refer to /src/Intent.h
- child
- attr: attrs including
  - FunctionAttr
  - VariableDescAttr
- father: pointer to parent node
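
Put together, the fields listed above suggest a node shaped roughly like the sketch below. All type names carry a `Sketch` suffix to mark them as hypothetical, and `addchild` is an illustrative helper; the real definitions are considerably richer.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct TermSketch {
    int token = 0;     // one of the NT_ / token codes
    std::string what;  // code generated so far for this node
};

struct FlexStateSketch {
    TermSketch CurrentTerm;
};

// Base for attributes such as FunctionAttr or VariableDescAttr.
struct ParseAttrSketch {
    virtual ~ParseAttrSketch() {}
};

struct ParseNodeSketch {
    FlexStateSketch fs;
    std::vector<ParseNodeSketch *> child;  // not owned in this sketch
    ParseAttrSketch *attr = nullptr;
    ParseNodeSketch *father = nullptr;     // pointer to parent node

    // Attach a child and keep its back-pointer consistent.
    void addchild(ParseNodeSketch *c) {
        c->father = this;
        child.push_back(c);
    }
};
```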
Child ParseNodes may also be referred to when generating an upper-level ParseNode, so do not change the child index of:

- `NT_VARIABLE_ENTITY`: referred to in `function_decl`
- `NT_FUNCTIONDECLARE`: can represent an interface, referred to in `paramtable` and `function_decl`

All variables (including common blocks) and functions are now logged in /src/Variable.h and /src/Function.h by `VariableInfo` and `FunctionInfo`
| Item | Rule |
|---|---|
| kind | typecast_spec |
| len | typecast_spec |
| dimension | variable_desc_elem |
| intent | variable_desc_elem |
| optional | variable_desc_elem |
| parameter | variable_desc_elem |
-> means a ParseNode has this ParseAttr
| ParseAttr | Usage |
|---|---|
| `VariableDescAttr` | `NT_DECLAREDVARIABLE` or `NT_VARIABLEDEFINE` or `NT_VARIABLEINTIAL` nodes of `NT_VARIABLEDEFINE`. `NT_PARAMTABLE_PURE` |
| `FunctionAttr` | `NT_FUNCTIONDECLARE` |
| `VarialbeAttr` | `NT_FUNCTIONDECLARE` |