Aristotle System Manual


1. Introduction

By using the Aristotle menus, you can run Aristotle analysis tools and view the data output by those tools. You may wish, however, to write your own analysis tools, or run Aristotle tools from the command line or scripts. This document describes the aspects of Aristotle that are relevant to such efforts.

Important. Before you can run any part of the Aristotle system, you must follow the directions in section 2 of the User Manual to set up your user account.

Section 2 of this manual describes the Aristotle database, and tells you how to write tools that interface with the database. Section 3 of this manual describes Aristotle tools that you can invoke, and describes the methods for invoking them.

2. Database and database handler interface

When Aristotle analyzes programs, it stores the results of its analyses in a "database". The database, currently implemented as a directory of files, is accessed by "database handler routines". To write analysis programs that use information contained in the database, you use these handler routines. Analysis information is stored in the database in formatted files. Because all information is accessed by handler routines, you do not need to know about particular database files or file formats. In fact, we recommend that you not write code that accesses database files directly: instead, use the handlers. File formats may change in subsequent Aristotle releases, so if you do write programs that access files directly, new releases of Aristotle might break your programs. Handler interfaces will not change, and thus applications that use handlers will continue to function in subsequent releases.

Section 2.1 explains how to use the handler interface routines for Aristotle. An example program named cf_printer.c is included in the src subdirectory of the release. A great deal of information can also be found by reading the header files.


2.1 Using Existing Aristotle Handler Routines

Aristotle takes a C source file that contains one or more functions, and collects data for each of the functions. Handler routines typically retrieve data for the source file on a function by function basis. For example, suppose you have a C source file called "temp.c" that contains functions F1, F2, and F3. You run it through the parser/analyzer, which deposits analysis data in the database. Now, you can retrieve parser/analyzer data for each of the functions using handlers.

Suppose you want to write an application program that uses some of the parser/analyzer data stored in the database. In general, do the following: call the "begin" routine of each handler you need to begin accessing the database, then call that handler's "read" routine to retrieve information on successive procedures.

At the point where your application has finished accessing database handlers, insert calls to the "end" procedure for each handler you have used. You can also call the "free" routine(s) for those handlers, to release memory allocated dynamically for information structures returned to you by read calls.

Handler routines that write information to the database are also available. In this case, you use a "create" routine in place of the "begin" routine to begin accessing the database. You then use a series of "write" routines to write information on successive procedures to the database. You use the usual "end" routine after you are through writing.

The foregoing description of the use of "read" and "write" handlers applies directly when the information of interest is stored on a per-procedure basis. In some cases information is stored on a per-file rather than a per-procedure basis. In these cases, a single "read" or "write" call suffices to retrieve/deposit all information about the source file from/into the database.

The names and prototypes for the various "begin", "read", "end", "create", "free", and "write" routines are documented in separate sections below, for each handler that is available. For descriptions of the data structures these handlers return, see the header files associated with the handlers (these are named in the sections on the handlers). Header and object files are available in the Aristotle distribution area. The distribution also contains sample programs that illustrate the use of the handler routines.

Handlers return data structures. In some cases, we also provide functions that facilitate access to data in those structures.
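The begin/read/free/end sequence described above might look like the following sketch, shown here for the control flow handler. The prototypes follow section 2.1.2, but the bodies of dbh_cf_begin, dbh_cf_read, dbh_cf_free, and dbh_cf_end below are simplified stand-ins (assumptions, not the real library) so that the sketch compiles without the Aristotle distribution. The helper sketch_read_procedures is a hypothetical name, and whether the program name is passed with or without its ".c" extension is not specified here.

```c
#include <stdlib.h>

/* ---- Stand-ins so this sketch compiles on its own (NOT the real library).
 * In a real tool, remove this part, include dbh_cf.h, and link with the
 * Aristotle object files. ---- */
typedef struct { int mode; } DB_FILE;                 /* simplified */
typedef struct { int number_of_nodes; } DBH_CF_INFO;  /* simplified */

static DB_FILE *dbh_cf_begin(char *program_name)
{
    if (program_name == NULL || program_name[0] == '\0')
        return NULL;                     /* pretend: no such database file */
    DB_FILE *f = malloc(sizeof *f);
    if (f != NULL)
        f->mode = 1;                     /* 1 = read in the DB_FILE notes */
    return f;
}

static int dbh_cf_read(DB_FILE *prog_file, char *procedure_name,
                       DBH_CF_INFO **cf_info_struct)
{
    if (prog_file == NULL || procedure_name == NULL ||
        procedure_name[0] == '\0')
        return 0;                        /* pretend: procedure not found */
    *cf_info_struct = calloc(1, sizeof **cf_info_struct);
    return *cf_info_struct != NULL;
}

static void dbh_cf_free(DBH_CF_INFO *cf_info_struct)
{
    free(cf_info_struct);
}

static int dbh_cf_end(DB_FILE *prog_file)
{
    if (prog_file == NULL)
        return 0;
    free(prog_file);
    return 1;
}
/* ---- End of stand-ins. ---- */

/* begin -> read (once per procedure) -> free -> end.
 * Returns the number of procedures read, or -1 if begin or end failed. */
static int sketch_read_procedures(char *program_name,
                                  char **proc_names, int n)
{
    DB_FILE *db = dbh_cf_begin(program_name);
    if (db == NULL)
        return -1;

    int read_ok = 0;
    for (int i = 0; i < n; i++) {
        DBH_CF_INFO *info = NULL;
        if (dbh_cf_read(db, proc_names[i], &info) == 1) {
            read_ok++;                   /* use the data in info here */
            dbh_cf_free(info);           /* release what read allocated */
        }
    }

    if (dbh_cf_end(db) != 1)
        return -1;
    return read_ok;
}
```

For the temp.c example above, you would call sketch_read_procedures with the names F1, F2, and F3. A tool that writes data would follow the same shape, with the "create" routine in place of "begin" and "write" in place of "read".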

Subsequent sections describe each existing handler interface in detail.



2.1.1 Common Interface Layer Routines

Relevant files

    dbh-common.h    header file

Handler routine prototypes

    DB_FILE *dbh_com_open_file( char *program_name, char *extension )

         Given PROGRAM_NAME, postfix EXTENSION onto that name. Then call
         GLOBAL_ADD_DB_PREFIX to place the appropriate database path in
         front of the new file name (the one with the extension). Attempt
         to open the file for reading.

         Returns a pointer to the database control block (DB_FILE) for the
         file on success. On failure, it returns NULL and sets the global
         database error code variable db_error_code to one of the values in
         dbh-common.h

    DB_FILE *dbh_com_create_file_block()

         Allocate space for a database file control block, and initialize
         all of the fields to dummy values.

         Returns pointer to new block on success, returns NULL on failure.

    int      dbh_com_get_next_line( DB_FILE *f_ptr, char *buffer, int max )

         Read the next non-blank, non-comment line from F_PTR into BUFFER.
         Do not read more than MAX characters.

         Returns 1 on success, 0 on failure and sets db_error_code
         to the appropriate error code.

    int      dbh_com_close_file( DB_FILE *f_ptr )

         Close a previously opened or created database file after flushing
         appropriate buffers.

         Returns 1 on success, 0 on failure and sets db_error_code to
         one of the values in dbh-common.h
 
    DB_FILE *dbh_com_create_file( char *program_name, char *extension )

         Create and open a database file for writing. Construct the filename
         by appending EXTENSION to PROGRAM_NAME, then call global_add_db_prefix
         to add the appropriate path information.

         Returns a pointer to the control block for the newly created
         database file. On failure, it returns NULL and sets db_error_code to
         the appropriate value from dbh-common.h

    N_LIST  *dbh_com_add_to_nlist( N_LIST *old_nlist, int node_id, char *label)

         Given a pointer to an old neighbor list (OLD_NLIST), add
         a new node representing NODE_ID/LABEL (that is, add an edge
         to the list).

         Returns a pointer to the new neighbor list.

    N_LIST  *dbh_com_delete_from_nlist( N_LIST *old_nlist, int node_id)

         Given a pointer to an old neighbor list (OLD_NLIST), delete the
         node representing NODE_ID (that is, delete an edge from the list).

         Returns a pointer to the new neighbor list.

    DBH_PROC_NAMES *dbh_com_proc_list( DB_FILE *f_ptr )

         Given a pointer to ANY type of database file, read all of the 1
         records and return a list of sorted procedure names in a
         DBH_PROC_NAMES structure.

         Returns a pointer to the structure containing the names

    DBH_PROC_NAMES *dbh_com_proc_unsorted_list( DB_FILE *f_ptr )

         Given a pointer to ANY type of database file, read all of the 1
         records and return an unsorted list of procedure names in a
         DBH_PROC_NAMES structure.

         Returns a pointer to the structure containing the names

    DBH_PROC_NAMES *dbh_com_proc_frame_list( DB_FILE *f_ptr )

         Given a pointer to ANY type of database file, read all of the 1
         records and return a list of procedure names in a DBH_PROC_NAMES
         structure.

         Returns a pointer to the structure containing the names

    UNREACHABLE  *dbh_com_add_to_unreachable_list(UNREACHABLE *old_list,int node_id)

         This procedure adds a node to the unreachable node list 

         Return value is a pointer to the old list.

    int dbh_com_search_proc_list(char **names, int count, char *name)

         Given a called procedure name it searches the proc_list and returns the proc_id.

         Return value is an integer value, the id of the procedure.

    void dbh_com_free_proc_list(DBH_PROC_NAMES* proc_list)

         Free a pointer to the DBH_PROC_NAMES structure.

    void dbh_com_free_nlist( N_LIST *nlist )

         Frees the memory used by an N_LIST.

    int dbh_com_search_nlist( N_LIST *nlist, int node_id, char *label )

         Searches for a given node and label in an N_LIST

         Return Value is 1 if found, 0 otherwise.

    void dbh_com_sort_proc_names(char **names,int number_of_names)

         It sorts the array which contains the procedure names.

Notes on data structures

    The structure used to represent files is DB_FILE. DB_FILE contains the 
    following data fields:

        program_name      string containing the file name for the program
        db_file_ptr       a C FILE pointer
        mode              integer with meaning, 0=unopened 1=read 2=write

     The structure used to return the names of all of the procedures present in a 
     file is a DBH_PROC_NAMES struct. It contains the following fields:

        number_of_names   integer count of the names
        name_array        array of strings, indexed beginning at 1

     The structure used for neighbor lists is an N_LIST. It contains the 
     following fields:

        node_id           integer node id
        edge_label        string edge label
        next              pointer to next neighbor

     Unreachable nodes are recorded in an UNREACHABLE struct. 

         node_id          integer id of this node
         next             pointer to next unreachable node
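Because an N_LIST is a singly linked list, you can walk it by following the next pointers. The sketch below declares a local struct with the three fields listed above; the exact declaration in dbh-common.h may differ, and the field types are assumptions. The functions sketch_nlist_length and sketch_nlist_find are hypothetical helpers written for illustration; the second one mirrors what dbh_com_search_nlist is described as doing.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for N_LIST, using the documented field names.
 * Field types are assumed; see dbh-common.h for the real declaration. */
typedef struct n_list {
    int            node_id;     /* integer node id */
    char          *edge_label;  /* string edge label */
    struct n_list *next;        /* pointer to next neighbor */
} N_LIST;

/* Count the neighbors in a list; an empty list is a NULL pointer. */
static int sketch_nlist_length(const N_LIST *nlist)
{
    int count = 0;
    for (const N_LIST *p = nlist; p != NULL; p = p->next)
        count++;
    return count;
}

/* Return 1 if the list holds an entry with this node id and label,
 * 0 otherwise. */
static int sketch_nlist_find(const N_LIST *nlist, int node_id,
                             const char *label)
{
    for (const N_LIST *p = nlist; p != NULL; p = p->next) {
        if (p->node_id == node_id && p->edge_label != NULL &&
            strcmp(p->edge_label, label) == 0)
            return 1;
    }
    return 0;
}
```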

Example of a program that uses the handler

    ./src/cf_printer.c


2.1.2 Control Flow Information Handler Routines

Relevant files

    dbh_cf.h          header file

Handler routine prototypes

    DB_FILE *dbh_cf_begin (char *program_name)

        Begin accessing cf information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_cf_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_CF_INFO **cf_info_struct)

        Read control flow information from the database file pointed
        to by "prog_file," concerning the procedure named in 
        "procedure_name."
 
        Returns info in the struct pointed to by "*cf_info_struct."  The
        function returns a 1 on success, and a 0 on failure.

    int dbh_cf_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_cf_create (char *program_name)

        Create a cf information file in the database for program
        "program_name." Returns NULL on failure, else returns pointer
        to DB_FILE structure that contains file information.

    int dbh_cf_write (DB_FILE *prog_file, char *procedure_name,
        DBH_CF_INFO *cf_info_struct)

        Write control flow information contained in the structure
        pointed to by "cf_info_struct" on the procedure named in
        "procedure_name" into the database file pointed to by "prog_file."
        Returns 1 on success, 0 on failure.

    int  dbh_cf_get_edge_list( DBH_CF_INFO *cf_info_struct, 
                               DBH_CF_EDGE_INFO **edge_list );

         Given a cf_info_struct, return a list edge_list of the edges in the 
         associated cfg, including their edge identifiers and edge labels.
         Returns 1 on success, 0 on failure.

    int dbh_cf_get_edge_number( DBH_CF_INFO *cf_info_struct, 
                                int sourcenode, int sinknode, char *label);

         Given a cf_info_struct, and a sourcenode, sinknode, and edge label,
         return the edge identifier for the edge labeled with that label
         that goes from sourcenode to sinknode.  Returns -1 on error.
        
    void dbh_cf_free ( DBH_CF_INFO *cf_info_struct)

        Frees storage allocated for "cf_info_struct."

Notes on data structures

    Control flow information is contained in DBH_CF_INFO structures.
    DBH_CF_INFO structures contain the following fields:

        highest_node_number  highest node number in cfg for the procedure

        number_of_nodes      number of nodes in cfg for the procedure

        number_of_edges      number of edges in cfg for the procedure

        root_node_id         node id of the root node in the cfg

        is_reachable         are all cfg nodes statically reachable? (0 means yes.)

        cf_arr               dynamically allocated array of CF_DATA structs
                             (indexed by 1 through highest_node_number)

    Control flow information on individual nodes is contained in CF_DATA.
    CF_DATA structures contain the following fields:

        node_id              the integer id of the node (node_id is 
                             set to -1 for unused node numbers)

        cf_preds             an N_LIST of all predecessor nodes

        cf_succs             an N_LIST of all successor nodes

        e_list               an E_LIST of all edges

    Edge information in DBH_CF_INFO is provided in an E_LIST that has the 
    following fields:

        node_id              integer id of a node

        edge_id              integer id of an edge whose source is node_id

        edge_label           label on this edge

    The dbh_cf_get_edge_list procedure returns edge info as follows:

        pred_node            integer id of node that is source of edge

        succ_node            integer id of node that is sink of edge

        edge_label           label on the edge


2.1.3 Definition/Use Information Handler Routines

Relevant files

    dbh_du.h          header file

Handler routine prototypes

    DB_FILE *dbh_du_begin (char *program_name)

        Begin accessing du information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_du_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_DU_INFO **du_info_struct)

        Read def/use information from the database file pointed to by 
        "prog_file," concerning the procedure named in "procedure_name."
        Return the info in the struct pointed to by "*du_info_struct."
        The function returns a 1 on success, and a 0 on failure.

    int dbh_du_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_du_create (char *program_name)

        Create a du information file for program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_du_write (DB_FILE *prog_file, char *procedure_name,
                      DBH_DU_INFO *du_info_struct)

        Write def/use information contained in the structure
        pointed to by "du_info_struct" on the procedure named in
        "procedure_name" into the database file pointed to in
        "prog_file."  Returns 1 on success, 0 on failure.

    void dbh_du_free ( DBH_DU_INFO *du_info_struct)

        Frees storage allocated for "du_info_struct."

    int dbh_du_has_use(unsigned int use_type, unsigned int bitstring);

        A function that is used to determine whether a particular variable
        at a particular node has a use of a particular type (see below).

    int dbh_du_has_def(unsigned int def_type, unsigned int bitstring);

        A function that is used to determine whether a particular variable
        at a particular node has a def of a particular type (see below).

Notes on data structures

    Def/use information is contained in DBH_DU_INFO structures.
    DBH_DU_INFO structures contain the following fields:

        highest_node_number    the highest node number in the cfg for the
                               procedure

        number_of_nodes        the number of nodes in the cfg for the
                               procedure

        du_dat                 a dynamically allocated array of DBH_DU_DATA
                               structures (indexed by 1 through 
                               highest_node_number)

    Def/use information for specific nodes is contained in DBH_DU_DATA.
    DBH_DU_DATA structures contain the following fields:

        node_id                integer id of the node (node_id is set 
                               to -1 for unused node numbers)

        num_defs               number of definitions in the node

        def_list_id            dynamically allocated array of the id's 
                               of the definitions (Starting at 1 not 0!)

        def_list_occ_type      dynamically allocated array of integers,
                               which, interpreted as bitstrings, give
                               information on the types of defs
                               a given variable has at a given node

        num_uses               number of uses in the node

        use_list_id            dynamically allocated array of the
                               id's of the uses (Starting at 1 not 0!)

        use_list_occ_type      dynamically allocated array of integers,
                               which, interpreted as bitstrings, give
                               information on the types of uses a given
                               variable has at a given node

     Occurrence types are explained in the Aristotle User's
     Manual, under the description of the du table viewer.  The occurrence
     type field tracks, for each identifier occurring in the statement
     associated with a node, which of the variable occurrence types that
     identifier participates in within that statement.

     Given an occurrence type field for a variable x at node n, to find out
     whether x has a particular type of defs and/or uses at n, use the
     dbh_du_has_use and dbh_du_has_def functions (available when you link
     your program with dbh_du.o), passing in the "use_list_occ_type"
     or "def_list_occ_type" integer for that variable at that node.
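To make the idea of an integer "interpreted as a bitstring" concrete, the sketch below tests occurrence type fields against a flag. It assumes that each occurrence type corresponds to one bit; the two SKETCH_ flag values are invented for illustration and are not the real encoding, which is described with the du table viewer in the User's Manual. It also assumes the occurrence type array is indexed from 1 like the id arrays. In a real tool you would call dbh_du_has_use or dbh_du_has_def rather than the hypothetical sketch_has_type.

```c
/* Hypothetical flag values, invented for this sketch only. */
#define SKETCH_USE_COMPUTATION 0x1u
#define SKETCH_USE_PREDICATE   0x2u

/* Return 1 if the bit(s) for `type` are set in `bitstring`, else 0.
 * Same argument order as dbh_du_has_use(use_type, bitstring). */
static int sketch_has_type(unsigned int type, unsigned int bitstring)
{
    return (bitstring & type) != 0;
}

/* Count the uses at one node that have the given occurrence type.
 * The array is read from index 1 through num_uses; slot 0 is unused. */
static int sketch_count_uses_of_type(int num_uses,
                                     const unsigned int *use_list_occ_type,
                                     unsigned int type)
{
    int count = 0;
    for (int i = 1; i <= num_uses; i++) {
        if (sketch_has_type(type, use_list_occ_type[i]))
            count++;
    }
    return count;
}
```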


2.1.4 Symbol Table and Site Information Handler Routines

Relevant files

     dbh_sym.h         header file

Handler routine prototypes

     DB_FILE *dbh_sym_begin (char *program_name, 
                             DBH_SYM_TABLE **glob_info_struct)

          Begin accessing symbol table and call/entry/ai site information
          on source file "program_name."  Load all info on global 
          variables into "glob_info_struct." Returns 1 on success, 0  
          on failure.  NOTE: glob_info_struct is an array indexed by 
          the ABSOLUTE VALUE OF THE VARIABLE ID for all global variables,
          because global variables have negative variable ids.

     DB_FILE *dbh_sym_create (char *program_name)

          Create a sym file for program program_name. Returns NULL on
          failure, else returns pointer to DB_FILE structure that contains 
          file information.

     int dbh_sym_read (DB_FILE *prog_file, char *procedure_name,
                       DBH_SYM_TABLE ** sym_info_struct)

          Read symbol table and site information from the database,
          concerning the procedure named in procedure_name. Sym_info_struct
          is filled with symbol table information on variables OTHER THAN
          GLOBALS.  There are also lists of call, pva and exit sites.
          The function returns 1 on success, 0 on failure.

     int dbh_sym_write (DB_FILE *prog_file, char *procedure_name,
                        DBH_SYM_TABLE *sym_info_struct, LIST_HEADER
                        *call_sites, LIST_HEADER *ai_site, LIST_HEADER 
                        *exit_site)

          Write out info on a procedure's symbols etc.

     int dbh_sym_flush_glob (DB_FILE *prog_file,
                             DBH_SYM_TABLE *global_info_struct)

          Write out info on globals used in a source file.

     int dbh_sym_end (DB_FILE *prog_file)

          Ends access to the current source.  Returns 1 on success, 
          0 on failure.

     void dbh_sym_free_symb(SYM_TABLE *sym_info_struct)

          Frees storage allocated for "sym_info_struct."

     void dbh_sym_free_glob(SYM_TABLE *glob_info_struct)

          Frees storage allocated for "glob_info_struct."

Notes on data structures

     The symbol table is contained in DBH_SYM_TABLE.
     DBH_SYM_TABLE records contain the following fields:

          number_of_symbols    the number of symbols

          *sym_arr             dynamically allocated array of DBH_SYM_DAT

          number_of_params     the number of parameters

          *param_ids           dynamically allocated array of param id's

          highest_node_id      the highest node id in cdg for the procedure

          number_of_nodes      the number of nodes in cdg for the procedure

     Information on individual symbols is contained in DBH_SYM_DAT records.
     DBH_SYM_DAT records contain the following fields:

          var_id               unique id for each variable

          var_type             whether variable is a formal parameter, or 
                               local or global variable

                                   2 = formal parameter
                                   3 = local variable
                                   4 = global variable

          var_name             variable's name

          is_var_pointer       is the variable a pointer?

                                   1 = pointer
                                   0 = not a pointer

          is_var_static        is the variable declared as static?

                                   1 = static
                                   0 = not static

          is_var_extern        is the variable extern?

                                   1 = extern
                                   0 = not extern

    Information on each callsite is contained in DBH_CALL_NODE records.
    DBH_CALL_NODE records contain the following fields:

          node_id               node id at which call was made

          proc_name             name of called procedure

          rtn_node_id           node id of associated return node

          number_of_parameters  number of actual parameters

          *parm_array           dynamically allocated array of parameters

    Information on alias introduction nodes is contained in DBH_AI_NODE.
    Alias introduction nodes are nodes where aliases are created via
    statements of the form p = q where p and q are pointers, or p = &q
    where p is a pointer and q a scalar.  DBH_AI_NODE records contain
    the following fields:

          node_id               integer node id

          var_def_id            var id of variable bound (on left
                                of expression)

          var_use_id            var id of variable bound to var_def_id

    Information on exit nodes is contained in DBH_EXIT_NODE records.
    DBH_EXIT_NODE records contain the following fields:

          node_id               integer node id


2.1.5 Mapping Information Handler Routines

Relevant files

    dbh_map.h         header file

Handler routine prototypes

    DB_FILE *dbh_map_begin (char *program_name)

        Begin accessing map information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_map_read (DB_FILE *prog_file, char *procedure_name,
                      DBH_MAP_INFO **map_info_struct)

        Read mapping information (info, for each cdg/cfg node, on its
        label, corresponding C source statement, corresponding RTL code 
        statements, and type) on the next procedure from the database.
        proc_name contains the procedure name if one is found.
        map_info_struct is filled with mapping information.  The
        function returns 1 on success, 0 on failure.

        Note that cfg nodes are a subset of the cdg nodes for a
        procedure, comprised primarily of statement, conditional,
        and predicate nodes.  Thus a node n in the cdg, representing
        statement s in an analyzed procedure p has the same node_id
        in the cdg for p as in the cfg for p.

    int dbh_map_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_map_create (char *program_name)

        Creates a map information file for program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure containing file information.

    int dbh_map_write (DB_FILE *prog_file, char *procedure_name,
                       DBH_MAP_INFO *map_info_struct)

        Write map information contained in the structure pointed to
        by "map_info_struct" on the procedure named in "procedure_name"
        into the database file pointed to in "prog_file."  Returns
        1 on success, 0 on failure.

    void dbh_map_free ( DBH_MAP_INFO *map_info_struct)

        Frees storage allocated for "map_info_struct."

    int  dbh_map_pred_node( DBH_MAP_INFO *map_info_struct, int nodeid )

        Given map_info_struct and nodeid, returns 1 if node nodeid is
        a predicate node, else returns 0.

    int  dbh_map_stmt_node( DBH_MAP_INFO *map_info_struct, int nodeid )

        Given map_info_struct and nodeid, returns 1 if node nodeid is
        a statement node, else returns 0.

Notes on data structures

    Map information is contained in DBH_MAP_INFO structures.
    DBH_MAP_INFO structures contain the following fields:

        highest_node_number    number of highest node in cdg for the
                               procedure

        number_of_nodes        number of nodes in cdg for the procedure

        map_dat                dynamically allocated array of DBH_MAP_DATA
                               structures (indexed by 1 through 
                               highest_node_number)

    Map information on individual nodes is contained in DBH_MAP_DATA.
    DBH_MAP_DATA structures contain the following fields:

        node_id                integer id of the node (node_id is set 
                               to -1 for unused node numbers)

        node_label             string label of the node.  Labels are:

                                   R Region           P Predicate
                                   F Function call    E Entry,
                                   X Exit             C Condition
                                   S Statement        B Break
                                   G Goto             L Label
                                   W Switch

        node_type              integer node type - a generic typing
                               (symbolic constants used below are defined
                               in gccfe_types.h)

                                     REGION_NODE         41
                                     PREDICATE_NODE      42
                                     STATEMENT_NODE      43
                                     CALL_NODE           44
                                     ENTRY_NODE          45
                                     EXIT_NODE           46
                                     CONDITION_NODE      47
                                     JOIN_NODE           48
                                     DECL_NODE           49
                                     RETURN_NODE         50

        sub_type               integer sub-type - gets more specific about
                               nodes (symbolic constants used below are 
                               defined in gccfe_types.h)

                                     RETURN_STATEMENT        61
                                     EXIT_STATEMENT          62
                                     BREAK_STATEMENT         63
                                     CONTINUE_STATEMENT      64
                                     GOTO_STATEMENT          65
                                     OTHER_STATEMENT         66
                                     FOR_STATEMENT           67
                                     IF_PREDICATE            71
                                     WHILE_PREDICATE         72
                                     DO_WHILE_PREDICATE      73
                                     FOR_PREDICATE           74
                                     SWITCH_PREDICATE        75
                                     IF_CONDITION            76
                                     WHILE_CONDITION         77
                                     DO_WHILE_CONDITION      78
                                     FOR_CONDITION           79
                                     ENTRY_REGION            81
                                     IF_CLAUSE_REGION        82
                                     ELSE_CLAUSE_REGION      83
                                     LOOP_TRUE_REGION        84
                                     LOOP_FALSE_REGION       85
                                     WHILE_HEADER_REGION     86
                                     DO_WHILE_HEADER_REGION  87
                                     FOR_HEADER_REGION       88
                                     CASE_REGION             89
                                     SUMMARY_REGION          90
                                     DONT_CARE              -100     

                              (DONT_CARE is used if the node type has
                               no defined subtypes.)


        start_stmt_number      source code line number statement starts on

        end_stmt_number        source code line number statement ends on

        first_rtl_code_stmt    For internal Aristotle system use only.

        last_rtl_code_stmt     For internal Aristotle system use only.
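When printing or debugging map data, it can help to turn the integer node_type back into a readable name. The sketch below does this with a switch over the values from the node_type table above. The constants are redefined locally so the example compiles alone; in a real tool you would take them from gccfe_types.h instead. The function sketch_node_type_name is a hypothetical helper.

```c
/* Values copied from the node_type table in this section. In a real tool,
 * include gccfe_types.h instead of redefining them. */
#define REGION_NODE     41
#define PREDICATE_NODE  42
#define STATEMENT_NODE  43
#define CALL_NODE       44
#define ENTRY_NODE      45
#define EXIT_NODE       46
#define CONDITION_NODE  47
#define JOIN_NODE       48
#define DECL_NODE       49
#define RETURN_NODE     50

/* Map a node_type value to a printable name; unknown values get "unknown". */
static const char *sketch_node_type_name(int node_type)
{
    switch (node_type) {
    case REGION_NODE:    return "region";
    case PREDICATE_NODE: return "predicate";
    case STATEMENT_NODE: return "statement";
    case CALL_NODE:      return "call";
    case ENTRY_NODE:     return "entry";
    case EXIT_NODE:      return "exit";
    case CONDITION_NODE: return "condition";
    case JOIN_NODE:      return "join";
    case DECL_NODE:      return "decl";
    case RETURN_NODE:    return "return";
    default:             return "unknown";
    }
}
```

For the common predicate and statement cases, the handler's own dbh_map_pred_node and dbh_map_stmt_node routines answer the question directly.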


2.1.6 Control Dependence Information Routines

Relevant files

        dbh_cd.h         header file

Handler routine prototypes

    DB_FILE *dbh_cd_begin (char *program_name)

        Begin accessing cd information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_cd_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_CD_INFO **cd_info_struct)

        Read control dependence information from the database file
        pointed to by "prog_file," concerning the procedure named in
        "procedure_name," returning the info in the structure pointed
        to by "*cd_info_struct."  The function returns a 1 on success,
        and a 0 on failure.

    int dbh_cd_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_cd_create (char *program_name)

        Creates a cd information file for program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_cd_write (DB_FILE *prog_file, char *procedure_name,
                      DBH_CD_INFO *cd_info_struct)

        Write control dependence information contained in the structure
        pointed to by "cd_info_struct," on the procedure named in
        "procedure_name," into the database file pointed to in
        "prog_file."  Returns 1 on success, 0 on failure.

    void dbh_cd_free ( DBH_CD_INFO *cd_info_struct )

        Frees storage allocated for cd_info_struct.

Notes on data structures

    Control dependence information is contained in DBH_CD_INFO structures.
    DBH_CD_INFO structures contain the following fields:

        highest_node_number    highest node number in cdg for the procedure

        number_of_nodes        number of nodes in cdg for the procedure

        cd_arr                 dynamically allocated array of CD_DATA
                               structures (indexed by 1 through 
                               highest_node_number)

        root_node_id           node id of the root node in the cdg


    Control dependence information on individual nodes is contained in
    CD_DATA.  CD_DATA structures contain the following fields:

        node_id                the integer id of the node (node_id is 
                               set to -1 for unused node numbers)

        cd_preds               an N_LIST of all cd-predecessor nodes

        cd_succs               an N_LIST of all cd-successor nodes
                               NOTE: cd-successors are listed in order
                               of their appearance in an "ordered" cdg.
                               (In an ordered cdg, dependence successors
                               appear in control flow order).
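
The indexing convention above (entries 1 through highest_node_number, with node_id set to -1 for unused node numbers) can be sketched as follows. This is only an illustration under stated assumptions: the SKETCH_ types and sketch_cd_count_used are hypothetical stand-ins that mirror the documented field names, not the definitions in dbh_cd.h, and the N_LIST fields are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only. The real CD_DATA and
 * DBH_CD_INFO are defined in dbh_cd.h and also carry the cd_preds and
 * cd_succs N_LISTs, which are left out here. */
typedef struct {
    int node_id;              /* -1 marks an unused node number */
} SKETCH_CD_DATA;

typedef struct {
    int highest_node_number;
    int number_of_nodes;
    SKETCH_CD_DATA *cd_arr;   /* valid indices: 1..highest_node_number */
    int root_node_id;
} SKETCH_CD_INFO;

/* Count the entries of cd_arr that describe real nodes, skipping
 * slot 0 (never used) and any slot whose node_id is -1. */
int sketch_cd_count_used(const SKETCH_CD_INFO *info)
{
    int i, used = 0;

    assert(info != NULL && info->cd_arr != NULL);
    for (i = 1; i <= info->highest_node_number; i++) {
        if (info->cd_arr[i].node_id != -1)
            used++;
    }
    return used;
}
```

The same loop shape applies to dd_arr, pdg_arr, and dom_arr, which are documented with the same indexing convention.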


2.1.7 Data Dependence Information Routines

Relevant files

        dbh_dd.h         header file

Handler routine prototypes

    DB_FILE *dbh_dd_begin (char *program_name)

        Begin accessing dd information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_dd_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_DD_INFO **dd_info_struct)

        Read data dependence information from the database file
        pointed to by "prog_file," concerning the procedure named in
        "procedure_name," returning the info in the structure pointed
        to by "*dd_info_struct."  The function returns a 1 on success,
        and a 0 on failure.

    int dbh_dd_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_dd_create (char *program_name)

        Creates a dd information file for program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_dd_write (DB_FILE *prog_file, char *procedure_name,
                      DBH_DD_INFO *dd_info_struct)

        Write data dependence information contained in the structure
        pointed to by "dd_info_struct," on the procedure named in
        "procedure_name," into the database file pointed to in
        "prog_file."  Returns 1 on success, 0 on failure.

    void dbh_dd_free ( DBH_DD_INFO *dd_info_struct )

        Frees storage allocated for dd_info_struct.

Notes on data structures

    Data dependence information is contained in DBH_DD_INFO structures.
    DBH_DD_INFO structures contain the following fields:

        highest_node_number    highest node number in ddg for the procedure

        number_of_nodes        number of nodes in ddg for the procedure

        dd_arr                 dynamically allocated array of DD_DATA
                               structures (indexed by 1 through 
                               highest_node_number)

        root_node_id           node id of the root node in the ddg


    Data dependence information on individual nodes is contained in
    DD_DATA.  DD_DATA structures contain the following fields:

        node_id                the integer id of the node (node_id is 
                               set to -1 for unused node numbers)

        fd_preds               an N_LIST of all flow dependence 
                               predecessor nodes

        fd_succs               an N_LIST of all flow dependence 
                               successor nodes

        od_preds               an N_LIST of all output dependence
                               predecessor nodes

        od_succs               an N_LIST of all output dependence
                               successor nodes


2.1.8 Program Dependence Information Routines

Relevant files

        dbh_pdg.h         header file

Handler routine prototypes

    DB_FILE *dbh_pdg_begin (char *program_name)

        Begin accessing pdg information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_pdg_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_PDG_INFO **pdg_info_struct)

        Read program dependence graph information from the database file
        pointed to by "prog_file," concerning the procedure named in
        "procedure_name," returning the info in the structure pointed
        to by "*pdg_info_struct."  The function returns a 1 on success,
        and a 0 on failure.

    int dbh_pdg_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_pdg_create (char *program_name)

        Creates a pdg information file for program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_pdg_write (DB_FILE *prog_file, char *procedure_name,
                      DBH_PDG_INFO *pdg_info_struct)

        Write program dependence graph information contained in the 
        structure pointed to by "pdg_info_struct," on the procedure named 
        in "procedure_name," into the database file pointed to in
        "prog_file."  Returns 1 on success, 0 on failure.

    void dbh_pdg_free ( DBH_PDG_INFO *pdg_info_struct )

        Frees storage allocated for pdg_info_struct.

Notes on data structures

    Program dependence graph information is contained in DBH_PDG_INFO 
    structures.  DBH_PDG_INFO structures contain the following fields:

        highest_node_number    highest node number in pdg for the procedure

        number_of_nodes        number of nodes in pdg for the procedure

        pdg_arr                dynamically allocated array of PDG_DATA
                               structures (indexed by 1 through 
                               highest_node_number)

        root_node_id           node id of the root node in the pdg


    Program dependence graph information on individual nodes is contained 
    in PDG_DATA.  PDG_DATA structures contain the following fields:

        node_id                the integer id of the node (node_id is 
                               set to -1 for unused node numbers)

        cd_preds               an N_LIST of all cd-predecessor nodes

        cd_succs               an N_LIST of all cd-successor nodes
                               NOTE: cd-successors are listed in order
                               of their appearance in an "ordered" cdg.
                               (In an ordered cdg, dependence successors
                               appear in control flow order).

        dd_preds               an N_LIST of all flow dependence
                               predecessor nodes

        dd_succs               an N_LIST of all flow dependence
                               successor nodes

        cf_preds               an N_LIST of all control flow predecessor nodes

        cf_succs               an N_LIST of all control flow successor nodes


2.1.9 Reaching Definitions Information Handler Routines

Relevant files

        dbh_rd.h          header file

Handler routine prototypes

    DB_FILE *dbh_rd_begin (char *program_name)

        Begins accessing rd information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_rd_read (DB_FILE *prog_file, char *procedure_name,
                     DBH_RD_INFO **rd_info_struct)

        Read reaching definitions information from the database file
        pointed to by "prog_file," concerning the procedure named in
        "procedure_name," returning the info in the struct pointed to
        by "*rd_info_struct."  The function returns a 1 on success,
        and a 0 on failure.

    int dbh_rd_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns a 1 on success, 0 on failure.

    DB_FILE *dbh_rd_create (char *program_name)

        Create rd information file for the program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_rd_write (DB_FILE *prog_file, char *procedure_name,
                      DBH_RD_INFO *rd_info_struct)

        Write reaching definition information contained in the
        structure pointed to by "rd_info_struct" on the procedure
        named in "procedure_name" into the database file pointed to
        in "prog_file."  Returns 1 on success, 0 on failure.

    void dbh_rd_free( DBH_RD_INFO *rd_info_struct)

        Frees storage allocated for rd_info_struct.

Notes on data structures

    DBH_RD_INFO

        highest_node_number    highest node number in cfg for procedure

        number_of_nodes        number of nodes in cfg for procedure

        rd_dat                 dynamically allocated array of
                               highest_node_number DBH_RD_DATA structures
                               (indexed by 1 through highest_node_number)

    DBH_RD_DATA

        node_id                integer node id (node_id is set to -1 for 
                               unused node numbers)

        num_defs               number of definitions reaching

        node_id_arr            dynamically allocated array of num_defs
                               integer node id's (Starting at 1 not 0)

        var_id_arr             dynamically allocated array of num_defs
                               integer variable id's (Starting at 1 not 0)
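
A lookup over one DBH_RD_DATA entry might look like the sketch below. It assumes that node_id_arr[i] and var_id_arr[i] together describe the i-th reaching definition (the variable defined and the node that defines it); the manual does not state this pairing explicitly. SKETCH_RD_DATA and sketch_rd_reaches are hypothetical names, and the field types are guesses based on the descriptions above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in mirroring the documented DBH_RD_DATA fields.
 * Field types are assumed; the real definition is in dbh_rd.h. */
typedef struct {
    int node_id;        /* -1 marks an unused node number */
    int num_defs;       /* number of definitions reaching this node */
    int *node_id_arr;   /* defining node ids, valid indices 1..num_defs */
    int *var_id_arr;    /* defined variable ids, valid indices 1..num_defs */
} SKETCH_RD_DATA;

/* Return 1 if a definition of variable var_id made at node def_node
 * reaches the node described by rd, else 0.  Assumes entry i of both
 * arrays refers to the same definition. */
int sketch_rd_reaches(const SKETCH_RD_DATA *rd, int def_node, int var_id)
{
    int i;

    assert(rd != NULL);
    if (rd->node_id == -1)
        return 0;                       /* unused slot: nothing reaches it */
    for (i = 1; i <= rd->num_defs; i++) {   /* arrays start at 1, not 0 */
        if (rd->node_id_arr[i] == def_node && rd->var_id_arr[i] == var_id)
            return 1;
    }
    return 0;
}
```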


2.1.10 Preprocessed Source Handler Routines

The preprocessed source handler differs from other handlers in that it does not return information on a per-function basis: rather, it returns the preprocessed source for an entire source file on a single "src_read" call. Also, there is no "src_write" handler.

Relevant files

    dbh_src.h          header file

Handler routine prototypes

    DB_FILE *dbh_src_begin (char *program_name)

        Begins accessing src information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_src_read (DB_FILE *prog_file, DBH_SRC_INFO **src_info_struct)

        Read preprocessed source information from the database file
        pointed to by "prog_file," returning the info in the struct 
        pointed to by "*src_info_struct."  The function returns a 1 
        on success, and a 0 on failure.

    int dbh_src_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns a 1 on success, 0 on failure.

    void dbh_src_free( DBH_SRC_INFO *src_info_struct)

        Frees storage allocated for src_info_struct.

Notes on data structures

    DBH_SRC_INFO

        number_of_lines    number of source lines listed

        src_arr            dynamically allocated array of number_of_lines
                           char* pointers (indexed by 1 through 
                           number_of_lines)


2.1.11 Dominator Information Handler Routines

Relevant files

    dbh_dom.h          header file

Handler routine prototypes

    DB_FILE *dbh_dom_begin (char *program_name)

        Begin accessing dom information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_dom_read (DB_FILE *prog_file, char *procedure_name,
                      DBH_DOM_INFO **dom_info_struct)

        Read dominator information from the database file pointed to by 
        "prog_file," concerning the procedure named in "procedure_name."
 
        Returns info in the struct pointed to by "*dom_info_struct."  The
        function returns a 1 on success, and a 0 on failure.

    int dbh_dom_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_dom_create (char *program_name)

        Create a dom information file in the database for program
        "program_name." Returns NULL on failure, else returns pointer
        to DB_FILE structure that contains file information.

    int dbh_dom_write (DB_FILE *prog_file, char *procedure_name,
        DBH_DOM_INFO *dom_info_struct)

        Write dominator information contained in the structure
        pointed to by "dom_info_struct" on the procedure named in
        "procedure_name" into the database file pointed to by "prog_file."
        Returns 1 on success, 0 on failure.

    void dbh_dom_free ( DBH_DOM_INFO *dom_info_struct)

        Frees storage allocated for "dom_info_struct."

Notes on data structures

    Dominator information is contained in DBH_DOM_INFO structures.
    DBH_DOM_INFO structures contain the following fields:

        highest_node_number  highest node number in tree for the procedure

        number_of_nodes      number of nodes in tree for the procedure

        dom_arr              dynamically allocated array of DOM_DATA structs
                             (indexed by 1 through highest_node_number)

        dtree_root_node_id   node id of the root node in the dom tree

        pdtree_root_node_id  node id of the root node in the postdom tree

    Dominator information on individual nodes is contained in DOM_DATA.
    DOM_DATA structures contain the following fields:

        node_id              the integer id of the node (node_id is 
                             set to -1 for unused node numbers)

        dominates            an N_LIST of node ids, that lists all nodes
                             that node_id dominates

        dominatedby          an N_LIST of node ids, that lists all nodes
                             that node_id is dominated by

        dtree_parent         an N_LIST of node ids (that contains 0 or
                             1 element) that lists the node (if any) that
                             is the parent of node_id in the dominator tree
 
        dtree_children       an N_LIST of node ids that lists the nodes 
                             that are children of node_id in the dominator
                             tree

        pdominates           an N_LIST of node ids, that lists all nodes
                             that node_id postdominates

        pdominatedby         an N_LIST of node ids, that lists all nodes
                             that node_id is postdominated by

        pdtree_parent        an N_LIST of node ids (that contains 0 or
                             1 element) that lists the node (if any) that
                             is the parent of node_id in the postdominator 
                             tree
 
        pdtree_children      an N_LIST of node ids that lists the nodes 
                             that are children of node_id in the 
                             postdominator tree



2.1.12 Test Trace Information Handler Routines

Relevant files

    dbh_tr.h          header file

Handler routine prototypes

    DB_FILE *dbh_tr_begin (char *tracefile)

        Begin accessing trace information on tracefile "tracefile."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_tr_read (DB_FILE *trace_file, char *procedure_name,
                      DBH_TR_INFO **tr_info_struct)

        Read trace information from the trace file pointed to by
        "trace_file," concerning the procedure named in "procedure_name."
        Returns info in the struct pointed to by "*tr_info_struct."  The
        function returns a 1 on success, and a 0 on failure.

    int dbh_tr_end (DB_FILE *trace_file)

        Ends access to the file pointed to in "trace_file."
        Returns 1 on success, 0 on failure.

    void dbh_tr_free ( DBH_TR_INFO *tr_info_struct)

        Frees storage allocated for "tr_info_struct."

    void dbh_tr_set( DBH_TR_INFO *tr_info_struct, int object)

        Set the bit of the trace vector in "tr_info_struct"
        that says node or edge "object" was executed.

    int dbh_tr_query( DBH_TR_INFO *tr_info_struct, int object)

        See if the bit of the trace vector in "tr_info_struct"
        that says node or edge "object" was executed is set.
        Return 1 if it is, else 0.

    void dbh_tr_unset( DBH_TR_INFO *tr_info_struct, int object)

        Unset the bit of the trace vector in "tr_info_struct"
        that says node or edge "object" was executed.

    void dbh_tr_clear( DBH_TR_INFO *tr_info_struct )

        Clear all bits of the trace vector in "tr_info_struct"

    int dbh_tr_make_tr_struct( DBH_TR_INFO **tr_info_struct, int highestbit)

        Make a new "tr_info_struct" sufficient to trace objects from 0
        to highestbit.  Allocate the structure and clear all bits. 

Notes on data structures

    Trace information is contained in DBH_TR_INFO structures.
    DBH_TR_INFO structures contain the following fields:

        highestbit     highest id of an object whose execution is tracked.
                       The trace structure tracks objects 0...highestbit.

        char *tvec     A bit vector [0...highestbit]; bit i says whether
                       object i (a node or branch) was executed.

    The following global variables are provided:

        int dbh_tr_numprocs   gives the number of procedures in the source
                              file whose trace is collected in this trace
                              file.  Set by the dbh_tr_begin call.  This
                              may be greater than the number of procedures that
                              actually were executed, since not all procedures
                              in the source file are executed on a given run.

        int dbh_tr_type       2 indicates branch trace, 1 indicates statement.
                              Set by the dbh_tr_begin call.
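
To make the set/query/unset/clear semantics concrete, here is a minimal sketch of a trace bit vector. It is not the Aristotle implementation: the bit layout (object i stored in bit i % CHAR_BIT of byte i / CHAR_BIT) is an assumption, all sketch_ names are hypothetical, and tvec is declared unsigned char here for well-defined bit operations, whereas the manual documents it as char *.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for DBH_TR_INFO (real definition: dbh_tr.h). */
typedef struct {
    int highestbit;          /* objects 0..highestbit are tracked */
    unsigned char *tvec;     /* assumed layout: one bit per object */
} SKETCH_TR_INFO;

/* Number of bytes needed to hold bits 0..highestbit. */
static size_t sketch_tr_bytes(int highestbit)
{
    return (size_t)highestbit / CHAR_BIT + 1;
}

/* Allocate a structure with all bits cleared.  Returns 1 on success,
 * 0 on failure, following the handler convention. */
int sketch_tr_make(SKETCH_TR_INFO **out, int highestbit)
{
    SKETCH_TR_INFO *t;

    if (out == NULL || highestbit < 0)
        return 0;
    t = malloc(sizeof *t);
    if (t == NULL)
        return 0;
    t->tvec = calloc(sketch_tr_bytes(highestbit), 1);
    if (t->tvec == NULL) {
        free(t);
        return 0;
    }
    t->highestbit = highestbit;
    *out = t;
    return 1;
}

void sketch_tr_set(SKETCH_TR_INFO *t, int object)
{
    assert(t != NULL && object >= 0 && object <= t->highestbit);
    t->tvec[object / CHAR_BIT] |= (unsigned char)(1u << (object % CHAR_BIT));
}

int sketch_tr_query(const SKETCH_TR_INFO *t, int object)
{
    assert(t != NULL && object >= 0 && object <= t->highestbit);
    return (t->tvec[object / CHAR_BIT] >> (object % CHAR_BIT)) & 1;
}

void sketch_tr_unset(SKETCH_TR_INFO *t, int object)
{
    assert(t != NULL && object >= 0 && object <= t->highestbit);
    t->tvec[object / CHAR_BIT] &= (unsigned char)~(1u << (object % CHAR_BIT));
}

void sketch_tr_clear(SKETCH_TR_INFO *t)
{
    assert(t != NULL);
    memset(t->tvec, 0, sketch_tr_bytes(t->highestbit));
}

void sketch_tr_free(SKETCH_TR_INFO *t)
{
    if (t != NULL) {
        free(t->tvec);
        free(t);
    }
}
```

In real code you would call the dbh_tr_* routines instead and treat tvec as opaque, since the stored layout may differ from the one assumed here.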



2.1.13 Test History Information Handler Routines

Relevant files

    dbh_th.h          header file

Handler routine prototypes

    DB_FILE *dbh_th_begin (char *test_hist)

        Begin accessing th information in file "test_hist".
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains history information.

    int dbh_th_read (DB_FILE *th_file, char *procedure_name,
                      DBH_TH_INFO **th_info_struct)

        Read test history information from the file pointed to by
        "th_file," concerning the procedure named in "procedure_name."
        Returns info in the struct pointed to by "*th_info_struct."  The
        function returns a 1 on success, and a 0 on failure.

    int dbh_th_end (DB_FILE *th_file)

        Ends access to the file pointed to in "th_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_th_create (char *test_hist)

        Create a test history file "test_hist."  Returns NULL on failure, 
        else returns pointer to DB_FILE structure that contains file information.

    int dbh_th_write (DB_FILE *th_file, char *procedure_name,
        DBH_TH_INFO *th_info_struct)

        Write test history information contained in the structure
        pointed to by "th_info_struct" on the procedure named in
        "procedure_name" into the file pointed to by "th_file."
        Returns 1 on success, 0 on failure.

    void dbh_th_free ( DBH_TH_INFO *th_info_struct)

        Frees storage allocated for "th_info_struct."

    void dbh_th_set( DBH_TH_INFO *th_info_struct, int object, int testid)

        Set the bit of the trace vector for test "testid" in "th_info_struct"
        that says node or edge "object" was executed by that test.

    int dbh_th_query( DBH_TH_INFO *th_info_struct, int object, int testid)

        See if the bit of the trace vector for test "testid" in "th_info_struct"
        that says node or edge "object" was executed by that test is set.
        Return 1 if it is, else 0.

    void dbh_th_unset( DBH_TH_INFO *th_info_struct, int object, int testid)

        Unset the bit of the trace vector for test "testid" in "th_info_struct"
        that says node or edge "object" was executed by that test.

    void dbh_th_clear( DBH_TH_INFO *th_info_struct )

        Clear all bits of the test history structure in "th_info_struct"

    int dbh_th_make_th_struct( DBH_TH_INFO **th_info_struct, int maxobject, int maxtest)

        Make a new "th_info_struct" sufficient to trace objects from 0
        to maxobject, for tests from 0 to maxtest.  
        Allocate the structure and clear all bits.

    void dbh_th_copy( DBH_TH_INFO *th_info_struct_1, int loc1,
                      DBH_TH_INFO *th_info_struct_2, int loc2 )

        Copy the th info in struct_2 at entry loc2 to struct_1 at entry loc1.

    int dbh_th_equal( DBH_TH_INFO *th_info_struct_1, int loc1,
                      DBH_TH_INFO *th_info_struct_2, int loc2 )

        See if the th infos in struct_1 at entry loc1 and struct_2 at loc2
        are equal.  Return 1 if so, else 0.

    int dbh_th_union( DBH_TH_INFO *th_info_struct_1, int loc1,
                      DBH_TH_INFO *th_info_struct_2, int loc2 )

        Union the th info in struct_2 at entry loc2 into struct_1 at entry loc1.

Notes on data structures

    Test history information is contained in DBH_TH_INFO structures.
    DBH_TH_INFO structures contain the following fields:

        highest_object_number  highest object number traced for this procedure

        highest_test_id        highest test id traced for this procedure

        object_arr             dynamically allocated array of TH_DATA structs
                               (indexed by 0 through highest_test_id)

    Trace information on individual tests is contained in TH_DATA.
    TH_DATA structures contain the following fields:

        object_id            the integer id of the test (set to -1 if unused)
                             (should more properly be called "test_id")
        
        *tvec                A bit vector [0...highest_object_number]; bit i
                             says whether object i (a node or branch) was
                             executed by this test.

    The following global variable is provided:

        int dbh_th_trace_type   2 indicates branch trace, 1 indicates statement.
                                Set by the dbh_th_begin call.
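
A test history holds one bit vector per test, so a coverage question such as "how many tests executed object k" becomes a loop over object_arr. The sketch below uses hypothetical stand-in types and an assumed bit layout (object i in bit i % CHAR_BIT of byte i / CHAR_BIT); real code would call dbh_th_query rather than read tvec directly.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring the documented fields; the real
 * TH_DATA and DBH_TH_INFO are defined in dbh_th.h. */
typedef struct {
    int object_id;          /* test id, or -1 if the slot is unused */
    unsigned char *tvec;    /* assumed layout: one bit per object */
} SKETCH_TH_DATA;

typedef struct {
    int highest_object_number;
    int highest_test_id;
    SKETCH_TH_DATA *object_arr;   /* valid indices: 0..highest_test_id */
} SKETCH_TH_INFO;

/* Read the bit for one object from one test's vector. */
static int sketch_th_bit(const SKETCH_TH_DATA *d, int object)
{
    return (d->tvec[object / CHAR_BIT] >> (object % CHAR_BIT)) & 1;
}

/* Count the tests whose trace vector marks the object as executed.
 * Unused test slots (object_id == -1) are skipped. */
int sketch_th_tests_covering(const SKETCH_TH_INFO *th, int object)
{
    int t, n = 0;

    assert(th != NULL && object >= 0 && object <= th->highest_object_number);
    for (t = 0; t <= th->highest_test_id; t++) {
        const SKETCH_TH_DATA *d = &th->object_arr[t];
        if (d->object_id != -1 && sketch_th_bit(d, object))
            n++;
    }
    return n;
}
```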



2.1.14 Call Graph Information Handler Routines

Relevant files

    dbh_cg.h          header file

Handler routine prototypes

    DB_FILE *dbh_cg_begin (char *program_name)

        Begin accessing cg information on program "program_name."
        Returns NULL on failure, else returns pointer to DB_FILE
        structure that contains file information.

    int dbh_cg_read (DB_FILE *prog_file, DBH_CG_INFO **cg_info_struct)

        Read call graph information from the database file pointed to by 
        "prog_file".
 
        Returns info in the struct pointed to by "*cg_info_struct."  The
        function returns a 1 on success, and a 0 on failure.

    int dbh_cg_end (DB_FILE *prog_file)

        Ends access to the file pointed to in "prog_file."
        Returns 1 on success, 0 on failure.

    DB_FILE *dbh_cg_create (char *program_name)

        Create a cg information file in the database for "program_name". 
        Returns NULL on failure, else returns pointer to DB_FILE structure 
        that contains file information.

    int dbh_cg_write (DB_FILE *prog_file,  DBH_CG_INFO *cg_info_struct)

        Write call graph information contained in the structure pointed 
        to by "cg_info_struct" into the database file pointed to by 
        "prog_file." Returns 1 on success, 0 on failure.

    void dbh_cg_free ( DBH_CG_INFO *cg_info_struct)

        Frees storage allocated for "cg_info_struct."

    DBH_CG_INFO *dbh_cg_new_node (const char *proc_name, int node_id,
                                  int defined, int number_of_edges)

        Constructor for DBH_CG_INFO objects.  Returns pointer to a
        DBH_CG_INFO, or NULL on error.

    DBH_CG_INFO *dbh_cg_find_node (LIST_HEADER *cg, const char *proc_name,
                                   int node_id)

        Looks up a call graph node corresponding to the search keys.
        Returns pointer to node, or NULL if not found.

    void dbh_cg_delete_list (LIST_HEADER *cg)

        Destructor for call graph lists.

Notes on data structures

    Call graph information is contained in DBH_CG_INFO structures.
    DBH_CG_INFO structures contain the following fields:

        proc_name         name of the procedure associated with node

        node_id           integer id for this call graph node

        defined           whether proc_name is defined in this source file

        number_of_edges   length of edge_array

        edge_array        array of outbound edges, contains node ids of
                          nodes that correspond to called procs
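
As an illustration of how the node fields fit together, the sketch below answers "does procedure A call procedure B" from edge_array. It is a simplified model: nodes are kept in a plain C array rather than the LIST_HEADER list the handler uses, edge_array is assumed to be indexed from 0, and all SKETCH_/sketch_ names and field types are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in mirroring the documented DBH_CG_INFO fields;
 * the real definition is in dbh_cg.h. */
typedef struct {
    const char *proc_name;
    int node_id;
    int defined;            /* nonzero if defined in this source file */
    int number_of_edges;    /* length of edge_array */
    int *edge_array;        /* node ids of called procedures (0-based here) */
} SKETCH_CG_INFO;

/* Look up a node by procedure name in a plain array of nodes.
 * Returns a pointer to the node, or NULL if not found. */
const SKETCH_CG_INFO *sketch_cg_find(const SKETCH_CG_INFO *nodes, int count,
                                     const char *proc_name)
{
    int i;

    assert(nodes != NULL && proc_name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(nodes[i].proc_name, proc_name) == 0)
            return &nodes[i];
    }
    return NULL;
}

/* Return 1 if caller has an outbound edge to the node callee_id. */
int sketch_cg_calls(const SKETCH_CG_INFO *caller, int callee_id)
{
    int i;

    assert(caller != NULL);
    for (i = 0; i < caller->number_of_edges; i++) {
        if (caller->edge_array[i] == callee_id)
            return 1;
    }
    return 0;
}
```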




3. Aristotle Tools

Aristotle program-analysis tools can be invoked independently of the Aristotle menus. All tools can be run from the command line; in fact, Aristotle menus simply invoke tools by issuing system calls that give the command line invocations for those tools. This section describes how the Aristotle tools are invoked.

3.1 Parser/Analyzers

The CFE Parser/Analyzer gathers intraprocedural control flow and local data flow information about each procedure in a C program.

To run CFE, type:

    cfe  <filename> 

In this invocation, the command runs the cfe executable. The parameter is the name of the C source file you wish to analyze.

CFE creates .cf, .du, .sym, and .map files in the database, which contain control flow, def-use, symbol-table and source-to-graph mapping information, respectively. These files can be displayed using the table printer or graph viewer tools. CFE also creates .cil files in the database directory which are for internal use of Aristotle.

When you run the Aristotle parser/analyzer from the menus, the menu system places a copy of the source file in the database, where it is available for use by the other tools.

3.2 Dataflow Analyzers

The dataflow analyzer gathers information on the flow of data in a procedure. You can perform dataflow analysis by running two executables from the command line. First, you run the dataflow analysis front end, dafe_driver. Then, you run the intraprocedural dataflow analyzer itself. The next sections describe these executables. NOTE: you cannot invoke the dataflow analyzer on a source file until you have run dafe_driver on that file.

Dataflow Analysis Front End

Dafe_driver lets you tailor your dataflow analysis in the following two ways:

  1. The .du file placed in the database by CFE lists local definition and use information for the functions in a source file, and distinguishes the occurrence types for variables that are defined and used in that file. See the User Manual, Section 5, Table Type 7, for a description of the output of the du viewer, and an explanation of occurrence types. Dafe_driver lets you specify the occurrence types you wish to consider during dataflow analysis. For example, if you wish to analyze data flow for scalar variables only, you can specify this using dafe_driver.
  2. Typically, intraprocedural dataflow analysis assumes that for each variable v defined or used in function f, if v is visible outside f, then definitions of v reach entry and call sites in f, and uses of v are reachable from exit and call sites in f. We call such analysis conservative. Dafe_driver lets you specify whether or not conservative analysis will be performed.

To run dafe_driver on a source file you must first ensure that .cf and .du information for that source file are present in the Aristotle database. This is typically accomplished by running CFE on the source file. You then invoke dafe_driver with the command:

    dafe_driver <filename> <list of occurrences> [FORCE|NOFORCE]

In this invocation, <filename> names the C source file you are analyzing. For <list of occurrences>, substitute one or more of the following constants, separated by commas, with no intervening spaces: SV,IV,PV,ALL. For the third argument, enter either the constant FORCE or the constant NOFORCE. The third argument is optional: the default is NOFORCE. For example, if you wish to perform conservative dataflow analysis on file foo.c, for SV and IV variable occurrences (the most typical dataflow analysis configuration), you type the command:

   dafe_driver foo.c SV,IV FORCE

On completion, dafe_driver writes a .dfi file to the database. This file serves as input to the dataflow analyzer.
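
If you drive dafe_driver from a C program (for example through system()), it can help to validate the occurrence list before building the command line. The sketch below is one way to do that; sketch_valid_occurrences and sketch_dafe_command are hypothetical helpers, not part of Aristotle, and they only format the command string described above without running it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if list is a comma-separated list, with no spaces, made up
 * only of the constants SV, IV, PV, and ALL; else return 0. */
int sketch_valid_occurrences(const char *list)
{
    static const char *const allowed[] = { "SV", "IV", "PV", "ALL" };
    const char *p = list;

    if (list == NULL || *list == '\0')
        return 0;
    for (;;) {
        size_t len = strcspn(p, ",");   /* length of the current token */
        size_t k;
        int ok = 0;

        for (k = 0; k < sizeof allowed / sizeof allowed[0]; k++) {
            if (strlen(allowed[k]) == len && strncmp(p, allowed[k], len) == 0)
                ok = 1;
        }
        if (!ok)
            return 0;
        if (p[len] == '\0')
            return 1;                   /* last token was valid */
        p += len + 1;                   /* skip the comma */
    }
}

/* Format a dafe_driver command line into buf.  A nonzero force selects
 * FORCE, zero selects NOFORCE.  Returns 1 on success, 0 if an argument
 * is invalid or the buffer is too small. */
int sketch_dafe_command(char *buf, size_t size, const char *filename,
                        const char *occurrences, int force)
{
    int n;

    assert(buf != NULL && size > 0);
    if (filename == NULL || !sketch_valid_occurrences(occurrences))
        return 0;
    n = snprintf(buf, size, "dafe_driver %s %s %s", filename, occurrences,
                 force ? "FORCE" : "NOFORCE");
    return n >= 0 && (size_t)n < size;
}
```

The resulting string can be passed to system(); quoting of unusual file names is left out of this sketch.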

Dataflow Analyzer

When you have run dafe_driver on your C source file, you may then perform dataflow analysis on that file.

Aristotle provides build_rdefs, which performs reaching definitions analysis on a source file. You run this analyzer by entering the command:

   build_rdefs <filename> 

In this invocation, <filename> names the C source file you are analyzing. The analyzer uses the .dfi information deposited in the database by dafe_driver to calculate reaching definitions, and deposits reaching definitions information into an .rd file in the database.

3.3 Graph Building Tools

Aristotle graph building tools can be invoked from the command line, or by direct function calls made from C programs. Tools for building control, data, program dependence, call, and interprocedural control flow graphs are available (control flow graphs are built by the parser/analyzer).

Control Dependence Graph Builders

Aristotle's control dependence graph builder, build_cdg, produces control dependence graphs for the C functions found in a C source file. To invoke build_cdg, you must first ensure that control flow graph and map information (the .cf and .map files created by CFE) for the source file you are analyzing are present in the database. You then run build_cdg by typing:

          build_cdg <program_name>

In this invocation, the parameter, <program_name>, is the name of the source file that contains the C functions you are building control dependence graphs for.

Build_cdg creates a .cd file in the database, which can be accessed using the cd handler, or displayed using the table printer or graph viewer tools.

Data Dependence Graph Builder

Aristotle's data dependence graph builder, BuildDDG, produces data dependence graphs for C functions found in a C source file.

To invoke BuildDDG, you must first ensure that control flow graph, local dataflow, and map information (the .cf, .du, and .map files created by CFE) for the source file you are analyzing are present in the database. You must also ensure that reaching definitions information (the .rd file created by build_rdefs) is present in the database.

To invoke BuildDDG from the command line, type:

          build_ddg <program_name>

In this invocation, the parameter, <program_name>, is the name of the source file that contains the C functions you are building data dependence graphs for.

BuildDDG creates a .dd file in the database, which can be accessed using the dd handler, or displayed using the table printer or graph viewer tools.

Program Dependence Graph Builder

Aristotle's program dependence graph builder, build_pdg, produces program dependence graphs for C functions found in a C source file.

To invoke build_pdg, you must first ensure that control dependence graph information (the .cd file created by build_cdg), data dependence graph information (the .dd file created by build_ddg), and control flow information (the .cf file created by the parser/analyzer) for the source file you are analyzing are present in the database.

To invoke build_pdg from the command line, type:

          build_pdg <program_name>

In this invocation, the first (and only) parameter, <program_name>, is the name of the source file that contains the C functions you are building program dependence graphs for.

Build_pdg creates a .pdg file in the database, which can be accessed using the pdg handler, or displayed using the table printer or graph viewer tools.
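The prerequisites listed for each builder can be checked before a builder is invoked. The following is a minimal sketch, not part of Aristotle: missing_db_files is a hypothetical helper, and it assumes that database files are named <program_name>.<extension> inside $ARISTOTLE_DB_DIR (for example calc.c.cf). Verify that naming against your own database directory before relying on it.

```shell
# Sketch only: missing_db_files is a hypothetical helper, not an Aristotle tool.
# ASSUMPTION: database files are named <program_name>.<ext> in $ARISTOTLE_DB_DIR.
# usage: missing_db_files <program_name> <ext>...
missing_db_files() (
    db=${ARISTOTLE_DB_DIR:?ARISTOTLE_DB_DIR is not set}
    prog=$1
    shift
    status=0
    for ext in "$@"; do
        file="$db/$prog.$ext"
        if [ ! -f "$file" ]; then
            # Report every missing prerequisite, not just the first one.
            echo "missing: $file"
            status=1
        fi
    done
    exit "$status"
)

# Usage (not run here): build the PDG only when its three inputs exist.
# missing_db_files calc.c cd dd cf && build_pdg calc.c
```

The function runs in a subshell, so its variables do not leak into the calling shell and its exit status can be used directly in an && chain.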

(Post)Dominance Tree Builder

Aristotle's (Post)dominance tree builder, BuildDOM, produces dominator and postdominator trees for the control flow graphs that correspond to C functions found in a C source file.

To invoke BuildDOM, you must first ensure that control flow graph information (the .cf file created by CFE) for the source file you are analyzing is present in the database.

To invoke BuildDOM from the command line, type:

          build_dom <program_name>

In this invocation, the parameter, <program_name>, is the name of the source file that contains the C functions you are calculating (post)dominance information for.

BuildDOM creates a .dom file in the database, which can be accessed using the dom handler, or displayed using a table printer tool.

3.4 Table Viewing Tools

You can view Aristotle database information using table viewers.

To invoke table viewers, you must ensure that the database files required by those viewers are present in the database. You may then invoke a table viewer by typing:

     <viewer> <program-name> [<output-filename>]

In this invocation, <viewer> is a viewer name, <program-name> is the name of the source file whose analysis information you wish to view, and <output-filename> is an optional argument that names the file to which the viewer should write information. If you omit the last argument, output is directed to the screen.

The following table lists the various types of information available in tabular form, and for each type of information, the viewer used to view that information, and the tools that must be run before the information on which the viewer depends will be available.

information                    viewer        dependencies
control flow graph             cf_printer    CFE
control dependence graph       cd_printer    CFE, BuildCDG
data dependence graph          dd_printer    CFE, DFA, BuildDDG
program dependence graph       pdg_printer   CFE, DFA, BuildDDG, BuildCDG, BuildPDG
source-code to node mapping    map_printer   CFE
symbol table information       sym_printer   CFE
local def-use information      du_printer    CFE
reaching defs information      rd_printer    CFE, DFA
source with line numbers       src_printer   (placement of source into database)
(post)dominator information    dom_printer   CFE, BuildDOM
call graph                     cg_printer    CFE, BuildCG

The User's Manual, Section 6, describes how to interpret table viewer output.
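Because every table viewer takes the same arguments, several tables can be written to files in one pass. The sketch below is illustrative only: dump_tables is a hypothetical helper, the output file naming (<program-name>.<viewer>.txt) is chosen here for the example, and it assumes each viewer accepts a path for <output-filename> and returns a non-zero status on failure.

```shell
# Sketch only: dump_tables is a hypothetical helper, not an Aristotle tool.
# It runs each named viewer as: <viewer> <program-name> <output-filename>
# ASSUMPTION: viewers accept a path as <output-filename> and fail with a
# non-zero exit status.
# usage: dump_tables <program-name> <output-dir> <viewer>...
dump_tables() (
    prog=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir" || exit 1
    for viewer in "$@"; do
        # e.g. cf_printer calc.c tables/calc.c.cf_printer.txt
        "$viewer" "$prog" "$outdir/$prog.$viewer.txt" || {
            echo "$viewer failed for $prog" >&2
            exit 1
        }
    done
)

# Usage (not run here), after CFE has analyzed calc.c:
# dump_tables calc.c tables cf_printer map_printer sym_printer du_printer
```

Only viewers whose dependencies in the table above have been satisfied should be passed to the helper.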

3.5 Graph Viewing

Aristotle uses the XVCG tool to display graphs. XVCG is an X11 software package for visualization of compiler graphs. The Aristotle distribution includes an XVCG binary for Solaris in the bin subdirectory. Since the bin directory should be in your path for running Aristotle, XVCG should be able to run without any further setup.

XVCG requires input files in a specific format, which Aristotle provides through the visigraph module. You can run visigraph from the command line by typing:

  visigraph -n<source_file> [-p<selected_procedure>] -g<graph_type> [-o<options>]

The <source_file> and <graph_type> options are required. There must not be a space following any of the flags. Source_file is the name of the program that was analyzed. Graph_type is a short abbreviation for the type of graph to display from the following list.

    Graph Type                  Abbreviation
    Control Flow Graph          cfg
    Dominator Tree              dt
    Post Dominator Tree         pdt
    Control Dependence Graph    cdg
    Data Dependence Graph       ddg
    Program Dependence Graph    pdg
    Call Graph                  cg

The <selected_procedure> option restricts the graph display to the given procedure. If it is omitted, all procedures in the source code are displayed. The option letters after the -o flag control the labeling of the graph nodes, as shown in the following list.

    Option                      Option Letter
    Source in nodes             s
    Line numbers                l
    Variable names              v
    Node types                  t
    Node numbers                n

Visigraph creates an input file for XVCG in your database directory. You can then display the graph of all procedures by typing:

  xvcg $ARISTOTLE_DB_DIR/<source_file>.#.<graph_type>.vcg

If you created an input file for a single procedure, display the graph using:

  xvcg $ARISTOTLE_DB_DIR/<source_file>.<selected_procedure>.<graph_type>.vcg

As an example, if we want to display the control flow graph for all procedures in the sample program named calc.c, with node numbers, we would use the following two commands:

  visigraph -ncalc.c -gcfg -on
  xvcg $ARISTOTLE_DB_DIR/calc.c.#.cfg.vcg

To display a dominator tree with line numbers and variable names for the procedure named push in calc.c, we would use:

  visigraph -ncalc.c -ppush -gdt -olv
  xvcg $ARISTOTLE_DB_DIR/calc.c.push.dt.vcg
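
The file naming pattern used in the commands above can be captured in a small shell function, which is convenient in scripts that generate several graphs. This is a sketch only: vcg_path is a hypothetical helper that merely rebuilds the file name and does not run visigraph or xvcg.

```shell
# Sketch only: vcg_path is a hypothetical helper, not an Aristotle tool.
# It prints the XVCG input file name that visigraph is described as creating.
# usage: vcg_path <source_file> <graph_type> [<selected_procedure>]
vcg_path() (
    db=${ARISTOTLE_DB_DIR:?ARISTOTLE_DB_DIR is not set}
    proc=${3:-}
    # '#' takes the place of the procedure name when all procedures are graphed.
    [ -n "$proc" ] || proc='#'
    printf '%s/%s.%s.%s.vcg\n' "$db" "$1" "$proc" "$2"
)

# Usage (not run here):
# visigraph -ncalc.c -ppush -gdt -olv
# xvcg "$(vcg_path calc.c dt push)"
```

Quoting the command substitution keeps the path intact if the database directory name contains spaces.
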
XVCG is sensitive to other applications that may be using a shared X11 colormap. XVCG may complain, color nodes oddly, or refuse to run if another application is displaying many colors. The Netscape web browser usually collides with XVCG. If you exit the other application and restart XVCG, the colors will straighten out.

XVCG is distributed under the terms of the GNU General Public License and is available by anonymous ftp at ftp.cs.uni-sb.de (134.96.254.254) in the directory /pub/graphics/vcg. There is a web site in Germany at the University of Saarland (http://www.cs.uni-sb.de:80/RW/users/sander/html/gsvcg1.html). More information, including source code and documentation, is available on the website. Please refer to the XVCG documentation for usage details.

3.6 Code Instrumentation and Test Trace Management

Instrumenting C code

The Aristotle Code Instrumenter generates instrumented C source code for a given source code file. At present, two versions of the instrumenter are available: the first, il-st, generates instrumented programs to collect statement coverage information; the second, il-bt, generates instrumented programs to collect branch coverage information.

You can invoke the instrumenter versions from the command line using the following commands:

    il-st-2  <source-filename>
    il-bt-2  <source-filename>

In this invocation, <source-filename> refers to a C program that you wish to instrument. The output will be in a file named <source-filename>.int.c in the current working directory. See the examples below. The commands take the named source file and, using analysis information found in the database, create a new version of the source file that, when executed, reports statements or branches traversed during that execution.

Before you can run the instrumenters on a source file, you must run the source file through CFE.

Compiling and Linking instrumented C code

To instrument a program, the instrumenter places calls into the program that reference routines in Aristotle libraries. These routines create bit vectors that hold coverage information, set bits in these vectors when program components are executed, and write out coverage information when programs terminate. To compile and link an instrumented file you must ensure that the compiler can access the Aristotle header files and Aristotle libraries that are required.

Suppose you wish to compile and link file "foo.int.c", which is an instrumented version of the source file "foo.c". Use the following command to compile the file into object file "foo.int.o":

   gcc -g -I<Aristotle>/headers -c -o foo.int.o foo.int.c

where <Aristotle> names the location of your Aristotle installation. The GNU gcc compiler is used for illustration. You should be able to use cc, also.

To link the object file "foo.int.o" that you just created, and output an executable file "foo.int.exe", issue one of the following commands:

  gcc -g -o foo.int.exe foo.int.o -L<Aristotle>/lib/ -lIPF_st -lglobalfunc 
  gcc -g -o foo.int.exe foo.int.o -L<Aristotle>/lib/ -lIPF_bt -lglobalfunc 

where again, <Aristotle> names the location of your Aristotle installation. The only difference between the two commands is the library specified by the "-l" option. The first command specifies the "IPF_st" library: use this when you're dealing with a file instrumented for statement coverage. The second command specifies the "IPF_bt" library: use this when you're dealing with a file instrumented for branch coverage.
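The compile and link steps differ only in the installation path, the file base name, and the coverage library, so they can be generated from those three values. The sketch below only prints the two gcc commands shown above and does not run them; instrumented_build_cmds is a hypothetical helper, and /opt/aristotle in the usage line is a placeholder for your own installation location.

```shell
# Sketch only: instrumented_build_cmds is a hypothetical helper.
# It prints (does not run) the compile and link commands described above.
# usage: instrumented_build_cmds <Aristotle> <basename> <st|bt>
instrumented_build_cmds() (
    root=$1
    base=$2
    kind=$3
    case $kind in
        st|bt) ;;   # st = statement coverage, bt = branch coverage
        *) echo "coverage kind must be st or bt" >&2; exit 2 ;;
    esac
    echo "gcc -g -I$root/headers -c -o $base.int.o $base.int.c"
    echo "gcc -g -o $base.int.exe $base.int.o -L$root/lib/ -lIPF_$kind -lglobalfunc"
)

# Usage (not run here; /opt/aristotle is a placeholder path):
# instrumented_build_cmds /opt/aristotle foo bt
# instrumented_build_cmds /opt/aristotle foo bt | sh    # to execute them
```

Printing the commands first makes it easy to confirm that the library matches the instrumenter version used before anything is compiled.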

Executing an instrumented program

After you have compiled and linked your instrumented executable, execute it just as you would execute a non-instrumented executable. Given an instrumented executable created by instrumenting source file "foo.c", when you run that executable, it writes a file named "foo.c.tr" to the database directory. Each time you run the instrumented executable, it writes a new version of this file, overwriting any version previously present in the database. Thus, to collect a number of coverage files, after running your executable you must save the file that was created in the database directory. One approach is to create a "test coverage database directory" and move or copy each trace file there. This will enable you to use the test history management and viewing tools described below.
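The save-after-each-run step can be scripted. The following is a minimal sketch under stated assumptions: run_and_save is a hypothetical helper, the trace directory must already exist, and the numeric names given to the saved files (0.tr, 1.tr, and so on) are simply one convenient choice. It relies only on the behavior described above, namely that the instrumented executable writes <source>.tr into the database directory.

```shell
# Sketch only: run_and_save is a hypothetical helper, not an Aristotle tool.
# ASSUMPTION: the instrumented executable writes $ARISTOTLE_DB_DIR/<source>.tr
# on termination, and <trace_dir> already exists.
# usage: run_and_save <trace_dir> <test_id> <source> <executable> [args...]
run_and_save() (
    db=${ARISTOTLE_DB_DIR:?ARISTOTLE_DB_DIR is not set}
    trace_dir=$1
    test_id=$2
    src=$3
    shift 3
    trace="$db/$src.tr"
    rm -f "$trace"      # make sure a stale trace is never mistaken for a new one
    "$@" || true        # run the test; the program's own exit status is ignored
    [ -f "$trace" ] || { echo "no trace file at $trace" >&2; exit 1; }
    mv "$trace" "$trace_dir/$test_id.tr"
)

# Usage (not run here): two tests of an executable built from foo.c.
# mkdir -p traces
# run_and_save traces 0 foo.c ./foo.int.exe input0.txt
# run_and_save traces 1 foo.c ./foo.int.exe input1.txt
```

Moving rather than copying the file leaves the database directory clean, so a run that fails to produce coverage output is detected instead of silently reusing the previous file.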

Viewing coverage information output by instrumented programs

Note that the term "trace file" is used loosely below to mean a file containing either branch coverage information or statement coverage information. Such files are not actually complete traces of execution.

You can view a single coverage file using Aristotle's trace viewer, as follows:

     tr_printer <trace-file-name> [<source-file>|"SHORT"] [<output-filename>]

In this invocation, <trace-file-name> is the name of the coverage file you wish to view, <source-file>, if specified, is the name of the source file whose coverage information you are viewing, and <output-filename> is an optional argument that names the file to which the viewer should write information.

Unlike other printers, the tr_printer does not assume the coverage file is in the database, and does not prepend the path to the database directory to the file name that you give; thus, you must give the full pathname (or relative pathname) to the file.

The second (optional) argument may be a source file name, or the string "SHORT". In the former case, the printer prints a table of statements or branches, indicating for each whether the statement or branch was marked "hit". To invoke the trace viewer in this fashion, you must ensure that the ".cf" and ".map" files that correspond to the instrumented source file are present in your database. If you specify "SHORT" (the default if you omit this parameter entirely), the printer simply prints a list of object ids for objects that were hit.

If you omit the output filename, output is directed to the screen.

There are ".tr" file handler routines available, if you wish to access coverage files from a program. See Section 2.1 of this manual for details.

Building test histories

A single coverage file shows you, for one test, which objects (statements or branches) in a program were traversed by that test. More useful, typically, is information that lists, for each test in a test suite, the objects traversed by that test. We call this a test history, and provide utilities for creating and viewing test history information.

You create a test history from a collection of coverage files. To create the history, you must first create a set of coverage files, as described above: run your tests on your instrumented executable and, after each run, save the resulting coverage file to a directory where it will not be overwritten by the next test execution. We'll refer to this directory as your "test coverage database directory". The names you give to your coverage files are important: read on to see why.

Run the test history builder by issuing the command:

     th_builder <test_dir> <test_history_file> <test_prog>

where <test_dir> is your test coverage database directory, <test_history_file> names the test history file you wish to create, and <test_prog> names the C source file that you instrumented.

The th_builder locates every ".tr" file in <test_dir> and creates a test history file that lists, for each test, the objects (statements or branches) that were traversed by that test.

To locate coverage files, the th_builder issues the UNIX command "ls -1 *.tr", piping its output through the command "sort -n", in <test_dir>, and constructs a history that accounts for every file thus listed, in the order in which it is listed. The th_builder assigns integer identifiers to each coverage file thus listed, beginning with 0. If you wish your tests to be assigned integer identifiers that you can easily map to their associated coverage information, you should name your coverage files "0.tr, 1.tr,...,n.tr". (Thus, you should consider assigning integer identifiers 0,1,...,n to your tests).
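To see in advance which identifier each coverage file would receive, the listing step described above can be reproduced by hand. This is a sketch only: preview_test_ids is a hypothetical helper that mimics the documented "ls -1 *.tr" piped through "sort -n" listing and numbers the result from 0. It is not th_builder itself, and it assumes th_builder numbers files exactly in that listing order, as the text states.

```shell
# Sketch only: preview_test_ids is a hypothetical helper, not th_builder.
# It mimics the listing step described above and numbers the files from 0.
# usage: preview_test_ids <test_dir>
preview_test_ids() (
    cd "$1" || exit 1
    # Same listing as described: ls -1 *.tr piped through sort -n.
    # awk then prefixes each name with its position, starting at 0.
    ls -1 *.tr | sort -n | awk '{ print NR - 1, $0 }'
)

# Usage (not run here):
# preview_test_ids traces
```

For a directory that contains 0.tr, 2.tr, and 10.tr, the function prints the pairs "0 0.tr", "1 2.tr", and "2 10.tr": numeric sorting places 10.tr after 2.tr, but the identifiers are positions in the list, so gaps in the file numbering shift every later identifier. Naming the files 0.tr through n.tr without gaps keeps file names and identifiers in step.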

Viewing test histories

To view a test history, issue the command:

     th_printer <test_history_file> <prog_name> ["COUNT"|"LIST"] [<output_file>]

where <test_history_file> names your test history file, <prog_name> names the source code file that you instrumented, the third (optional) parameter contains the string "COUNT" or the string "LIST", and the fourth (optional) parameter names the output destination for the printer.

Unlike other printers, the th_printer does not assume the test history file is in the database, and does not prepend the path to the database directory to the history file name that you give; thus, you must give the full pathname (or relative pathname) to the history file.

The third (optional) argument determines the format of the data output by the viewer. A "COUNT" lists objects, the number of tests that traverse them, and the percentage of total tests that this number represents. A "LIST" lists objects, and the identifiers (these being the identifiers assigned to coverage files by the test history builder, by the procedure detailed above) of tests that traversed them. The default value of this parameter, if you omit it, is "LIST".

If you omit the output filename, output is directed to the screen.

There are test history file handler routines available, if you wish to access test history file information from a program. See Section 2.1 of this manual for details.

Building and Viewing Edge Test Histories

Aristotle can build an edge test history from an existing branch test history file if control flow information is available in the database.

To build an edge test history, use the command:

  et_builder <program_name> <input_history> [<output_history>]

where <input_history> is a previously built branch test history file. If no <output_history> file is specified, the output is displayed on the terminal. The required control flow information must already exist in the Aristotle database, or building the edge test history will fail.

The edge test history file can be viewed using the th_printer program, just as for a branch test history file.


Updated November 14, 2005 by Jim Jones