Data Structures



Defining Data Structures

A data structure is a means of grouping a number of variables of different data types together into a single unit. Here is a very simple data structure:


    struct simple
    {
        int x;
        char ch;
    }; 

A definition of a data structure consists of the key word struct followed by a name for the data structure type, simple in this case. This is followed by the open brace (curly bracket) "{", then a list of all the variables to be enclosed within the data structure. To end the definition use the closing brace, "}", and a semicolon, ";".

This definition of the data structure does not allocate any memory, it only gives the compiler the "floor plan" of what one of these structures will look like with you declare one as in this line of code:

     struct simple S1; 


Using typedef: It is common practice among C programmers to use the special definition identifier typedef (short for type define) when defining a data structure. This allows you to create a new data type definition using the syntax:

 	
    typedef DataType NewTypeName; 


where DataType is the thing being defined and NewTypeName is the "alias" being defined. For example:

 	typedef int Integer;
 	typedef float Real; 


If you used these two typedefs you could then declare int and float variables using the Pascal names for these data types, thus:


 	Integer x; // Equivalent to int x 
 	Real f; // Equivalent to float f 
 

If you use typedef to define the simple data structure type it would look like this:


    typedef struct
    {
        int x;
        char ch;
    }simple; 

Now you can declare data structures of type simple without using the keyword struct:

    simple S1;

Declaring Structures In C++

In C++ the use of typedef is automatic. That is, when you define a data structure like this:


    struct simple
    {
        int x;
        char ch;
    }; 

It is automatically "typdefed" so you can then declare structures of this type without using the keyword struct. Thus:

    simple S1;


Accessing the Fields of Data Structures

The variables included in a data structure are referred to as fields. To access one of those fields you use the name of the data structure (the one you declare not the type name) and the name of the field, separated by a period:


    S1.x = 6;  /* Set the x field in the data structure S1 to 6 */
    S1.ch = 'a'; /* Set the ch field in the data structure S1 to 'a' */ 



Nested Data Structures and Arrays of Data Structures

It is also possible to include one data structure inside of another. If you first define a data structure, like simple above, then you can define another data structure that contains a simple data structure. Assume that the simple structure has been defined (Note: assuming C++ syntax, i.e. automatic typedef).


    struct Complex
    {
        char label[20];
        float av;
        simple sim1;
    }; 
 


Now declare a data structure of type Complex:
    Complex com1;


...and access the fields of com1 in this way:
    strcpy(com1.label, "Test Struct");
    com1.av = 2.5;
    com1.sim1.x = 12;
    com1.sim1.ch = 'z'; 
 
Note that to access the fields in sim1 inside com1 you use the name of the outer data structure, then the name of the field (sim1) to access the inner data structure, finally the name of the field inside the simple data structure. All are separated by periods.


Arrays of Data Structures

To declare an array of data structures use the same syntax you use to declare arrays of other data types. Again assume that simple has been defined.


    simple sims[5];


Now access the data structures in this array:

    sims[3].x = 32; /* Set the x field of the 4th data structure in the array to 32 */
    sims[3].ch = 'A' /* Set the ch field of the 4th data structure in the array to 'A' */
 




Pointers to Data Structures

You can declare pointers to data structures just as you can declare pointers to other data types. Again assume that simple has been defined.


    simple *sptr;	/* Declare a pointer to a simple data structure */
    simple sim1;	/* Declare a data structure of type simple */ 	
    sptr = &sim1;	/* Set sptr pointing to sim1 */ 
 

Now you can access the fields of sim1 using the pointer sptr, but you must use the pointer operator which consists of the dash, (-), and the greater than sign, (>).

    sptr->x = 64;	  /* Set the x field in sim1 to 64 */
    sptr->ch = 'Z'; /* Set the ch field in sim1 to 'Z' */
 

This also implies that I can call a function and pass it the address of a data structure and the function can then access, and change the fields of the data structure. For example:

    simple *sptr;
    simple sim1;
    sptr = &sim1;
    SomeFunction(sptr);   /* Call function and pass the address stored in sptr */
    SomeFunction(&sim1);  /* Call function and pass the address of sim1 */ 
 

The function might look something like this:

    void SomeFunction(simple *s)  /* Note: argument to the function is a pointer */
    {
        int i_var;
        char c_var;
        printf("Enter an int and a character\n");
        scanf("%d %c", &i_var, &c_var);
        s->x = i_var;	/* Set the x field of the structure to the value stored in i_var */
        s->ch = c_var;	/* Set the ch field of the structure to the value stored in c_var */
    }  
 

C++ adds the ability to create a "reference" function in which a reference (again, just the address of the structure) is passed to the function. If SomeFunction, above, were re-written as a reference function it would look like this (Note: the shift to all C++ syntax):

    void SomeFunction(simple& s)  // Note the address operator & after the type
    {
        int i_var;
        char c_var;
        cout << "Enter an int and a character\n";
        cin >> i_var >> c_var;
        s.x = i_var;	/* Set the x field of the structure to the value stored in i_var */
        s.ch = c_var;	/* Set the ch field of the structure to the value stored in c_var */
        }  
 

You can also define a data structure which contains a pointer to a data structure just like itself. In fact, all of the abstract data types, which you will study in this course depend on being able to do this. Example (using standard C syntax with typedef):

    typedef struct simpletype
    {
        int x;
        char ch;
        struct simpletype *next;
    } simple2; 


Pay careful attention to several features of this definition which relate to standard C syntax.
  1. The typedefed name of this structure type is simple2.
  2. In order to define a pointer to a data structure of this type inside the data structure you must have a name already defined, thus the use of the name simpletype in the heading.
  3. In defining the pointer you MUST use the keyword struct and the name from the header. You cannot use the name simple2 because at the point where the pointer is defined the compiler does not yet know the typedefed name.

Again, because C++ automatically typedefs structures you can use the simpler definition form like this:

    struct simple2
    {
        int x;
        char ch;
        simple2 *next;
    }; 




Dynamically Allocating Memory for Structures

In all of the abstract data types you will study in this course you do not know ahead of time how many data structures you will need in the abstract data type. Therefore you will need to dynamically allocate the memory for each data structure as it is needed. To do this you use the C function malloc() which means "memory allocation, or the C++ function new.

malloc()

This function takes one argument, the number of bytes to allocate, and returns the starting address of the block of memory. The prototype for this function l ooks like this:


     void *malloc(int size); 


This function returns a void pointer. This means it has no data type assigned to it. It is just a "raw" memory address. In order to use this block of memory as a data structure we must use a process called typecasting to convert this untyped void pointer into a pointer to our data structure type. Here is what a call to malloc() would look like to allocate a block of memory the size of a simple2. Note the use of the sizeof() function to get the number of bytes in a simple2 data structure. This saves having to count them.


     simple2 *spt;   /* Declare a pointer to a struct simple2 type */
    spt = (simple2 *)malloc(sizeof(simple2)); 


The code (simple2 *) is the typecast which converts the void pointer returned by malloc() into a simple2 pointer which can then be stored in the simple2 pointer spt. Now the fields of the data structure pointed to by spt can be access using pointer notation:

    spt->x = 32;
    spt->ch = 'a';
    spt->next = NULL; /* Set the pointer in the structure to NULL */ 



When you have finished using the data structure that was dynamically allocated you can "deallocate" it and return that memory to the OS for reuse. To "free" the memory call the function free() passing it the pointer to the memory to be "freed".

    free(spt); 


The Three Laws of Using the Keyword "struct" in C

  1. When you define a data structure you must use the keyword struct even if you are typedefing the structure.
  2. If you define a pointer within a data structure which will point to another structure of the same type you must use the keyword struct even if you are typedefing the structure.
  3. When you declare a data structure of a particular type you must use the keyword struct unless you typedefed the structure definition, in which case you only need the structure type name.


new

In C++ you can also dynamically allocate memory for a data structure using the new keyword. Note that new takes care of dynamically allocating memory and typecasting the memory address to the correct type:

    spt = new simple2();


When you finish with the data structure you can free it's memory by using the delete keyword:

    delete spt;