Complex Data Boundaries

Posted by Albert Gareev on Nov 02, 2009 | Categories: Heuristics

Complex Data Boundaries: overflow or type mismatch?

Complex data types are created by composing basic data types. A composition of values of the same type forms an array. A composition of values of different types forms a record.

There are internal (technical) rules defining how complex data types are managed. The rules are platform and programming language specific. 

Arrays and pointers

For each of its dimensions, an array has minimum and maximum index boundaries defined as numbers. In some programming languages array indexes begin with 0, in others with 1. With 0-based indexes this causes a lot of confusion, because:

  • First element should be accessed as Array(0),
  • Last element should be accessed as Array(Number_of_elements minus one) 

When reported to the screen, indexes should be converted to 1-based notation. When input is received, it should be converted to a 0-based index before addressing an array element.
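The two conversions can be sketched in Python, which uses 0-based indexes. This is a minimal illustration only; the names `to_zero_based` and `to_display_position` are hypothetical and not taken from any particular project.

```python
WEEK_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def to_zero_based(position: int, length: int) -> int:
    """Convert a 1-based position typed by a user into a 0-based index."""
    if not 1 <= position <= length:
        raise IndexError(f"position {position} is outside 1..{length}")
    return position - 1


def to_display_position(index: int) -> int:
    """Convert a 0-based index into the 1-based number shown on screen."""
    return index + 1


first = WEEK_DAYS[to_zero_based(1, len(WEEK_DAYS))]  # element at index 0
last = WEEK_DAYS[to_zero_based(7, len(WEEK_DAYS))]   # element at index 6
```

The explicit range check matters in Python in particular: negative indexes count from the end of a list, so an unchecked position of 0 would become index -1 and quietly return the last element instead of failing.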

To work around this issue, programmers use various approaches. For example, I have seen the code below on a few projects at different companies:

Months = Array("Zerourary", "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")

“Zerourary” month is defined to occupy the first slot, which has index 0.
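The same padding approach, sketched in Python for illustration; `month_name` is a hypothetical helper. The placeholder removes the index conversion, but the array now holds 13 elements, so 0 becomes a valid index that returns a nonexistent month unless the lower boundary is enforced explicitly.

```python
# Index 0 is filled with a placeholder so that MONTHS[1] is "January".
MONTHS = ["Zerourary", "January", "February", "March", "April", "May",
          "June", "July", "August", "September", "October", "November",
          "December"]


def month_name(number: int) -> str:
    """Return the month name for a 1-based month number (1..12)."""
    # The list itself accepts index 0, so the check cannot rely on
    # the array boundaries and has to name the business range.
    if not 1 <= number <= 12:
        raise ValueError(f"month number {number} is outside 1..12")
    return MONTHS[number]
```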

An array may occupy a static (predefined) amount of memory or dynamically allocated memory. Dynamically allocated memory is addressed through a pointer.

If a memory allocation operation fails, a NULL (empty) value is assigned to the pointer. Any attempt to address memory through the NULL pointer will cause a run-time error.
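Python has no raw pointers of its own, but its standard `ctypes` module can illustrate the idea. The sketch below creates a NULL pointer directly rather than through a failed allocation, which is an assumption made to keep the example small. `ctypes` detects the NULL access and raises an exception; in C the same access is undefined behaviour and typically crashes the process.

```python
import ctypes

# Calling a ctypes pointer type with no arguments creates a NULL pointer.
null_ptr = ctypes.POINTER(ctypes.c_int)()

is_null = not bool(null_ptr)  # NULL pointers evaluate to False

try:
    value = null_ptr[0]       # attempt to read memory under the NULL pointer
    outcome = "read succeeded"
except ValueError as error:
    outcome = f"run-time error: {error}"
```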

If memory is constantly allocated and never released, a program may exhaust the heap and fail with an out-of-memory error (and will most likely crash).

After memory is released, the pointer still holds the old address, which is no longer valid (a “dangling pointer”). If program code uses it to read or write memory, the result is unpredictable: it may cause an error, corrupt data, or crash the program.


A string is an array of characters. Some programming languages reserve 256 elements for a string type and store the actual number of characters in the element at index 0, so the string may hold up to 255 characters in total. Depending on the programming language and compiler directives, when that maximum length is exceeded the string is either silently truncated or an exception is raised automatically.
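A length-prefixed string of this kind can be modelled in Python as a rough sketch. The names `to_short_string` and `from_short_string` and the `strict` flag (standing in for a compiler directive) are hypothetical, and one byte per character is assumed.

```python
MAX_LENGTH = 255  # one element at index 0 holds the length, so 255 is the limit


def to_short_string(text: str, strict: bool = False) -> bytearray:
    """Store text in a 256-element buffer with the length kept at index 0."""
    data = text.encode("latin-1")  # assumption: one byte per character
    if len(data) > MAX_LENGTH:
        if strict:
            raise OverflowError(f"{len(data)} characters exceed {MAX_LENGTH}")
        data = data[:MAX_LENGTH]   # silent truncation
    buffer = bytearray(256)
    buffer[0] = len(data)
    buffer[1:1 + len(data)] = data
    return buffer


def from_short_string(buffer: bytearray) -> str:
    """Read back only the characters counted by the length element."""
    return buffer[1:1 + buffer[0]].decode("latin-1")
```

Lengths of 255 and 256 are the interesting test values here: the first must be stored in full, the second must be either truncated or rejected, depending on the mode.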

Enumerations and lists

Enumerations and lists define the set of values that can be taken. In addition to index boundaries, there may be further rules, as in the following examples:

  • an array must contain no more items than the enumeration has (e.g. week days reported in a timesheet)
  • an array may not contain duplicated items from the enumeration (e.g. week days should not be duplicated in the example above)
  • some items may not be duplicated while others may be (e.g. one can have multiple brothers, sisters, sons, and daughters but only one mother and one father; to make it more complicated, some countries allow multiple spouses)
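The rules in the list above can be sketched as simple validation functions in Python. All names here (`WeekDay`, `Relative`, `check_timesheet`, `check_relatives`) are hypothetical and only illustrate the kind of checks a program would need.

```python
from collections import Counter
from enum import Enum


class WeekDay(Enum):
    MON = 1
    TUE = 2
    WED = 3
    THU = 4
    FRI = 5
    SAT = 6
    SUN = 7


class Relative(Enum):
    MOTHER = 1
    FATHER = 2
    BROTHER = 3
    SISTER = 4


# Items that may appear at most once; all other relatives may repeat.
UNIQUE_RELATIVES = (Relative.MOTHER, Relative.FATHER)


def check_timesheet(days: list) -> list:
    """Return rule violations for the week days reported in a timesheet."""
    errors = []
    if len(days) > len(WeekDay):
        errors.append("more entries than week days")
    duplicates = [day.name for day, count in Counter(days).items() if count > 1]
    if duplicates:
        errors.append("duplicated days: " + ", ".join(duplicates))
    return errors


def check_relatives(relatives: list) -> list:
    """Return rule violations for a list of relatives."""
    counts = Counter(relatives)
    return [f"more than one {item.name.lower()}"
            for item in UNIQUE_RELATIVES if counts[item] > 1]
```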


Records may represent closely related values, like the year, month, and day of a date, or values that are related only business-wise, for example a last name and a SIN. Values stored in a record often require transformation before they are output to the screen. These transformation rules may change depending on regional settings and customization settings in the program.
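As an illustration, here is a small Python sketch of a date record and its display transformation. The region keys and patterns are assumptions invented for the example; they do not come from any real locale library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DateRecord:
    year: int
    month: int
    day: int


# Hypothetical regional settings mapped to display patterns.
DISPLAY_PATTERNS = {
    "US": "{month:02d}/{day:02d}/{year:04d}",
    "EU": "{day:02d}.{month:02d}.{year:04d}",
    "ISO": "{year:04d}-{month:02d}-{day:02d}",
}


def format_date(record: DateRecord, region: str) -> str:
    """Transform the stored record into the text shown on screen."""
    return DISPLAY_PATTERNS[region].format(
        year=record.year, month=record.month, day=record.day)


posted = DateRecord(year=2009, month=11, day=2)
us_text = format_date(posted, "US")   # "11/02/2009"
eu_text = format_date(posted, "EU")   # "02.11.2009"
```

The stored record is identical in both cases; only the output differs. Dates whose day is 12 or less are good test data, because a swapped day and month still looks like a valid date.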

Code that operates on records is written for a specific format. Code modules that pass data records back and forth should either follow the same format convention or use bridging functions that perform the conversion. This applies to both back-end and front-end functions.
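A bridging function pair might look like the following Python sketch. Both record formats are assumptions made up for illustration: a back-end record with a four-digit year, and a front-end record with a two-digit year in month, day, year order.

```python
from typing import NamedTuple


class BackEndDate(NamedTuple):
    """Format assumed for the back-end module: year, month, day."""
    year: int
    month: int
    day: int


class FrontEndDate(NamedTuple):
    """Format assumed for the front-end module: month, day, two-digit year."""
    month: int
    day: int
    short_year: int


def to_front_end(record: BackEndDate) -> FrontEndDate:
    """Bridge a back-end record to the front-end format."""
    return FrontEndDate(record.month, record.day, record.year % 100)


def to_back_end(record: FrontEndDate, century: int = 2000) -> BackEndDate:
    """Bridge a front-end record to the back-end format."""
    # The century is lost in the front-end format and has to be assumed.
    return BackEndDate(century + record.short_year, record.month, record.day)
```

A round trip through both functions is a cheap boundary test: a date from 2009 survives it, while a date from 1999 comes back as 2099 under the assumed default century.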

Testing the boundaries 

The approach 

Defects with these types of boundaries usually result in high-severity issues, but they can be caught relatively easily at the very early stages of development. Implementing automated unit tests is a better investment here than implementing automated functional tests.
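A unit test for such a boundary can be very small. The sketch below uses Python's standard `unittest` module; the function under test, `week_day_name`, is a hypothetical stand-in for real project code.

```python
import unittest


def week_day_name(number: int) -> str:
    """Hypothetical function under test: 1-based day number to name."""
    names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
    if not 1 <= number <= len(names):
        raise ValueError(f"day number {number} is outside 1..{len(names)}")
    return names[number - 1]


class WeekDayNameBoundaryTest(unittest.TestCase):
    def test_lower_boundary(self):
        self.assertEqual(week_day_name(1), "Monday")

    def test_upper_boundary(self):
        self.assertEqual(week_day_name(7), "Sunday")

    def test_below_lower_boundary(self):
        with self.assertRaises(ValueError):
            week_day_name(0)

    def test_above_upper_boundary(self):
        with self.assertRaises(ValueError):
            week_day_name(8)


def run_boundary_tests() -> unittest.TestResult:
    """Run the four boundary tests and return the collected result."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(WeekDayNameBoundaryTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The four tests cover both boundaries and the first invalid value on each side, which is where off-by-one mistakes in index conversion show up.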

However, there are a few good functional test sets that make it possible to quickly check the most common scenarios. Since complex data type boundaries are internal, the main trick is to make the code itself make a mistake.


  • Strings. Although the GUI may not allow a user to enter strings that exceed the boundaries, a tester may try to make the application itself produce a string of an inappropriate length. For example, the application may construct a full name as “last name comma first name” from manually entered first and last names, each of which is within its own limit.
  • Arrays and enumerations. Attempts to make the code assign a wrong value work very well with combo boxes (leave blank, try multi-select, try typing a value not present in the list), list boxes (select none, select all), radio groups (select none), and combinations of those; for example, try to enter two mothers.
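The string scenario from the list above can be sketched in Python as follows. Both limits are assumptions chosen for the example, not values from any real application.

```python
FIELD_LIMIT = 30      # assumed GUI limit for each name field
FULL_NAME_LIMIT = 60  # assumed size of the internal full-name field


def build_full_name(last_name: str, first_name: str) -> str:
    """Combine the names as "last name, first name"."""
    return f"{last_name}, {first_name}"


def fits_internal_field(full_name: str) -> bool:
    """Check the constructed string against the internal limit."""
    return len(full_name) <= FULL_NAME_LIMIT


# Both inputs are valid on their own: each is exactly at the GUI limit.
last_name = "L" * FIELD_LIMIT
first_name = "F" * FIELD_LIMIT
full_name = build_full_name(last_name, first_name)
overflow = not fits_internal_field(full_name)
```

Each input passes GUI validation, yet the combined string is 62 characters long because of the comma and space, two more than the assumed internal field can hold.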

The challenges 

The challenges in identifying and proving a defect are similar to the ones described in data container boundary testing.

1. No immediate visible reaction

An application might accept an invalid input and still appear to be fine. Making it output the value or use the value in some operation might help reveal a problem.

Boundary testing for a single value might require a whole scenario: post transaction – run processing – verify account balance.

2. Internal mess-up

A single value might “travel” a pretty complicated path while being passed from one function to another inside the program. If somewhere within that chain the value does not fit into a memory cell, the end result will be wrong even though the application accepted the input. Or the value could be stored properly in memory but get messed up while being saved to a file or database record.
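Such a chain can be imitated in Python with the standard `ctypes` and `struct` modules. The three steps and the 16-bit sizes below are assumptions chosen to show the effect, not a description of any particular program.

```python
import ctypes
import struct


def accept_input(text: str) -> int:
    """Input step: the value is parsed and accepted without complaint."""
    return int(text)


def store_in_16_bit_cell(value: int) -> int:
    """Processing step: the value passes through a signed 16-bit cell.

    ctypes does no overflow checking, so the value silently wraps around.
    """
    return ctypes.c_int16(value).value


def save_to_record(value: int) -> bytes:
    """Saving step: a record field of two bytes (signed, little-endian)."""
    return struct.pack("<h", value)  # raises struct.error if out of range


accepted = accept_input("40000")
stored = store_in_16_bit_cell(accepted)  # wraps around to -25536
```

The input step raises no error, and the wrong value only becomes visible when it is read back or used later. The saving step behaves differently: `struct.pack` refuses an out-of-range value instead of wrapping it, so the same input can fail in a different place depending on the path it travels.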

Test scenarios might be required to reveal those defects.

This work by Albert Gareev is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported.