Array or Set?

Posted by Albert Gareev on Mar 25, 2008 | Categories: Notes

What’s it about?

In the Math service functions library I have generic functions for both array and set data formats. Those who have kept track of my postings have probably noticed the functional redundancy and wondered why I define all the same functions for sets as I just did for arrays. In this post I explain my approach.

Formal Definitions

In Wikipedia, both Array and Set are defined as “a collection of elements”.
The written notation is the same too: both are presented as a comma-separated list of items.
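
Although the notation looks identical, the two readings behave differently. Here is a minimal Python sketch of that difference (Python is used for illustration only; the names `as_array` and `as_set` and the sample values are hypothetical and do not come from the library discussed in this post). The same comma-separated notation is read once as an array and once as a set.

```python
def as_array(text):
    """Read a comma-separated list as an array: order and duplicates are kept."""
    return [item.strip() for item in text.split(",")]


def as_set(text):
    """Read the same notation as a set: order and duplicates are irrelevant."""
    return frozenset(as_array(text))


first = "red, green, green, blue"
second = "blue, red, green"

# As arrays the two lists differ (length and element positions).
arrays_equal = as_array(first) == as_array(second)

# As sets they describe the same collection of distinct items.
sets_equal = as_set(first) == as_set(second)
```

With these inputs the array comparison fails while the set comparison succeeds, even though both lines are written in the same way.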


Quoting Wikipedia on arrays:

In computer science, an array data structure or simply array is a data structure consisting of a collection of elements (values or variables), each identified by one or more integer indices, stored so that the address of each element can be computed from its index tuple by a simple mathematical formula. […] Array structures are the computer analog of the mathematical concepts of vector, matrix, and tensor. Indeed, an array with one or two indices is often called a vector or matrix structure, respectively.

As test data, the array structure has the following pros and cons.

Pros:

  • Native support in programming languages
  • Quick processing

Cons:

  • Has to be defined and declared as a part of the base code
  • Array elements can only be data of a simple type: number, string
  • Requires structured storage (e.g. comma-separated text file, Excel spreadsheet, database)
  • Maintenance issues if the structure changes
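
The maintenance point can be illustrated with a small sketch. It assumes test data kept in comma-separated text and read into arrays; the column layout, the sample values, and the function name `load_rows` are made up for this example.

```python
import csv
import io


def load_rows(text):
    """Parse comma-separated text into a list of rows (arrays of strings)."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


# Original layout: login, password
original = "jsmith,secret1\nadoe,secret2\n"

# Later layout: a "role" column was inserted before the password.
changed = "jsmith,admin,secret1\nadoe,guest,secret2\n"

PASSWORD_INDEX = 1  # position hard-coded in the base code

password_before = load_rows(original)[0][PASSWORD_INDEX]
password_after = load_rows(changed)[0][PASSWORD_INDEX]  # now reads the role
```

Access by index is quick, but every script that relies on position 1 has to be revisited once the layout changes.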



Quoting Wikipedia on sets:

A set is a collection of distinct objects, considered as an object in its own right. Sets are one of the most fundamental concepts in mathematics. Developed at the end of the 19th century, set theory is now a ubiquitous part of mathematics, and can be used as a foundation from which nearly all of mathematics can be derived. […] The elements or members of a set can be anything: numbers, people, letters of the alphabet, other sets, and so on.

As test data, the set structure has the following pros and cons.

Pros:

  • The set concept is native to human language
  • Set structure can be defined and declared at run-time
  • Sets can be re-structured dynamically
  • Test data in sets is easy and convenient to maintain

Cons:

  • Lack of support in programming languages
  • Processing sets consumes more machine resources
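
A short sketch of the run-time flexibility listed above. Python happens to have a built-in `set` type, so it is used here only to show the idea; the account names are invented test data.

```python
# A set of test accounts defined at run-time from plain text.
accounts = set("jsmith adoe bjones".split())

# Re-structure dynamically: add and remove members without touching declarations.
accounts.add("guest")
accounts.discard("bjones")

# Combine with another set; duplicates collapse automatically.
admins = {"jsmith", "root"}
everyone = accounts | admins
regular_users = accounts - admins
```

Nothing here had to be declared up front: membership is decided by the data itself, which is what makes such test data convenient to maintain.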


The matter of choice

As we can see from the above, in the test automation context the choice is between convenience for people (sets) and convenience for machines (arrays).

The challenge

The main challenge for automation developers is to design and implement programming support for sets by utilizing existing technologies, and to embed it into the Data Model.
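
One way such support might look, as a rough sketch only: sets are kept as delimited strings and handled by a few helper functions built on ordinary string and array operations. The function names (`set_parse`, `set_contains`, `set_to_text`, `set_union`) and the comma delimiter are assumptions made for this example; they are not the functions from the library mentioned above.

```python
def set_parse(text, delimiter=","):
    """Split delimited text into a list of distinct, trimmed items."""
    items = []
    for raw in text.split(delimiter):
        item = raw.strip()
        if item and item not in items:  # linear scan keeps items distinct
            items.append(item)
    return items


def set_contains(text, item, delimiter=","):
    """Return True if item is a member of the set described by text."""
    return item.strip() in set_parse(text, delimiter)


def set_to_text(items, delimiter=","):
    """Write items back in the delimited notation."""
    return delimiter.join(items)


def set_union(left, right, delimiter=","):
    """Union of two sets given as delimited text, returned as delimited text."""
    return set_to_text(set_parse(left + delimiter + right, delimiter), delimiter)


browsers = "IE, Firefox"
more = "Firefox, Opera"
combined = set_union(browsers, more)
```

The repeated parsing and linear scans in this sketch are where the extra machine cost comes from, in line with the trade-off described earlier.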

This work by Albert Gareev is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported.