Today's bug challenge is a design bug.

Came across some C code like this awhile back (this is pseudo code and has all "work" elements removed from it):

void AddID(char * pszObjID)

{

char szID[8];

//Pad the ID with spaces

strcpy(szID," ");

strcat(szID,pszObjID);

strcat(szID," ");

//Check for the ID in the list

if ((!strstr(szIDList,szID)) && ((strlen(szIDList)+8)<MAX_LIST_SIZE))

//Add it to the list

strcat(szIDList,szID);

//…

}

The point of the code is to track the items added to the collection by ID. Each item is an object of sorts and has an integer ID. If the item is already on the list, we don’t want to add it again. We need to track the IDs because we persist them to disk after all items have been added. The ID numbers start at 10,000 and range up to 30,000. Typically, the list consists of a few hundred to a few thousand entries.

The code uses a char array to store the IDs so that it can use strstr() to search the list before adding a new ID to it. Each ID is padded on the left and right with a space in order to simplify this check.

I discovered all this because this code lived in a DLL that my app used and was AVing. It turned out that a different value was used to size the ID list than was used to constrain the number of IDs that could be added to the list. In other words, there was a buffer overrun – a fairly simple, garden-variety one.

Can you identify the design bug here?  Can you think of a way to improve this design (keep your solution in C/C++)? What alternative way of storing the list can you come up with that would not only be more efficient in terms of memory, but also facilitate quick searches for IDs? What mechanism can we employ to avoid the possibility of a buffer overrun altogether?