I'll always remember my first programming class (Beginner C++). The professor started out on the first day of class telling us that the course would only cover numeric data manipulation and no strings because they were too complicated. I did well in the class, but I felt that something was missing. I could create really cool applications that played with numbers and outputted strings, but I never played with strings. As you could probably imagine, as I wanted to do more and more in my programs, I came to the point where I needed to do string manipulations. So I learned how to do it...
Basics - The Guts of Strings
When we talk about a string in C, what we are really referring to is a null-terminated array of characters. In the standard C programming language, there is no actual type called a string. A string can be defined in two ways:
char s[128];
char *s;
where s is the name of our string.
The first definition automatically allocates (reserves) the memory needed to store a maximum of 127 characters and a terminating null character (0x00). Of course, we could opt to put fewer than 127 characters, followed by the null character, into our string instead. To look at a particular character in the string, we use indexing, just as we would access an integer in an array of integers. s[0] gives us the first character in the string. If s[0]==0x00, then our string has a length of 0.
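As a minimal sketch of the array form, the following fills a 128-byte array one character at a time and reads it back by index. The function names first_char_demo and is_empty are made up for this illustration; they are not part of any library.

```c
/* Hypothetical helper: builds the string "Hi" by hand in a 128-byte
   array and returns its first character. */
char first_char_demo(void)
{
    char s[128];    /* room for 127 characters plus the terminator */

    s[0] = 'H';
    s[1] = 'i';
    s[2] = '\0';    /* the string ends here; the rest of the array is unused */

    return s[0];    /* indexing works just like an array of integers */
}

/* Hypothetical helper: returns 1 if the string has length 0, that is,
   if its very first character is the terminating null character. */
int is_empty(const char *s)
{
    return s[0] == '\0';
}
```

Note that '\0' and 0x00 are the same value; the character form is just the more common way to write it.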
The second definition is a character pointer. Defining a character pointer does not actually put any of the computer's memory aside for our string. Before we can use this method, we have to make sure that the pointer is pointing to memory that it is safe for us to read/write. If we already have another string, t, and we want to read the contents of t after the 5th character, we could say:
s = t+5;
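A small sketch of the same idea as a function may make the consequence clearer: the pointer does not hold a copy, it points into the original string. The name after_n is hypothetical, and the sketch assumes t holds at least n characters.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to whatever follows the first
   n characters of t. Assumes t holds at least n characters. */
char *after_n(char *t, size_t n)
{
    return t + n;   /* no copy is made; the result points into t itself */
}
```

Because both names refer to the same memory, changing a character through the returned pointer also changes t.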
In order to write to a character pointer, we must first allocate memory as mentioned earlier. The function used to do this is void* malloc(size_t size). "malloc" is short for "memory allocate".
s = (char*)malloc(SIZE);
We can decide to allocate any size we wish to by simply changing the value of SIZE. In order to free up the allocated memory, the function void free(void *ptr) is used. This allows the memory we were using for our string to be "recycled". Once we have freed our memory, there is no guarantee that it is safe to access it - especially for writing. More about memory allocation can be found at the C Memory Management article.
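Putting allocation and release together, here is one possible sketch. The helper name copy_to_heap is invented for this example; it sizes the allocation from an existing string and checks the result of malloc(), which returns NULL when no memory is available.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src, or NULL if
   the allocation fails. The caller must pass the result to free(). */
char *copy_to_heap(const char *src)
{
    size_t size = strlen(src) + 1;   /* +1 for the terminating null character */
    char *s = malloc(size);

    if (s == NULL)
        return NULL;                 /* malloc can fail, so check before writing */

    strcpy(s, src);                  /* safe: s has exactly enough room */
    return s;
}
```

A caller would use the copy like any other string and then call free() on it exactly once, without touching the pointer afterwards.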
Standard String Functions - String Manipulation
The standard header for C string functions is string.h. This header file declares tons of useful string manipulation functions including, but not limited to:
- int strcmp(const char* s1, const char* s2)
- char* strcat(char* dest, const char* src)
- char* strcpy(char* dest, const char* src)
- size_t strlen(const char* s)
- char* index(const char* s, int c)
- char* strstr(const char* str, const char* substr)
Many other string functions exist, but using these basic string functions, most - if not all - basic string manipulations can be done. strcmp() is used to compare strings, strcat() is used to concatenate (combine) two strings, strcpy() copies a string, strlen() tells us the length of a string, strstr() searches for a sub-string in a string, and index() points to the first occurrence of a character in a string. Of course, any of these functions could fairly easily be done by hand, but why re-invent the wheel? Information on the string library can be found by simply typing man string.
Strcmp compares two strings and returns an integer indicating how they differ: a negative number if the first string sorts before the second, a positive number if it sorts after it, and 0 if the strings match.
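Since a match is signalled by 0, a bare if (strcmp(a, b)) tests for a difference, not for equality, which is a common source of bugs. A short sketch follows; the wrapper names same_string and order are made up for this example.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a and b hold the same characters. */
int same_string(const char *a, const char *b)
{
    return strcmp(a, b) == 0;   /* 0 means "no difference" */
}

/* Hypothetical helper: reduces the strcmp() result to -1, 0 or 1.
   Only the sign of the return value is meaningful, not its size. */
int order(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```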
Strcat appends the source string to the end of the destination string and returns a pointer to the destination. In order for this function to work (and not seg fault), the destination must have enough room for both strings plus the terminating null character.
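One way to respect the size requirement is to check it before calling strcat(). The sketch below assumes the caller knows the size of the destination buffer; the name append_checked is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result, including
   the terminating null character, fits in dest_size bytes.
   Returns 1 on success and 0 (leaving dest untouched) if it would not fit. */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    if (strlen(dest) + strlen(src) + 1 > dest_size)
        return 0;

    strcat(dest, src);
    return 1;
}
```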
Strcpy copies one string to another. The destination must be large enough to accept the contents of the source string, including its terminating null character.
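To see why the destination size matters, it may help to look at a hand-written equivalent. This is only a sketch of the behaviour, not how any particular library implements strcpy(), and the name my_strcpy is made up.

```c
#include <stddef.h>

/* A hand-written sketch of what strcpy() does: copy every character of
   src, including the terminating null character, and return dest.
   Nothing here knows how big dest is, so the caller must guarantee it. */
char *my_strcpy(char *dest, const char *src)
{
    size_t i = 0;

    while ((dest[i] = src[i]) != '\0')
        i++;

    return dest;
}
```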
Strlen returns the length of a string, not counting the terminating null character.
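The length of a string is not the same thing as the size of the array that holds it. As a sketch, the hypothetical helper room_left below uses strlen() to work out how many more characters a buffer can take.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns how many more characters can be appended
   to the string stored in a buffer of buf_size bytes while still leaving
   room for the terminating null character. */
size_t room_left(const char *buf, size_t buf_size)
{
    size_t used = strlen(buf) + 1;   /* characters in use plus the terminator */

    return used < buf_size ? buf_size - used : 0;
}
```

For a 128-byte array holding "Hi", strlen() reports 2 while sizeof reports 128, so 125 more characters fit.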
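Index returns a pointer to the first instance of a character in a string. If the character cannot be found in the string, a NULL pointer is returned. Note that index is a legacy BSD function (declared in strings.h on many systems); strchr is the standard C function that does the same job with the same arguments.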
Strstr returns a pointer to the first instance of a particular sub-string inside a string. If the sub-string cannot be found, a NULL pointer is returned.
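Both search functions return a pointer into the searched string, so subtracting the start of the string gives a position. The sketch below uses strchr(), the standard C counterpart of index(), and the helper names char_offset and substr_offset are invented for this example.

```c
#include <string.h>

/* Hypothetical helper: returns the offset of the first occurrence of c
   in s, or -1 if c does not occur. */
long char_offset(const char *s, int c)
{
    const char *p = strchr(s, c);

    return p != NULL ? (long)(p - s) : -1;   /* always check for NULL first */
}

/* Hypothetical helper: returns the offset of the first occurrence of sub
   in s, or -1 if sub does not occur. */
long substr_offset(const char *s, const char *sub)
{
    const char *p = strstr(s, sub);

    return p != NULL ? (long)(p - s) : -1;
}
```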
The above-mentioned string functions also have variations on how they work. For example, rindex (strrchr in standard C) does the same thing as index, but searches from the end of the string. Strncmp compares at most the first n characters of two strings, and strncasecmp does the same comparison while ignoring case.
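Two of these variations can be sketched with standard C functions only: strncmp() for a prefix test and strrchr(), the standard counterpart of rindex(), for finding the last separator in a path. The helper names has_prefix and base_name are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if s begins with prefix, 0 otherwise.
   Only the first strlen(prefix) characters are compared. */
int has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Hypothetical helper: returns a pointer to the text after the last '/'
   in path, or path itself if it contains no '/'. */
const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');   /* searches from the end */

    return slash != NULL ? slash + 1 : path;
}
```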
Now what do I do? - Conclusions
Using these simple string functions, you can do anything you want with strings. The very engine I wrote to display this page uses only these simple string functions for all parsing of input and files, formatting output, etc.
That's all well and good, but my specific question wasn't answered...
If you came to this site looking for an answer to a specific question and did not find the answer, feel free to ask me at my webmaster address. I will try to help you, and may even set up a page of questions/answers for other readers.