Thursday, January 29, 2009

Project 2: String Length Counter in C

This weekend I spent some time exploring C strings and comparing them with strings in C++, after doing a project in which we were required to write a word length counter without using the built-in strlen function (oh noes!).
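For the curious, counting a string's length by hand can look something like the sketch below. This is not my actual homework code, just a minimal illustration; the name my_strlen is made up for this post.

```c
#include <stddef.h>

/* Hypothetical helper (not from the assignment): walks the string one
 * char at a time until it hits the '\0' terminator, then returns how
 * many chars came before it. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);
}
```

Calling my_strlen("Kitty") gives 5, since the terminator itself is not counted.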

Strings in C++ are nothing like strings in C. I had a vague idea of this when I was doing the project, as I have used C++ strings in the past, and they don't require as much work as C strings. There is a very good reason for this: while C++ strings are very nice containers with nice built-in functions, C strings are, quite literally, nothing but a sequence of chars ending in a null character ('\0'). A C string can be expressed either as an array or as a pointer. The difference between the two is that an array is a pre-sized block of memory containing a line of chars, while a pointer is the address of a block of memory (not necessarily pre-sized) containing a line of chars. Arrays do decay into pointers when passed as parameters and such, but the two are still fundamentally different.
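One way to see the difference is to ask sizeof about each. The sketch below is only an illustration, and all of the function names in it are made up for this post: the array reports the size of its whole block, the pointer reports the size of an address, and an array passed to a function shows up on the other side as a plain pointer.

```c
#include <stddef.h>

/* An array knows its full size: 5 letters plus the '\0' terminator. */
size_t array_size(void)
{
    char array[] = "Kitty";
    return sizeof array;
}

/* A pointer only knows the size of an address, not of what it points to. */
size_t pointer_size(void)
{
    const char *pointer = "Cat";
    return sizeof pointer;
}

/* Inside a function, a string parameter is just a pointer. */
size_t size_seen_by_callee(const char *s)
{
    return sizeof s;
}

/* Passing the array makes it decay into a pointer to its first char. */
size_t decayed_size(void)
{
    char array[] = "Kitty";
    return size_seen_by_callee(array);
}
```

Here array_size() returns 6, while pointer_size() and decayed_size() both return sizeof(char *), whatever that happens to be on your machine.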

A very interesting thing about pointers and arrays is that, despite the fact that they are represented completely differently in memory, they can still be treated the same within this context. If you declare two variables:

char array[] = "Kitty";
char *pointer = "Cat";


you can use the same notation to access the individual characters in both, i.e.:

array[1] to get i
pointer[1] to get a

In the first case, the compiler will start at the address of the first character of array and move one char forward in order to get the value. In the second case, the compiler will fetch the address stored in pointer, add 1 to this value, and then finally go to that location to load the character.
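Here is a small sketch of that, using the same two variables. The function names are made up for illustration. The last function shows that the bracket notation is just shorthand for "add the index to the address, then look at what is there".

```c
/* Indexing into an array: start at the first char and move one forward. */
char second_of_array(void)
{
    char array[] = "Kitty";
    return array[1];
}

/* Indexing through a pointer: load the address, add 1, read that char. */
char second_of_pointer(void)
{
    const char *pointer = "Cat";
    return pointer[1];
}

/* pointer[1] and *(pointer + 1) mean exactly the same thing. */
int index_matches_arithmetic(void)
{
    const char *pointer = "Cat";
    return pointer[1] == *(pointer + 1);
}
```

second_of_array() returns 'i', second_of_pointer() returns 'a', and index_matches_arithmetic() returns 1.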

This is what, for me, makes C so interesting. It is far more low-level than C++, and as such, the fact that you're accessing values in memory is far more transparent. The malloc function I used in my homework literally sets aside a block of memory of the size indicated (returning a pointer to it), and keeps that memory allocated until it is either deallocated by free or until the program ends.
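As a rough sketch of what that looks like in practice (again, not my homework code, and copy_string is a name I made up), here is a function that uses malloc to make a copy of a string. The caller is responsible for handing the block back with free.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated copy of src,
 * or NULL if malloc could not find the memory. */
char *copy_string(const char *src)
{
    size_t len = 0;
    char *copy;

    /* Count the chars by hand, in the spirit of the assignment. */
    while (src[len] != '\0') {
        len++;
    }

    /* One extra byte for the '\0' terminator. */
    copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }

    /* Copy the chars and the terminator in one go. */
    memcpy(copy, src, len + 1);
    return copy;
}
```

A caller would do something like char *c = copy_string("Kitty"); and, once done with it, free(c); so the block does not stay allocated for the rest of the program.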

C++ string containers are built from a class template, and they let you do far less damage accidentally (though it is said you can pretty epically destroy the world if you do mess up). They do a lot more to automatically manage your memory for you. You can get a C string out of one by calling c_str() (which returns a const char *, and is necessary for a few file input functions, such as opening an fstream), but for all intents and purposes they are their very own, very easy to use data structure.

Anyway, I hope to do more C programming this year as a way to improve my knowledge of pointers, and hopefully allow me to start writing some more heavy-duty projects such as small compilers and what-have-you. Let the C adventures begin!
