Strings: C-strings vs. strings as objects
C-style strings
- Recall that a C-string is implemented as a null-terminated array of
type char
- No built-in string type in C. Must use character
arrays
- NOT every character array is a C-string. Only when terminated with
the null-character
- String literals in code ("Hello World", "John Smith") are implemented
as constant C-strings
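To make the null-terminator rule concrete, here is a minimal sketch. The names isCString and literalLength are hypothetical helpers written for this illustration; they are not part of any library. Only strlen comes from <cstring>.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a library function): reports whether a null
// character appears somewhere in the first `capacity` elements of arr.
bool isCString(const char* arr, std::size_t capacity)
{
    for (std::size_t i = 0; i < capacity; ++i)
        if (arr[i] == '\0')
            return true;
    return false;
}

// Hypothetical helper: a string literal is a constant C-string, so
// strlen can walk it until it reaches the terminating null character.
std::size_t literalLength()
{
    const char* name = "John Smith";   // 10 visible chars plus '\0'
    return std::strlen(name);          // counts chars before the '\0'
}
```

An array such as char v[5] = {'A', 'E', 'I', 'O', 'U'}; has no room for a terminator, so isCString(v, 5) returns false, while char w[6] = "AEIOU"; is a valid C-string of length 5.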
Library features
We have some features in the standard C++ libraries available to help us
work more easily with C-style strings:
- The <cstring> library: functions such as strcpy, strcat, and strcmp
- Special features in <iostream>:
- Special built-in functions for I/O handling of C-style strings,
like the insertion and extraction operators, get(), getline(),
etc
char str1[40];
cout << str1; // insertion operator for strings
cin >> str1; // extraction, reads up to white space
cin.get(str1, 40, ','); // reads to delimiter (comma)
cin.getline(str1, 40); // reads to delimiter (default delimiter
// is newline), discards delimiter
- Characters, strings, I/O -- You can review these features at
this link, from the pre-req course
- Also for your reference. Sample implementations of:
The DOWN side of C-strings
C-strings are pretty efficient, since they are just character arrays.
If you need a fixed size string that works fast, this is a great option.
But the user of the c-string has the responsibility for calling the
library functions correctly. They are not foolproof!
Some potential downsides of c-strings:
- Fixed length, when declared as a normal C-style array
- This is fine for things like database fields, where a fixed size on
each field is necessary
- Not so flexible for storing just any old string.
- c-string name really just acts as a pointer, since it's the name of an
array
- As such, always must be passed in and out of functions by address --
i.e. with pointer parameter type or return type
- Must be careful of array boundaries. Overflow of a c-string boundary
is not automatically detected, including in library functions, since
most of them receive only the string name (i.e. a pointer), not
its size
- Less intuitive function calls for common operations, like comparison
or copying:
// Consider: I want to assign "Hello World" to a string called greeting:
char greeting[40];
// can I now say the following? (Which would be pretty intuitive)
greeting = "Hello World"; // Legal? NO!
// Instead, I have to do it this way, with the strcpy function - (or write a loop):
strcpy(greeting, "Hello World");
// Comparison: This call doesn't compare contents, only pointers:
if (str1 == str2)
// For c-strings, use strcmp, or write your own loop:
if (strcmp(str1, str2) == 0) // then strings are the same
- To get a flexible sized string, a programmer would have to use dynamic
allocation, paying specific attention to the appropriate use of
new, delete, and any resizing needs
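As a rough illustration of the last two points, the sketch below builds a flexible-size result with new and compares contents with strcmp. The function names concatNew and sameContents are hypothetical, chosen for this example only.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a newly allocated C-string holding
// `first` followed by `second`. The caller owns the result and must
// release it with delete [].
char* concatNew(const char* first, const char* second)
{
    std::size_t len1 = std::strlen(first);
    std::size_t len2 = std::strlen(second);
    char* result = new char[len1 + len2 + 1];   // +1 for the null character
    std::strcpy(result, first);                 // copies first and its '\0'
    std::strcpy(result + len1, second);         // overwrites that '\0', appends second
    return result;
}

// Hypothetical helper: compares contents, unlike ==, which would
// compare only the two addresses.
bool sameContents(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}
```

A call such as char* s = concatNew("Hello ", "World"); must eventually be paired with delete [] s; by the caller, which is exactly the bookkeeping burden described above.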
The Almighty Null Character?
One of the larger pitfalls... Library functions for dealing with c-strings
are usually based on the expectation of a null-character to stop the
loop that is processing the c-string's contents. Is this ideal?
- What prints here?
char vowels[5] = {'A', 'E', 'I', 'O', 'U'};
cout << vowels; // is this a c-string?
- Here's an attempt to copy one c-string to another
char greeting[25] = "Take me to your leader"; // length 22, capacity 24
char welcome[10] = "Hello"; // length 5, capacity 9
strcpy(welcome, greeting); // anything to worry about?
- How about these completely sensible-looking calls, with plenty of
available space?
(Note that the last call passes the same array as both source and
destination. The behavior of strcat is undefined when the two overlap,
so it appears to work with some versions of the cstring library but
not with others).
char buffer[40] = "Dog"; // length 3, capacity 39
char word2[] = "food"; // length 4, capacity 4
strcat(buffer, word2); // buffer is now "Dogfood"
strcat(buffer, " breath"); // buffer is now "Dogfood breath"
strcat(buffer, buffer); // plenty of room for this, right?
- These calls all look okay to you? In that case, try them!
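One way to guard against the problems in the examples above is to pass the destination's size along with the pointer. The following is a minimal sketch under that assumption; safeCopy and safeAppend are hypothetical helpers, not functions from <cstring>.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest without writing past
// destSize chars. Truncates if needed and always null-terminates.
// Returns true only if all of src fit. dest and src must not overlap.
bool safeCopy(char* dest, std::size_t destSize, const char* src)
{
    if (destSize == 0)
        return false;
    std::size_t len = std::strlen(src);
    std::size_t count = (len < destSize) ? len : destSize - 1;
    std::memcpy(dest, src, count);
    dest[count] = '\0';
    return len < destSize;
}

// Hypothetical helper: appends src to the C-string in dest only if the
// result fits in destSize chars; otherwise leaves dest unchanged.
bool safeAppend(char* dest, std::size_t destSize, const char* src)
{
    std::size_t destLen = std::strlen(dest);
    std::size_t srcLen = std::strlen(src);     // measured before any writing
    if (destLen + srcLen + 1 > destSize)
        return false;
    std::memmove(dest + destLen, src, srcLen); // memmove tolerates overlap
    dest[destLen + srcLen] = '\0';
    return true;
}
```

Because safeAppend measures both lengths before it writes anything, appending a buffer to itself is well defined here, and a request that would not fit is refused instead of overflowing.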
A string wish list
As we enter the fantasy realm where only ideal strings abide, we ask:
how should they behave?
Different developers might have different notions, but here are some
basic properties that it would be nice to have for strings:
Building a String class
We can make the fantasy a reality! Just build a class to create a new
string type, which incorporates any desired features.
- Use dynamic allocation and resizing to get flexible capacity
- Do all dynamic memory management inside the class, so the user
of string objects doesn't have to!
- Use operator overloads if more intuitive notations are desired
- insertion, extraction operators for easy I/O
- comparison operators for easy comparisons, sorting, etc.
- operator+ for concatenation, if desired
- Build copy constructor and assignment operator for correct
assignment and pass-by-value abilities
- Build a conversion constructor to convert c-style strings to string
objects, for automatic type conversions!
- Could include conversion constructors for converting other
types to strings, too
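The features listed above might be sketched roughly as follows. This is an illustration only: the class name MyString and its member names are hypothetical, and it is not the class developed in lecture.

```cpp
#include <cstddef>
#include <cstring>
#include <iostream>

// Hypothetical class, for illustration only.
class MyString
{
    // insertion operator for easy output
    friend std::ostream& operator<<(std::ostream& os, const MyString& s)
    { return os << s.data; }

    // comparison operators built on strcmp
    friend bool operator==(const MyString& a, const MyString& b)
    { return std::strcmp(a.data, b.data) == 0; }

    friend bool operator<(const MyString& a, const MyString& b)
    { return std::strcmp(a.data, b.data) < 0; }

    // operator+ for concatenation
    friend MyString operator+(const MyString& a, const MyString& b)
    {
        MyString result;
        char* combined = new char[a.length + b.length + 1];
        std::strcpy(combined, a.data);
        std::strcpy(combined + a.length, b.data);
        delete [] result.data;             // release the empty string
        result.data = combined;
        result.length = a.length + b.length;
        return result;
    }

public:
    // conversion constructor from a c-string (also the default constructor)
    MyString(const char* s = "")
        : length(std::strlen(s)), data(new char[length + 1])
    { std::strcpy(data, s); }

    // copy constructor: deep copy, so each object owns its own array
    MyString(const MyString& other)
        : length(other.length), data(new char[length + 1])
    { std::strcpy(data, other.data); }

    // assignment operator: allocate the copy first, then release old data
    MyString& operator=(const MyString& other)
    {
        if (this != &other)
        {
            char* copy = new char[other.length + 1];
            std::strcpy(copy, other.data);
            delete [] data;
            data = copy;
            length = other.length;
        }
        return *this;
    }

    ~MyString() { delete [] data; }

    std::size_t size() const { return length; }

private:
    std::size_t length;   // number of characters, not counting '\0'
    char* data;           // dynamically allocated, null-terminated
};
```

With this sketch, MyString a = "Hello"; MyString c = a + " World"; compiles because the conversion constructor turns the literal into a temporary object, and all allocation and cleanup stay inside the class.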
An example String class
We will build a very basic start of this in class.
(Every part of this code that we build in class will be posted here as
a downloadable link)
Here's a link to the start of a string class -- partial only (in
progress). This reflects the code we did from scratch in class.
As an exercise, try and fill in the other functions that are not yet
defined here. Also add test calls into the driver program to test the
other features.