Characters and Strings
Recap
- Recall that a C-style string is a character array that ends with
the null character
- Character literals in single quotes
- String literals in double quotes
- "Hello World\n"
- Remember that the null-character is implicitly a part of any string
literal
- The name of an array acts as a pointer to the first element of an
array
- Therefore, C-style strings are closely related to pointers to
char (i.e. type char *), since the name of a
character array acts as a pointer to its first element
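The recap points above can be sketched in a few lines of C. The function name count_chars is hypothetical (it is not part of any library); it only illustrates that a string is a char array ending in the null character, and that the array's name can be used as a char pointer.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters that come before the
   terminating null character by walking a pointer along the array. */
size_t count_chars(const char *s)
{
    const char *p = s;          /* p starts at the first element */

    while (*p != '\0')          /* stop at the null character */
        p++;

    return (size_t)(p - s);     /* distance walked = number of characters */
}
```

For instance, count_chars("Hello World\n") would count 12 characters; the literal itself occupies 13 bytes because of the implicit null character.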
Characters: the ctype library
ctype.h is a library that contains useful character
handling functions. There are some conversion functions (which start with
"to") and a number of query functions (which start with "is")
- Conversion functions: These return the ASCII value of the
converted character
- int toupper(int c) - returns the uppercase version of
c if it's a lowercase letter, otherwise returns c
as is
- int tolower(int c) - returns the lowercase version of
c if it's an uppercase letter, otherwise returns c
as is
- Query Functions: These all return true (non-zero) or false (0),
in answer to the question posed by the function's name. They all take in
the ASCII value of a character as a parameter.
- int isdigit(int c) - decides whether the parameter is a
digit (0-9)
- int isalpha(int c) - decides whether the character is a
letter (a-z, A-Z)
- int isalnum(int c) - digit or a letter?
- int islower(int c) - lowercase letter? (a-z)
- int isupper(int c) - uppercase letter? (A-Z)
- int isxdigit(int c) - hex digit character? (0-9, a-f, A-F)
- int isspace(int c) - white space character?
- int iscntrl(int c) - control character?
- int ispunct(int c) - printing character other than space,
letter, digit?
- int isprint(int c) - printing character (including ' ')?
- int isgraph(int c) - printing character other than ' '
(space)?
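As a small illustration of the conversion and query functions, here is a sketch with two hypothetical helpers, to_upper_in_place and count_digits (names chosen for this example only). The casts to unsigned char are a common precaution, since the ctype functions expect a value in the unsigned char range (or EOF).

```c
#include <ctype.h>

/* Hypothetical helper: converts every lowercase letter in s to uppercase. */
void to_upper_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical helper: counts how many characters of s are digits (0-9). */
int count_digits(const char *s)
{
    int count = 0;

    for (; *s != '\0'; s++)
        if (isdigit((unsigned char)*s))   /* non-zero means "yes" */
            count++;

    return count;
}
```

Calling to_upper_in_place on a modifiable array holding "abc123xyz" would leave the digits untouched, since toupper returns non-letters as is.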
String declarations
- We've seen how to declare a character array and initialize with a
string:
char name[20] = "Marvin Dipwart";
Note that this declaration creates an array called
name (of size 20), which can be modified.
- Another way to create a variable name for a string is to use just a
pointer:
char* greeting = "Hello";
However, this does NOT create an array in memory that can be
modified. Instead, this attaches a pointer to a fixed string, which is
typically stored in a "read only" segment of memory (cannot be changed).
So it's best to use const on this form of declaration:
const char* greeting = "Hello"; // better
- Note: It would be legal to modify the contents of name
above, but it would NOT be legal to modify the contents of
greeting:
name[1] = 'e'; // name is now "Mervin Dipwart"
greeting[1] = 'u'; // ILLEGAL!
- These two examples illustrate the above declarations (name
and greeting). Note that in both of them, output of the string
is done the same way, regardless of the type of declaration. But they
differ in the attempts to change the string contents:
- str1.c -- This one
uses the first declaration of greeting (non-const pointer).
Note that the attempt to change the contents will compile, but
the behavior is undefined; execution typically results in a run-time error
- str2.c -- This one
uses the second declaration of greeting (const used
with the pointer). The attempt to change the target, therefore, will
not compile.
String I/O:
- Recall that in the special case of arrays of type char, which
are used to implement C-style strings, we can use these in printf
and scanf calls with the special format specifier %s:
char greeting[20] = "Hello, World";
printf("%s", greeting); // prints "Hello, World"
char lastname[20];
scanf("%s", lastname); // reads a string into the array 'lastname'
// adds the null character automatically
- Also remember the following:
- The usual address-of operator (&) is not needed in the
above scanf call, because lastname (the name of an
array) already acts as a pointer -- i.e. it is an address
- The scanf call above only reads up to the first white-space
character.
- These examples only apply to the special case of the
character array.
char lastname[20];
scanf("%s", lastname); // if the user types "van Buren", the
// lastname array now stores "van"
int list[5] = {1, 2, 3, 4, 5};
printf("%s", list); // NO NO NO! This doesn't print array
// contents. It's not a char array!
- Clearly, the above scanf example is only good for reading one
word at a time. What if we want to read in a whole sentence into a
string? Well, there are some other library functions worth knowing
about.
- Example illustrating reading a
string with scanf
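The "stops at white space" behavior of %s can be sketched without keyboard input by using sscanf, which applies the same format rules to a string instead of standard input. The helper name first_word is hypothetical. The width 19 in "%19s" is an assumption tied to the 20-element arrays used above: it leaves room for the null character and keeps long input from overflowing the array.

```c
#include <stdio.h>

/* Hypothetical helper: copies the first white-space-delimited word of
   line into out (an array of at least 20 chars).
   Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *line, char *out)
{
    /* %19s reads at most 19 non-white-space characters and then
       appends the null character, just as scanf("%19s", out) would. */
    return sscanf(line, "%19s", out) == 1;
}
```

Given the line "van Buren", only "van" would be stored, which matches the scanf behavior described above.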
Character and String I/O functions
These functions all come from stdio.h
- int getchar(void) -- reads the next character from
standard input and returns it as an integer (ASCII value), or EOF when
no more input is available
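- char* gets(char* s) -- reads characters into the array
s until a newline or end-of-file is reached. The newline is discarded
and the null character is appended automatically
- This is a function that would allow input of a whole sentence, not
just one word. Stops at newline, not at all white space.
- Caution: gets cannot limit how many characters it stores, so long
input overflows the array. It was removed from the language in C11;
fgets(s, size, stdin) is the bounded alternative (it keeps the newline).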
- int putchar(int c) -- prints one character
(c) to standard output
- int puts(const char* s) -- prints the string
s, followed by a newline
- These two functions are just like printf and scanf,
except that they both have one extra parameter at the beginning --
a string. Instead of writing to standard output (i.e. the console),
sprintf writes its formatted data into the string s; instead of reading
from standard input, sscanf reads from the string s:
- int sprintf(char* s, const char* format, ... )
- int sscanf(const char* s, const char* format, ... )
- Here's an example illustrating
reading a string with gets(), and printing it with
puts()
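A minimal sketch of whole-line input and of string-based formatting follows. All three helper names (read_line, format_point, parse_point) are hypothetical. Two substitutions are made on purpose and should be read as assumptions of this sketch: fgets stands in for gets so that the read is bounded, and snprintf (the size-limited sibling of sprintf, available since C99) stands in for sprintf.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line (up to size-1 chars) from in,
   removing the trailing newline if present. Returns 1 on success,
   0 on end-of-file or error. */
int read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the newline */
    return 1;
}

/* Hypothetical helper: "prints" a point into the string s, e.g. "(3, 4)". */
void format_point(char *s, size_t size, int x, int y)
{
    snprintf(s, size, "(%d, %d)", x, y);
}

/* Hypothetical helper: reads a point back out of a string such as "(3, 4)".
   Returns 1 if both numbers were read, 0 otherwise. */
int parse_point(const char *s, int *x, int *y)
{
    return sscanf(s, "(%d, %d)", x, y) == 2;
}
```

A typical use would be read_line(sentence, 100, stdin) followed by puts(sentence), where sentence is a char array of size 100.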
The standard C string library:
The standard string library in C is called string.h. To
use it, we place the appropriate #include statement in a code file:
#include <string.h>
This string library contains many useful string manipulation functions.
A few of the more commonly used ones are mentioned here. (The
textbook contains more detail in chapter 8)
- strcpy - copies the contents of one string to another
- strcat - concatenates one string with another
- strcmp - compares two strings and determines their
lexicographical order
- strncpy, strncat, strncmp - these do the same as
the three listed above, but work on at most the first n characters of the
string
- strlen - calculates the length of a string
- strtok -- string tokenization. Used to break up a string into
smaller ones
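The functions listed above might be combined as in the following sketch. The helper names build_full_name and count_words are hypothetical, and the length check before copying is a design choice of this example (strcpy and strcat do not check the destination size themselves).

```c
#include <string.h>

/* Hypothetical helper: builds "first last" in dest.
   Returns 1 on success, 0 if dest (of the given size) is too small. */
int build_full_name(char *dest, size_t size, const char *first, const char *last)
{
    /* room needed: both names, one space, and the null character */
    if (strlen(first) + 1 + strlen(last) + 1 > size)
        return 0;

    strcpy(dest, first);   /* dest now holds first */
    strcat(dest, " ");     /* append a space */
    strcat(dest, last);    /* append last */
    return 1;
}

/* Hypothetical helper: counts the words in s using strtok.
   Note that strtok modifies s, so s must be a modifiable array. */
int count_words(char *s)
{
    int count = 0;
    char *token = strtok(s, " \t\n");      /* first call: pass the string */

    while (token != NULL) {
        count++;
        token = strtok(NULL, " \t\n");     /* later calls: pass NULL */
    }
    return count;
}
```

strcmp fits the same pattern: strcmp("apple", "banana") returns a negative value because "apple" comes first in lexicographical order, and 0 is returned only when the two strings are equal.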
Here is an example of the prototype of one string function:
char* strcpy( char * s1, const char * s2 );
This function copies the contents of the second string (s2) into the
first (s1) and returns s1. Note that both parameters use Pass By Address:
the two strings are passed in by name. Since strings are
character arrays, we are really passing the address of each string (char
array) into the function, and not the entire string itself. Since
the addresses are stored in the local parameters s1 and s2, the function
has access to the original strings. Note also that the second parameter
is declared const -- this ensures that the string being copied is not modified
by the function. The first parameter cannot be const, because this
is where we are copying to.
The definitions of these functions are easily written with pointer and
array techniques that we have already seen.
This link shows some possible
ways of defining some of these functions.
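As one possible illustration (not necessarily the definitions the course materials use), here is a sketch of how strcpy and strcmp could be written with the pointer techniques already covered. The names my_strcpy and my_strcmp are hypothetical, chosen to avoid clashing with the real library functions.

```c
/* Sketch of a strcpy-like function: copies s2 into s1, including the
   null character, and returns s1. The caller must make sure s1 is
   large enough. */
char *my_strcpy(char *s1, const char *s2)
{
    char *start = s1;                 /* remember where s1 began */

    while ((*s1++ = *s2++) != '\0')   /* copy, then stop after the null */
        ;

    return start;
}

/* Sketch of a strcmp-like function: returns a negative value, zero, or a
   positive value when s1 is less than, equal to, or greater than s2. */
int my_strcmp(const char *s1, const char *s2)
{
    /* advance while the characters match and s1 has not ended */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```

Because the parameters hold addresses, my_strcpy writes directly into the caller's array, while the const on s2 keeps the source string from being modified.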