CTEC1335/2008F

Strings
=======

A string is a sequence of characters (char values).

Two types of strings are available in C++:

C-style strings:
---------------

A C-style string is a sequence of characters terminated by a zero byte
(null terminator, ASCII value 0, char literal is '\0').

A C-style string constant:

    const char * const HELLOC = "Hello" ;

which is stored in memory as:

    4 bytes for the named constant (HELLOC), which is a pointer.
    All pointers on Intel x86, Windows XP, Microsoft C/C++ are 32 bits
    in size. (A 64-bit build uses 8-byte pointers instead.)

    HELLOC           "Hello"
    +-----+          +-+-+-+-+-+--+
    |     | ------>  |H|e|l|l|o|\0|
    +-----+          +-+-+-+-+-+--+

    6 bytes for the string literal ("Hello"), which is stored with the
    added terminating null character. (These 6 bytes may be padded out
    to 8 bytes with two additional zero bytes for memory alignment
    purposes.)

The two "const" keywords refer to the characters themselves and to the
pointer, in that order. The characters in the constant cannot be changed,
and the location of the string that the constant "points at" cannot be
changed.

C-style String Variable:

    char name [ 80 ] ;  // actually a character array
                        // it can store 79 characters plus the terminating null

C++ Strings
-----------

The ANSI/ISO C++ Standard finally standardized a string class (type),
which makes the storage and manipulation of strings really convenient
for programmers.

    #include <string>   // for std::string class
    using namespace std ;

C++-style string constant:

    const string HELLOCPP = "Hello" ;

  _or_

    const string HELLOCPP2( "Hello" ) ;

The storage of characters in a string object (constant or variable) is
managed by the string class code itself, and, as a programmer, you do not
have to know the details as you do with a C-style string. This is called
"encapsulation" or "information hiding."

C++-style string variable:

    string name ;               // initialized to an empty string,
                                // with zero length

  _or_

    string name( "no name" ) ;  // initialized using a string literal

Example program:
---------------

    #include <iostream>  // for cin, cout objects (istream and ostream classes)
    #include <string>    // for string class
    using namespace std ;

    int main( )
    {
        string name ;

        cout << "Enter your name: " ;
        getline( cin, name ) ;

        cout << "Hello there, " << name << "!" << endl ;
    }

The getline( ) function is used to read in a line of text from the
keyboard. It stores all of the characters typed in, up to the point where
the user presses Enter, in a string variable.

The declaration for getline( ) is similar to the following:

    istream & getline( istream & stream, string & buf, char terminator = '\n' ) ;

The 'stream' argument can either be cin or a file input stream (ifstream
variable). It is passed by reference because its information is updated
as each character is read. The 'stream' is also returned by the function
for "chaining" purposes. For example:

    if ( getline( cin, name ).good( ) )  // chain good( ) call
    {
        cout << "Hello there, " << name << "!" << endl ;
    }
    else
    {
        cout << "No name was entered." << endl ;
    }

Note that good( ) is also false when the read reached end-of-file, even
if some characters were stored, so many programs test the stream itself
instead: if ( getline( cin, name ) ).

The 'buf' argument is a string variable. The characters read from the
'stream' are copied into the string. The string's sequence of characters
and length are updated.

The 'terminator' argument has a default value, which means that it is
optional (hence the call with only 'cin' and 'name'). By default,
getline( ) will stop reading characters when a newline character ('\n')
is reached. The newline itself is not stored in the string. Using cin,
this character is generated by the user pressing the Enter key.

Using >> instead
----------------

There is an extraction operator (>>) for strings. However, after skipping
any leading whitespace, this operator stops reading characters when the
first "whitespace" character is encountered: a space (' ', ASCII 32),
tab ('\t', ASCII 9), newline ('\n', ASCII 10), carriage return ('\r',
ASCII 13) or form feed ('\f', ASCII 12).
This doesn't work too well for entering one's name, because you usually
type a space in between your given name and surname.

String Operations
-----------------

A string "knows" its length:

    int length( ) const ;

A string "knows" the position of each character:

    char at( int index ) const ;

(These declarations are simplified. In the standard library the length
and index types are string::size_type, an unsigned integer type.)

A string can be compared to another string:

    bool operator==( const string & s ) const ;
    bool operator<( const string & s ) const ;
    bool operator>( const string & s ) const ;
    bool operator<=( const string & s ) const ;
    bool operator>=( const string & s ) const ;
    bool operator!=( const string & s ) const ;

    string a ;
    string b ;
    . . .
    if ( a == b )
    {
        cout << "Match" << endl ;
    }
    . . .

A string can be joined ("concatenated") with another string to make a
third string:

    string operator+( const string & s1, const string & s2 ) ;

    string c = a + b ;

A string can be appended onto the end of another string:

    string & operator+=( const string & s ) ;

    string d ;
    . . .
    d += c ;

Strings can be printed:

    ostream & operator<<( ostream & stream, const string & s ) ;

    cout << a << endl ;

(The "cout << a" part, like every other operator applied to a string, is
actually implemented as a function call:

    operator<<( cout, a )

)

Basic Parser Structure
----------------------

    string line ;

    cout << "some prompt: " ;
    getline( cin, line ) ;

    // string::size_type is unsigned, which avoids a signed/unsigned
    // comparison against line.length( )
    for ( string::size_type i = 0 ; i < line.length( ) ; i++ )
    {
        char c = line.at( i ) ;

        // check the value of c here, etc.
        // . . .
    }