A database is an organized collection of data. Data is typically formatted into records (aka rows) and fields (aka columns). The format that we choose for our records and fields determines the file format that will be used when we save and retrieve our data to/from disk.
The easiest way to create a database is to use plain ASCII text. Fields are separated by spaces or tabs, and records are separated by newline characters (so that one record is saved per line). C++ treats spaces, tabs and newlines as whitespace, and the built-in iostream extraction operators skip any leading whitespace before reading a value.
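As a small illustration of the whitespace rule, the sketch below parses one record whose fields are separated by a mix of tabs, newlines and spaces. It uses the standard headers and an in-memory istringstream rather than a file; the names Row and parseRow are made up for this example and are not part of the program that follows.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record with one text field and two numeric fields.
struct Row
{
    std::string name;
    int wins;
    int losses;
};

// Parse one record from text. The fields may be separated by any mix of
// spaces, tabs and newlines, because operator>> skips leading whitespace
// before it reads each value.
Row parseRow(const std::string & text)
{
    std::istringstream in(text);
    Row r;
    r.wins = 0;
    r.losses = 0;
    in >> r.name >> r.wins >> r.losses;
    assert(!in.fail());   // assumption: the caller passes well-formed text
    return r;
}
```

Calling parseRow("Toronto\t32\n  20") should give the same result as parseRow("Toronto 32 20"), since the kind and amount of whitespace between fields makes no difference to the extraction operators.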
We use an ofstream to write the file, and an ifstream to read it back in. We write the record count (number of records in the database) as the first line of the file, so that we know how many records to read in later.
We declare a data structure, team, that stores one database record. Each of the structure's members represents one field in the record. We also define an insertion operator and an extraction operator to save and retrieve single records. We looked at insertion operators previously; the general form of an extraction operator is:
istream & operator>>( istream &, type & );
where type is your data type. You must pass the data by reference, because we want the function to change its value to whatever it read from the stream. If we passed by value, the function could only change a copy of the data, not the original.
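To make the by-reference point concrete, here is a minimal sketch with a made-up two-field type, Score. The operator follows the general form given above; readByValue and parseScore are hypothetical helpers added only for contrast and convenience, and the standard headers are assumed.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical two-field record used only for this illustration.
struct Score
{
    int goalsFor;
    int goalsAgainst;
};

// Extraction operator in the general form: the Score is passed by
// reference, so the values read from the stream end up in the caller's
// object.
std::istream & operator>>(std::istream & istr, Score & s)
{
    istr >> s.goalsFor >> s.goalsAgainst;
    return istr;
}

// For contrast: the Score is passed by value, so only the local copy is
// changed and the caller's object is left untouched.
void readByValue(std::istream & istr, Score s)
{
    istr >> s.goalsFor >> s.goalsAgainst;
}

// Convenience wrapper that parses a Score from a string.
Score parseScore(const std::string & text)
{
    std::istringstream in(text);
    Score s = { 0, 0 };
    in >> s;
    assert(!in.fail());   // assumption: text holds two integers
    return s;
}
```

After readByValue the caller's Score still holds its old values, whereas in >> s (or parseScore("181 168")) fills in 181 and 168.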
Like insertion operators, the iostream library has several "built-in" extraction operators. For example,
istream & operator>>( istream &, char * );
istream & operator>>( istream &, char & );
istream & operator>>( istream &, int & );
istream & operator>>( istream &, double & );
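Note that the string extraction operator takes a pointer (*) instead of a reference (&). That's OK: a pointer gives the function the same kind of indirect access that a reference does (compilers typically implement references as pointers). Since a string is an array of characters, a pointer to the start of that array lets the operator fill in the entire array. The operator has no way of knowing how big the array is, though, so the caller must make sure the array is large enough for the longest word that might be read.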
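One way to keep a character-array read within bounds is to set the stream's field width before extracting. The sketch below assumes the standard headers; MAX_NAME_CHARS and readShortName are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical limit on the number of characters we are willing to read.
const std::size_t MAX_NAME_CHARS = 7;

// Read one whitespace-delimited word into a fixed-size character array.
// Setting the stream width to the array size makes operator>> stop after
// at most sizeof buf - 1 characters, leaving room for the '\0'.
std::string readShortName(std::istream & istr)
{
    char buf[MAX_NAME_CHARS + 1] = "";
    istr.width(sizeof buf);
    istr >> buf;
    assert(std::strlen(buf) <= MAX_NAME_CHARS);
    return std::string(buf);
}
```

With the input "Montreal 23", the first call should return only the first seven characters of the name, and the rest of the word stays in the stream for the next read. The width applies to one extraction only and is reset afterwards.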
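// Simple text database using streams
// (standard header names; older compilers used <fstream.h> and <string.h>)
#include <fstream>
#include <iostream>
#include <cstring>
using namespace std;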
struct team
{
    char * name;
    int gp;
    int wins;
    int losses;
    int ties;
    int gf;
    int ga;
    int pts;
};
// Write a team to an ostream (cout, ofstream, etc.) in text form
ostream &
operator<<(ostream & ostr, const team & theTeam)
{
    // write only the essential data, i.e., that which cannot
    // be calculated
    // in our database, spaces are field (column) separators,
    // so a team name can't have spaces
    ostr << theTeam.name << ' '
         << theTeam.wins << ' '
         << theTeam.losses << ' '
         << theTeam.ties << ' '
         << theTeam.gf << ' '
         << theTeam.ga;
    return ostr;
}
// Read text-form team data from an istream (cin, ifstream, etc.)
// into a team
istream &
operator>>(istream & istr, team & theTeam)
{
    // assume no team name is longer than 19 letters
    char tmpname[20] = "";
    // limit the read to 19 characters plus the terminating '\0',
    // so that a longer name cannot overflow the buffer
    istr.width(sizeof(tmpname));
    istr >> tmpname;
    // allocate only as much memory as the name needs
    theTeam.name = new char [strlen(tmpname) + 1];
    strcpy(theTeam.name, tmpname);
    // read the important numeric data
    istr >> theTeam.wins
         >> theTeam.losses
         >> theTeam.ties
         >> theTeam.gf
         >> theTeam.ga;
    // calculate the others on-the-fly:
    // games played, and points (two per win, one per tie)
    theTeam.gp = theTeam.wins + theTeam.losses + theTeam.ties;
    theTeam.pts = 2 * theTeam.wins + theTeam.ties;
    return istr;
}
int
main()
{
    const int N_TEAMS = 6;
    // data taken from the Hamilton Spectator Sports section,
    // Monday, February 22, 1999
    const team NHL[N_TEAMS] =
    {
        { "Toronto", 56, 32, 20, 4, 181, 168, 68 },
        { "Montreal", 59, 23, 28, 8, 139, 154, 54 },
        { "Detroit", 59, 31, 23, 5, 175, 147, 67 },
        { "NewYork", 57, 23, 27, 7, 158, 159, 53 },
        { "Chicago", 59, 16, 32, 8, 131, 190, 40 },
        { "Boston", 56, 23, 24, 9, 142, 132, 55 }
    };
    const char * const FILENAME = "nhl.dat";
    // Open an output stream to the file and overwrite (throw away) any
    // previous contents
    ofstream out(FILENAME, ios::out | ios::trunc);
    if (!out)
    {
        cerr << "error saving " << '"' << FILENAME << '"' << endl;
    }
    else
    {
        out << N_TEAMS << endl;     // write the number of records
        for (int i = 0; i < N_TEAMS; i++)
        {
            out << NHL[i] << endl;  // write each record,
                                    // terminate with a newline
        }
    }
    out.close();    // close the stream, so we can re-open the file
                    // for reading ...
    ifstream in(FILENAME);
    if (!in)
    {
        cerr << "error restoring " << '"' << FILENAME << '"' << endl;
    }
    else
    {
        // we don't know in advance how many records there will
        // be, so we have to allocate the array size dynamically
        team * teamArr;
        int n_teams;
        // read in number of teams (the array size)
        in >> n_teams;
        teamArr = new team [n_teams];
        for (int i = 0; i < n_teams; i++)
        {
            in >> teamArr[i];   // read in each record
            // write the team to cout, but we have to
            // add the "games played" and "points" fields,
            // because our insertion operator (<<) doesn't
            // write them
            cout << teamArr[i] << ' '
                 << teamArr[i].gp << ' '
                 << teamArr[i].pts << endl;
        }
        // release the names allocated by the extraction operator,
        // then the array itself
        for (int j = 0; j < n_teams; j++)
        {
            delete [] teamArr[j].name;
        }
        delete [] teamArr;
    }
    return 0;
}
6
Toronto 32 20 4 181 168
Montreal 23 28 8 139 154
Detroit 31 23 5 175 147
NewYork 23 27 7 158 159
Chicago 16 32 8 131 190
Boston 23 24 9 142 132
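If we look at a hex dump of the database file, we see that each character is stored as its ASCII value, that the field separators are spaces (ASCII 20H), and that each row ends with a carriage return (ASCII 0DH) followed by a line feed (ASCII 0AH). This two-character row separator is what a text-mode stream writes for a newline under Windows 95; on other systems a newline may be stored as a single line feed.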
00000000: 36 0d 0a 54 6f 72 6f 6e 74 6f 20 33 32 20 32 30  6..Toronto 32 20
00000010: 20 34 20 31 38 31 20 31 36 38 0d 0a 4d 6f 6e 74   4 181 168..Mont
00000020: 72 65 61 6c 20 32 33 20 32 38 20 38 20 31 33 39  real 23 28 8 139
00000030: 20 31 35 34 0d 0a 44 65 74 72 6f 69 74 20 33 31   154..Detroit 31
00000040: 20 32 33 20 35 20 31 37 35 20 31 34 37 0d 0a 4e   23 5 175 147..N
00000050: 65 77 59 6f 72 6b 20 32 33 20 32 37 20 37 20 31  ewYork 23 27 7 1
00000060: 35 38 20 31 35 39 0d 0a 43 68 69 63 61 67 6f 20  58 159..Chicago 
00000070: 31 36 20 33 32 20 38 20 31 33 31 20 31 39 30 0d  16 32 8 131 190.
00000080: 0a 42 6f 73 74 6f 6e 20 32 33 20 32 34 20 39 20  .Boston 23 24 9 
00000090: 31 34 32 20 31 33 32 0d 0a                       142 132..
00000099
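If no hex dump utility is at hand, the byte values can be produced with a few lines of code. The following is only a sketch: hexBytes is a made-up helper that turns a string into space-separated two-digit hex values, and it does not reproduce the offset or ASCII columns of the dump above.

```cpp
#include <cassert>
#include <string>

// Convert each character of data to its two-digit hexadecimal value,
// separated by single spaces, e.g. "6\r\n" becomes "36 0d 0a".
std::string hexBytes(const std::string & data)
{
    const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::string::size_type i = 0; i < data.size(); i++)
    {
        // go through unsigned char so values above 7FH stay positive
        unsigned char byte = static_cast<unsigned char>(data[i]);
        if (i > 0)
        {
            out += ' ';
        }
        out += digits[byte / 16];   // high nibble
        out += digits[byte % 16];   // low nibble
    }
    // sanity check: two digits per byte plus one space between bytes
    assert(data.empty() || out.size() == 3 * data.size() - 1);
    return out;
}
```

Passing the first bytes of the file, "6\r\nToronto", should reproduce the start of the first row of the dump. To see the carriage returns in a real file, the file would have to be opened in binary mode, since a text-mode stream on Windows translates each CR LF pair back into a single newline.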
One limitation of our simple ASCII database is that string fields cannot contain spaces. Because spaces are used to separate fields, the program can't know whether a space is a field separator or part of the field! ... Unless we change the program.
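The problem is easy to reproduce in memory. In the sketch below, parseNameAndWins is a hypothetical helper that reads a name and a win count the same way our extraction operator does; it assumes the standard headers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Try to read "name wins" from one line of text using operator>>.
// Returns true only if both fields were extracted successfully.
bool parseNameAndWins(const std::string & line, std::string & name, int & wins)
{
    assert(!line.empty());   // assumption: the caller passes a non-empty line
    std::istringstream in(line);
    // the name extraction stops at the first whitespace character
    in >> name >> wins;
    return !in.fail();
}
```

For the line "NewYork 23" the call succeeds. For "New York 23" the name comes back as just "New", and the attempt to read "York" as an integer puts the stream into a failed state, so the function returns false.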
Before each string field, if we write out the length of the string as an integer, the program would know how long a field is. So, by adding an extra field to our database, we can make it more flexible.
// Write a text database in which each name is preceded by its length
// (standard header names; older compilers used <fstream.h> and <string.h>)
#include <fstream>
#include <iostream>
#include <cstring>
using namespace std;
struct teamStruct
{
    char * name;
    int wins;
    int losses;
    int ties;
    int gf;
    int ga;
    int gp;
};
int
main()
{
    const int N_TEAMS = 6;
    // data taken from the Hamilton Spectator Sports section,
    // Monday, February 22, 1999
    // the last value in each row is games played
    // (wins + losses + ties)
    const teamStruct NHL[N_TEAMS] =
    {
        { "Toronto", 32, 20, 4, 181, 168, 56 },
        { "Montreal", 23, 28, 8, 139, 154, 59 },
        { "Detroit", 31, 23, 5, 175, 147, 59 },
        { "New York", 23, 27, 7, 158, 159, 57 },
        { "Chicago", 16, 32, 8, 131, 190, 56 },
        { "Boston", 23, 24, 9, 142, 132, 56 }
    };
    ofstream out("nhl.db");
    if (!out)
    {
        cerr << "error saving " << '"' << "nhl.db" << '"' << endl;
        return 1;
    }
    out << N_TEAMS << endl;     // write the number of records
    for ( int i = 0 ; i < N_TEAMS ; i++ )
    {
        // the length written before the name tells the reader how
        // many characters (spaces included) belong to the name
        out << strlen( NHL[i].name ) << ' ' << NHL[i].name << ' '
            << NHL[i].wins << ' '
            << NHL[i].losses << ' '
            << NHL[i].ties << ' '
            << NHL[i].gf << ' '
            << NHL[i].ga << ' '
            << NHL[i].gp
            << endl;
    }
    return 0;
}
6
7 Toronto 32 20 4 181 168 56
8 Montreal 23 28 8 139 154 59
7 Detroit 31 23 5 175 147 59
8 New York 23 27 7 158 159 57
7 Chicago 16 32 8 131 190 56
6 Boston 23 24 9 142 132 56
To read each record in such a database, we first read in the length of the string field. This not only tells us how many characters are in the field, but also the number of characters we need to dynamically allocate -- using new. So it's both flexible (in terms of format) and efficient (in terms of memory usage).
We then use ws to skip over the field separator (whitespace), call get to read exactly that many characters, and use ws again to skip over the next field separator. We can then read all of the other fields as before.
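// Read back the text database in which each name is preceded by its length
// (standard header names; older compilers used <fstream.h> and <iomanip.h>)
#include <fstream>
#include <iostream>
#include <iomanip>
using namespace std;
struct teamStruct
{
    char * name;
    int wins;
    int losses;
    int ties;
    int gf;
    int ga;
    int gp;
};
int
main()
{
    teamStruct team;
    int n, length;
    ifstream in("nhl.db");
    if (!in)
    {
        cerr << "error restoring " << '"' << "nhl.db" << '"' << endl;
        return 1;
    }
    in >> n;                    // number of records
    cout << n << endl;
    for (int i = 0; i < n; i++)
    {
        // the length tells us how many characters to allocate
        // and how many to read
        in >> length;
        cout << length << endl;
        team.name = new char [length + 1];
        if (length > 0)
        {
            in >> ws;                       // skip the separator before the name
            in.get(team.name, length + 1);  // read at most length characters
            in >> ws;                       // skip the separator after the name
        }
        team.name[length] = '\0';
        // read the numeric fields in the order they were written
        in >> team.wins;
        in >> team.losses;
        in >> team.ties;
        in >> team.gf;
        in >> team.ga;
        in >> team.gp;
        cout << '"' << team.name << '"' << setw(3)
             << team.wins << setw(3) << team.losses << setw(3) << team.ties
             << setw(4) << team.gf << setw(4) << team.ga << setw(3) << team.gp << endl;
        delete [] team.name;
    }
    return 0;
}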
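The length-prefix scheme can also be checked without touching the disk, by writing to and reading from a stringstream. This is a sketch under a few assumptions: writeName and readName are names invented here, the standard headers are used, and a vector stands in for the new/delete pair of the program above.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Write a name preceded by its length and followed by a field separator.
void writeName(std::ostream & out, const std::string & name)
{
    out << name.size() << ' ' << name << ' ';
}

// Read a length-prefixed name: first the length, then (after skipping the
// separator) at most that many characters, spaces included.
std::string readName(std::istream & in)
{
    int length = 0;
    in >> length;
    assert(!in.fail() && length >= 0);   // assumption: well-formed input
    std::vector<char> buf(length + 1, '\0');
    if (length > 0)
    {
        in >> std::ws;                  // skip the separator before the name
        in.get(&buf[0], length + 1);    // read at most length characters
        in >> std::ws;                  // skip the separator after the name
    }
    return std::string(&buf[0]);
}
```

Writing "New York" followed by the numbers 23 and 27 should produce the text "8 New York 23 27", and reading it back with readName followed by two integer extractions should recover the name with its space intact. One caveat of this approach is that a name beginning with whitespace would lose it to the first ws, and get stops early at a newline, so names are assumed to start with a non-space character and to contain no newlines.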