>> All right, so just [inaudible]
the sets and maps a little bit,
let's look at the
WordBench homework assignment.
This is going to be an
assignment that uses sets
to accomplish something [inaudible].
So on successful completion,
and by the way, for midterms, you should
probably think about these objectives:
Define the concept of
associative container.
State the distinction between unimodal
and multimodal associative containers.
Give examples.
Describe use cases making
each type appropriate.
You need to think about that.
State the distinction between ordered
and unordered associative containers.
Give examples of each.
Describe use cases.
State the API for unimodal
ordered set, multimodal ordered set
and unimodal unordered set,
multimodal unordered set.
Don't try to say that too fast.
And describe the behavior and
state the runtime expectations
for these operations.
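To make the first two distinctions concrete, here is a minimal sketch that uses the C++ standard library containers as stand-ins for the course containers. It assumes that "unimodal" means duplicates are rejected and "multimodal" means duplicates are kept; the helper names StoredCount and SortedUnique are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Inserts every word into a container of type C and reports how many
// elements the container ends up holding.
template <class C>
std::size_t StoredCount(const std::vector<std::string>& words)
{
  C container;
  for (const std::string& w : words)
    container.insert(w);
  return container.size();
}

// Returns the words in the order an ordered, duplicate-free set visits them.
std::vector<std::string> SortedUnique(const std::vector<std::string>& words)
{
  std::set<std::string> s(words.begin(), words.end());
  return std::vector<std::string>(s.begin(), s.end());
}
```

With the input "the cat the hat", std::set and std::unordered_set keep three elements while std::multiset and std::unordered_multiset keep four. Only the ordered containers promise a sorted traversal.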
So, background knowledge: you
should be familiar with the set
and map introductory chapters.
And then let's talk about what
you're really supposed to do here.
You must create a client, WordBench,
and it's going to be a
client of the set API.
So it's going to be just
a wrapper around that API,
the set, and we've got two
examples that would work already.
It serves as a text analysis
application.
So let me explain more carefully.
wordbench.h and wordbench.cpp are
the two code file deliverables,
plus a makefile and a log file.
Begin by copying files from homework.
So here's the way WordBench
is supposed to work.
You can read an arbitrary text
file, any ASCII file, on command
and extract all of the
words in that file.
You've got to maintain the unique
words, along with the frequency
of occurrence of each word, in a set.
So that set is going
to have to store what?
It's going to have to
store a pair consisting
of a word and a frequency count.
When you read words, you're
going to convert upper case
to lower case before you compare
them, so we don't store "While"
with a capital W and "while" with a
small w as two different words.
We're going to convert
to lower case first.
So "while" is a word, no matter whether
it occurs with a capital or small w,
and if it occurs twice, whichever
of those ways you have it,
it's stored with a count of two in the set.
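Here is one possible sketch of that idea, with std::set standing in for the course's set type. The names Entry, ByWord, ToLower, AddWord and CountOf are hypothetical and are not taken from the assignment. The order predicate looks only at the word, so the count never affects where an entry lives in the set.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical entry type: a word paired with its frequency count.
struct Entry
{
  std::string word;
  std::size_t count;
};

// Orders entries by word only, ignoring the count.
struct ByWord
{
  bool operator()(const Entry& a, const Entry& b) const
  {
    return a.word < b.word;
  }
};

using WordSet = std::set<Entry, ByWord>;

// Converts every character to lower case before any comparison happens.
std::string ToLower(std::string s)
{
  for (char& c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Adds one occurrence of a word. std::set elements are const, so an
// existing entry is erased and re-inserted with the incremented count.
void AddWord(WordSet& words, const std::string& raw)
{
  Entry e{ToLower(raw), 1};
  auto it = words.find(e);
  if (it != words.end())
  {
    e.count = it->count + 1;
    words.erase(it);
  }
  words.insert(e);
}

// Returns the stored count for a word, or 0 if the word is absent.
std::size_t CountOf(const WordSet& words, const std::string& raw)
{
  auto it = words.find(Entry{ToLower(raw), 0});
  return it == words.end() ? 0 : it->count;
}
```

Adding "While" and then "while" leaves a single entry with a count of two.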
And it becomes a little bit
painful dealing with punctuation
and other weird things, and we have
a few rules about how to handle that,
but we're not going to obsess over it.
This assignment, the text is
kind of a normal story type file
with English prose in it, and
things will work out pretty well.
You can of course read any text
file, so you can read, like, programs,
and they're going to look kind
of messy because programs have
so much weird punctuation,
but still it should work.
So that's the reading part.
Then you write an analysis of
the currently stored words.
And that analysis is going to be an ordered
listing of the unique words paired
with the counts of those
words in storage.
So when you read one file into
WordBench and then read another file
into WordBench, it keeps
the counts going.
If you read three files, you've got
three files' worth of words stored
in WordBench, and when you write your
analysis out, it has stored the file names
that have been read in, so the
first thing it's going to say is, okay,
the files are x, y, and z,
and then here's the data.
You can also clear the
data out and start fresh.
And you can also just show a summary,
which goes to the screen.
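The four operations just described (read, write an analysis, clear, summarize) could be sketched as a small class. This is not the assignment's interface: the class name, the method names and the report layout are all assumptions. It also stores counts in a std::map purely to keep the sketch short, whereas the assignment stores word and count pairs in a set, and it reads from streams so it can be exercised without files.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <list>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

class WordBenchSketch
{
public:
  // Reads whitespace-separated words from `in`, lower-casing each one, and
  // remembers `name` as a file that has been read. Counts accumulate
  // across calls, so reading three files yields three files' worth of words.
  void ReadText(const std::string& name, std::istream& in)
  {
    std::string word;
    while (in >> word)
    {
      for (char& c : word)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
      ++counts_[word];
      ++total_;
    }
    files_.push_back(name);
  }

  // Writes the file names first, then each unique word with its count,
  // in sorted order (assumed layout).
  void WriteReport(std::ostream& out) const
  {
    out << "files:";
    for (const std::string& f : files_)
      out << ' ' << f;
    out << '\n';
    for (const auto& kv : counts_)
      out << kv.first << ' ' << kv.second << '\n';
  }

  // Convenience wrapper returning the report as a string.
  std::string Report() const
  {
    std::ostringstream out;
    WriteReport(out);
    return out.str();
  }

  // Discards all words and file names so the next read starts fresh.
  void ClearData()
  {
    counts_.clear();
    files_.clear();
    total_ = 0;
  }

  // The three numbers a summary would show.
  std::size_t FileCount() const { return files_.size(); }
  std::size_t WordCount() const { return total_; }
  std::size_t VocabularySize() const { return counts_.size(); }

private:
  std::map<std::string, std::size_t> counts_;  // stand-in for the word set
  std::list<std::string> files_;               // names of the files read so far
  std::size_t total_ = 0;                      // total words read
};
```

Reading "The cat and the hat" and then "A cat" gives two files, seven words and a vocabulary of five, and the report lists "cat" and "the" with a count of two.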
Briefly, let me tell you
how you implement that:
you use a pair, string and number.
I can see I need to work on this,
and decide if I want to use private
or entry type, so let me
-- since entry type seems
to be winning the votes,
let's call it entry type.
So your private stuff is going to be
two typedefs, and then a third typedef
which you have to choose. The one
we're going to start off with is
fsu::UOVector with the predicate type for the entry type.
Your possible sets that you can base
that on are unimodal ordered vector,
multimodal ordered list, unimodal
ordered list, multimodal ordered vector,
and here's one we'll do
later in the semester, RBLLT.
So any of these are going to be
implementation of the set API,
and can be used as a set type.
So you pick one and then
Wordset is a set of that type,
and then there's a list that contains
all the current files that we've read.
We only have two pieces
of data: your wordset
and your list of files that we've read.
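The private layout just described might look roughly like the sketch below. Standard library types stand in for the fsu ones, and FirstLess, WordBenchLayout, Insert, Size and the member names are hypothetical. Everything else refers only to SetType, so choosing a different set implementation would, under these assumptions, mean changing just the third typedef.

```cpp
#include <cstddef>
#include <list>
#include <set>
#include <string>
#include <utility>

// Hypothetical predicate: compares pairs by their first member only.
template <class P>
struct FirstLess
{
  bool operator()(const P& a, const P& b) const { return a.first < b.first; }
};

class WordBenchLayout
{
private:
  // Two fixed typedefs: the stored entry and the order predicate on entries.
  typedef std::pair<std::string, unsigned long> EntryType;
  typedef FirstLess<EntryType>                  PredicateType;
  // The third typedef is the one you choose; std::set is only a stand-in.
  typedef std::set<EntryType, PredicateType>    SetType;

  SetType                wordset_;  // unique words with their counts
  std::list<std::string> infiles_;  // names of the files read so far

public:
  // Inserts an entry; a set without duplicates ignores a repeated word.
  void Insert(const std::string& word, unsigned long count)
  {
    wordset_.insert(EntryType(word, count));
  }

  void AddFile(const std::string& name) { infiles_.push_back(name); }

  std::size_t Size() const { return wordset_.size(); }
  std::size_t FileCount() const { return infiles_.size(); }
};
```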
[ Background sounds ]
So let's see if we've
got some text files here.
Yeah, I've got data0, 1, 2, 3 and 4.
Just take a look at data3.
Data3, it's just got some
meaningless prose in it.
So I'm going to crank up WordBench, and
I'll hit Summary, and there are no files,
no words, and no vocabulary,
because I haven't read anything yet,
so I need to read this data1.in, and it
reads it and gives me a little summary.
I think s will show a summary, and
what the summary shows is the files
that have currently been read, the
number of words in all those files,
and the total number of distinct words.
So now I'm going to read
data2.in and read
[ Background sounds ]
Get data3 and show a summary.
So I've read now data1.in,
data2.in, data3.in,
the total number of words I've read is
146, and there are 48 distinct words.
So now I'm going to do -- set my menu
here like before -- I'll do file x.x here.
And by the way, you can clear all
the data, and the summary shows it's back
to nothing, and get out of here,
and there is the [inaudible].
Notice it starts out with
the listing of files,
and then it just gives me the words
and the frequency with which they occurred
in all the files [inaudible].
It's clearly just the beginning of an
analytical tool, but this can tell you
if you're overusing a word,
for example, [inaudible].
So anyway, that's the
assignment, and the important thing
about the assignment is you're getting
to use and experiment with the set API,
so you're programming to the set API.
We already have UOVector and
the UOList and the wordsmith.
They are already in the library,
so you don't have to worry
about implementing those sets.
You just use them.
Your next homework assignment will be
essentially providing a new kind of set
to go with wordsmith project.
I wanted to show you
one other thing here.
This is just a functionality test of
sets in general, and this one happens
to be set up as an ordered set, a UOVector,
and the element type in there is capital CHAR.
That's a type that's in your library.
I believe it's capital CHAR.
That's like characters, but they're
considered equal whether they're upper
case or lower case.
So little a and capital A
are considered equal.
And this comes in handy to kind
of illustrate things about sets.
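A rough sketch of how such a case-insensitive character type behaves in a set is shown here. FoldedChar, Fold and InsertChar are hypothetical stand-ins, not the library's CHAR, and std::set stands in for UOVector.

```cpp
#include <cctype>
#include <set>

// Hypothetical stand-in for a character type that ignores case.
struct FoldedChar
{
  char value;
};

// Maps a character to its lower-case code for comparison purposes.
inline int Fold(char c)
{
  return std::tolower(static_cast<unsigned char>(c));
}

// Both comparisons look at the folded value, so 'a' and 'A' are equal.
inline bool operator<(FoldedChar a, FoldedChar b)
{
  return Fold(a.value) < Fold(b.value);
}

inline bool operator==(FoldedChar a, FoldedChar b)
{
  return Fold(a.value) == Fold(b.value);
}

// Inserts c and reports whether the set actually grew.
inline bool InsertChar(std::set<FoldedChar>& s, char c)
{
  return s.insert(FoldedChar{c}).second;
}
```

Inserting 'a' succeeds, and a later insert of 'A' is rejected because the set already holds an equal element.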
So what I'm going to do is insert --
I believe that's a 1 --
I'm going to insert an A.
[ Background sounds ]
Okay, I'm going to have to bail on this.
I can't remember how to work it.