Author Topic: My C homework  (Read 15959 times)

Offline Jupp3

  • Sr. Member
  • ****
  • Join Date: Mar 2002
  • Posts: 364
    • http://jupp3.amigafin.org
Re: Mel's C homework
« Reply #29 on: February 26, 2007, 01:23:26 PM »
Quote
if (whitespace == 1)

If 0 and 1 (non-zero) are the only values you need to distinguish, you can always just write if(whitespace), which in this case is pretty much equivalent to if(whitespace==1). On many (most?) CPUs it's faster and more compact to check whether a value is zero than to check whether it equals some other value. Basically this is also true for if(a==0), but many compilers can probably optimize that to if(!a).

As you have probably already noticed, this is often used as a way of error checking. Consider result=malloc(size); if the allocation failed, you get a NULL pointer. Personally, I use the same approach in my own code whenever possible. Another variant is to use zero for success and a non-zero value as an error code.
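
As a rough illustration of both conventions, here is a minimal sketch. The names flag_is_set() and store_copy() are made up for this example and appear nowhere else in the thread. Note that if(whitespace) is true for any non-zero value, so it only matches a comparison against 1 when the variable never holds anything but 0 or 1.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the flag holds any non-zero value. */
int flag_is_set(int whitespace)
{
    if (whitespace) {       /* true for 1, but also for 2, -1, ... */
        return 1;
    }
    return 0;
}

/* Hypothetical helper: copies text into freshly allocated memory.
   Returns 0 on success and a non-zero error code on failure. */
int store_copy(const char *text, char **out)
{
    size_t size = strlen(text) + 1;
    char *result = malloc(size);

    if (!result) {          /* malloc() returns NULL when it fails */
        return 1;
    }
    memcpy(result, text, size);
    *out = result;
    return 0;
}
```

A caller would then write something like if (store_copy(text, &copy)) { /* handle the error */ } and free() the copy once it is no longer needed.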

Also don't forget to avoid doing for(i=0; i
 

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16879
  • Country: gb
  • Thanked: 5 times
Re: Mel's C homework
« Reply #30 on: February 26, 2007, 01:26:08 PM »
@falemagn

Even though the stdio buffers the file, the overhead of repeatedly calling fgetc() can make a difference compared to iterating over an array of characters that you effectively have direct access to. You might notice the benefit on an 030 IDE system like Mel's.

Of course, this is only if you wanted to optimize the code as much as possible - but then, a good implementation might have an inlinable version of fgetc().
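
A sketch of what iterating over a buffer could look like, assuming the file is read in fixed-size blocks with fread(). CHUNK_SIZE and both function names are invented for this example, and no claim is made about how much faster it is on any particular machine.

```c
#include <ctype.h>
#include <stdio.h>

#define CHUNK_SIZE 4096   /* assumed block size, pick whatever suits */

/* Counts the words that start in one block of memory. *inword carries
   the "currently inside a word" state from one block to the next. */
int count_words_in_buffer(const unsigned char *buffer, size_t size, int *inword)
{
    int numwords = 0;
    size_t i;

    for (i = 0; i < size; i++) {
        if (isalnum(buffer[i])) {
            if (!*inword) {
                *inword = 1;
                numwords++;      /* a new word starts here */
            }
        }
        else {
            *inword = 0;
        }
    }
    return numwords;
}

/* Reads the file a block at a time instead of a character at a time. */
int count_words_chunked(FILE *file)
{
    unsigned char buffer[CHUNK_SIZE];
    size_t got;
    int numwords = 0;
    int inword = 0;

    while ((got = fread(buffer, 1, sizeof buffer, file)) > 0) {
        numwords += count_words_in_buffer(buffer, got, &inword);
    }
    return numwords;
}
```

Passing the inword flag between calls matters because a word may straddle the boundary between two blocks and must only be counted once.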
int p; // A
 

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16879
  • Country: gb
  • Thanked: 5 times
Re: Mel's C homework
« Reply #31 on: February 26, 2007, 01:40:41 PM »
Quote

falemagn wrote:

One thing I would suggest, instead, is to use ctype.h's isspace() function (or macro, depending on its implementation).


Actually, I think the idea of differentiating "word" characters from "non-word" characters would be more effective. You'd probably want to use isalnum() for that one as it only returns non-zero for letters or numbers.
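
To show how the two approaches differ, here is a small sketch that counts words in a string with a pluggable character test. count_words() and notspace() are names invented for this example.

```c
#include <ctype.h>

/* Hypothetical classifier: anything that is not whitespace is in a word. */
int notspace(int c)
{
    return !isspace(c);
}

/* Counts words in a string. isword decides which characters
   belong to a word, e.g. isalnum or notspace. */
int count_words(const char *text, int (*isword)(int))
{
    int numwords = 0;
    int inword = 0;

    for (; *text != '\0'; text++) {
        /* the ctype functions expect an unsigned char value (or EOF) */
        if (isword((unsigned char)*text)) {
            if (!inword) {
                inword = 1;
                numwords++;
            }
        }
        else {
            inword = 0;
        }
    }
    return numwords;
}
```

With the text "well-known fact, really", count_words(text, isalnum) gives 4 while count_words(text, notspace) gives 3, because the space-based test keeps "well-known" together and treats the comma as part of "fact,". The isalnum() test has its own quirk: it splits "don't" into two words.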
 

Offline falemagn

  • Sr. Member
  • ****
  • Join Date: May 2002
  • Posts: 269
    • http://www.aros.org/
Re: Mel's C homework
« Reply #32 on: February 26, 2007, 02:14:18 PM »
Quote

Karlos wrote:
@falemagn

Even though the stdio buffers the file, the overhead of repeatedly calling fgetc() can make a difference compared to iterating over an array of characters that you effectively have direct access to. You might notice the benefit on an 030 IDE system like Mel's.


The standard allows getc() to be implemented as a macro, and it often is. Otherwise getc() is equivalent to fgetc().
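
One practical consequence, sketched below: because getc() may be a macro that evaluates its argument more than once, the stream argument should be a plain expression without side effects. count_bytes() is a name made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in a stream. */
long count_bytes(FILE *file)
{
    long total = 0;
    int character;   /* int rather than char, so EOF stays distinguishable */

    /* file is a plain variable here. Something like getc(*fp++) would be
       risky, since a macro version may evaluate its argument twice. */
    while ((character = getc(file)) != EOF) {
        total++;
    }
    return total;
}
```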

 

Offline falemagn

  • Sr. Member
  • ****
  • Join Date: May 2002
  • Posts: 269
    • http://www.aros.org/
Re: Mel's C homework
« Reply #33 on: February 26, 2007, 02:15:35 PM »
Quote

Karlos wrote:
Quote

falemagn wrote:

One thing I would suggest, instead, is to use ctype.h's isspace() function (or macro, depending on its implementation).


Actually, I think the idea of differentiating "word" characters from "non-word" characters would be more effective. You'd probably want to use isalnum() for that one as it only returns non-zero for letters or numbers.


I agree with you, but I was merely suggesting a way to do without the 'case' construct she used.
 

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16879
  • Country: gb
  • Thanked: 5 times
Re: Mel's C homework
« Reply #34 on: February 26, 2007, 02:58:37 PM »
@falemagn

Quote
I was merely suggesting a way to do without the 'case' construct she used.


I agree that's the way forward, but I also thought the case construct showed an excellent understanding of the underlying logic (for someone not acquainted with all the features of the C standard library), which is what I was looking for when I set the exercise :-)
 

Offline mel_zoom (Topic starter)

  • Full Member
  • ***
  • Join Date: Jan 2007
  • Posts: 231
Re: Mel's C homework
« Reply #35 on: February 26, 2007, 08:48:43 PM »
I don't think this is as clean as it probably could be, but it seems to work.

Code: [Select]

/*

  Simple word count program stage 2

  counts words in text file passed as argument
  presents summary of word lengths

*/

#include <stdio.h>
#include <ctype.h>

/* prototype */
int countwords(FILE* file, int* lengths);

/* main program */
int main(int argc, char** argv)
{
    const char* filename = 0;
    FILE* file = 0;
    int numwords = 0;
    int lengths[9] = {0};
    int n;

    /* get the filename from the command line */
    if (argc<2) {
        puts("Usage: wordcount <file name>");
        return 1;
    }

    /* try to open the file */
    filename = argv[1];
    file = fopen(filename, "r");
    if (file == NULL) {
        printf("Error: couldn't open file '%s' for input\n", filename);
        return 1;
    }

    /* count and display */
    numwords = countwords(file, lengths);
    printf("Counted a total of %d word(s) in file '%s'\n", numwords, filename);

    printf("Words <  4 chars: %d\n", lengths[0]);
    /* buckets 1-7 hold the lengths 4...10 */
    for (n=1; n<8; n++) {
        printf("Words of %d chars: %d\n", n+3, lengths[n]);
    }
    printf("Words > 10 chars: %d\n", lengths[8]);

    /* all done */
    fclose(file);
    return 0;
}

int countwords(FILE* file, int* lengths)
{
    int numwords = 0;
    int notinaword = 1;
    int length = 0;
    int character;

    while( (character = fgetc(file)) != EOF ) {
        if (isalnum(character)) {
            /* character is in a word so increment the length */
            if (notinaword) {
                notinaword = 0;
            }
            length++;
        }
        else {
            /* if we just got here then update our counters */
            if (!notinaword) {
                notinaword = 1;
                numwords++;

                /* fill appropriate length bucket */
                if (length<4) {
                    /* for lengths < 4 fill bucket 0 */
                    lengths[0]++;
                }
                else if (length<11) {
                    /* for lengths 4...10 fill buckets 1-7 */
                    lengths[length-3]++;
                } else {
                    /* for lengths > 10 fill bucket 8 */
                    lengths[8]++;
                }
                length = 0;
            }
        }
    }
    return numwords;
}


No switch-case either :-)
I love my MX5!
Please pay a visit
 

Offline falemagn

  • Sr. Member
  • ****
  • Join Date: May 2002
  • Posts: 269
    • http://www.aros.org/
Re: Mel's C homework
« Reply #36 on: February 27, 2007, 04:59:50 PM »
Mel,

it looks good, but you don't really need the notinaword variable. You can recode the countwords function like this:

Code: [Select]

int countwords(FILE* file, int* lengths)
{
    int numwords = 0;
    int length   = 0;
    int character;

    while ((character = getc(file)) != EOF) {
        if (isalnum(character)) {
            /* character is in a word so increment the length */
            length++;
        }
        else
        if (length > 0) {
            /* if we just got here then update our counters */
            numwords++;

            /* fill appropriate length bucket */
            if (length < 4) {
                /* for lengths < 4 fill bucket 0 */
                lengths[0]++;
            }
            else
            if (length < 11) {
                /* for lengths 4...10 fill buckets 1-7*/
                lengths[length - 3]++;
            }
            else {
                /* for lengths > 10 fill bucket 8*/
                lengths[8]++;
            }
           
            length = 0;
        }
    }
   
    return numwords;
}


Notice also the use of getc instead of fgetc: it's likely to make the code faster, since the implementation can choose to implement it as an inline function or macro that accesses the buffering data structures directly.
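
One edge case that may be worth checking with a loop of this shape: a word is only recorded when a non-word character follows it, so a file whose last word runs straight into EOF would lose that word. A possible sketch of a variant that handles this is below. record_word() and countwords_to_eof() are names invented here, and this is an assumption about the desired behaviour rather than part of the exercise as set.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: adds one word of the given length to the
   9-bucket table used in this thread (< 4, 4...10, > 10). */
void record_word(int length, int *lengths)
{
    if (length < 4) {
        lengths[0]++;
    }
    else if (length < 11) {
        lengths[length - 3]++;
    }
    else {
        lengths[8]++;
    }
}

int countwords_to_eof(FILE *file, int *lengths)
{
    int numwords = 0;
    int length   = 0;
    int character;

    while ((character = getc(file)) != EOF) {
        if (isalnum(character)) {
            length++;
        }
        else if (length > 0) {
            numwords++;
            record_word(length, lengths);
            length = 0;
        }
    }

    /* a word is still pending if the file did not end with a separator */
    if (length > 0) {
        numwords++;
        record_word(length, lengths);
    }

    return numwords;
}
```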
 

Offline mel_zoom (Topic starter)

  • Full Member
  • ***
  • Join Date: Jan 2007
  • Posts: 231
Re: Mel's C homework
« Reply #37 on: February 27, 2007, 07:37:03 PM »
Hmm.

I hit a snag. It seems isalnum() only returns non-zero for the characters 0...9 a...z A...Z

When the source text file contains words that might have some European characters it causes the program to think one word is several.

I think perhaps the logic should be that it looks for characters that are not space and are not punctuation?
 

Offline mel_zoom (Topic starter)

  • Full Member
  • ***
  • Join Date: Jan 2007
  • Posts: 231
Re: Mel's C homework
« Reply #38 on: February 27, 2007, 07:44:37 PM »
Ok that didn't work either :-(

:idea:

Since I already know how to load a file character by character, how hard can it be to write a replacement isword()  that uses a table built by reading a different text file?

If there is a table with room for all 256 codes - all set to zero at first - you could load the "character set" file and fill in a 1 for every character found in that file.

Or perhaps the file only need contain extra characters not already recognised by filling the table with the isalnum() for each character first...
 

Offline Piru

  • ' union select name,pwd--
  • Hero Member
  • *****
  • Join Date: Aug 2002
  • Posts: 6946
    • http://www.iki.fi/sintonen/
Re: Mel's C homework
« Reply #39 on: February 27, 2007, 08:04:37 PM »
Quote
When the source text file contains words that might have some European characters it causes the program to think one word is several.

Assuming the character set is ISO 8859-1 (Latin-1), here is a routine that should work relatively well:
Code: [Select]

int isalpha_int(unsigned char c)
{
        c &= ~0x20;
        return (c >= 'A' && c <= 'Z') ||
               (c >= 192 && c <= 222 && c != 215);
}
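
A quick way to see which codes the routine accepts is sketched below. The routine is repeated so the example stands alone, and count_letter_codes() is a helper invented for this sketch. Clearing bit 0x20 folds lower case onto upper case in both the ASCII and the Latin-1 letter ranges. 215 is the multiplication sign, and the division sign (247) folds onto it, so both are rejected. The sharp s (223) and y with diaeresis (255) are rejected too, which fits the "relatively well" above.

```c
/* The routine from the post, repeated so the sketch is self-contained. */
int isalpha_int(unsigned char c)
{
        c &= ~0x20;
        return (c >= 'A' && c <= 'Z') ||
               (c >= 192 && c <= 222 && c != 215);
}

/* Hypothetical helper: how many of the 256 codes count as letters. */
int count_letter_codes(void)
{
        int c;
        int count = 0;

        for (c = 0; c < 256; c++) {
                if (isalpha_int((unsigned char)c)) {
                        count++;
                }
        }
        return count;
}
```

Note that the routine only recognises letters, so digits would need a separate test if it were to stand in for isalnum().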
 

Offline mel_zoom (Topic starter)

  • Full Member
  • ***
  • Join Date: Jan 2007
  • Posts: 231
Re: Mel's C homework
« Reply #40 on: February 27, 2007, 08:52:38 PM »
Ok here is the final version for stage 2 - I'm not planning to do anything more to it for now.

I improved it by moving more of the code from main() into functions and I've made a custom isword() that's really a macro. It behaves exactly like isalnum() except that it uses a table that you can add characters to by specifying a character set file as the second parameter on the shell.

I jumped ahead of myself a bit using a macro for that but it seemed easy enough for this exercise.

I also redid the count loop the way falemagn suggested. His modification seems obvious in hindsight. I guess I have a while to go yet :-)

Code: [Select]

/*

  Simple word count program stage 2 revised

  counts words in text file passed as argument
  presents summary of word lengths

  uses custom character list for determining words

*/

#include <stdio.h>
#include <stdlib.h>   /* for exit() */
#include <ctype.h>

/* data */
static char chartypetable[256] = {0};

/* prototypes */
void maketypetable(const char* charset);
int countwords(const char* filename, int* lengths);
void printsummary(const char* filename, int numwords, int* lengths);

#define isword(c) (chartypetable[(c)])


/* main program */
int main(int argc, char** argv)
{
    const char* filename = 0;
    const char* charsetfilename = 0;
    int numwords = 0;
    int lengths[9] = {0};

    /* get the filename from the command line */
    if (argc<2) {
        puts("Usage: wordcount <file name> [character set]");
        return 1;
    }
    filename = argv[1];

    /* get any additional character set filename */
    if (argc>2) {
        charsetfilename = argv[2];
    }

    /* make our character table */
    maketypetable(charsetfilename);

    /* count and show summary */
    numwords = countwords(filename, lengths);
    printsummary(filename, numwords, lengths);

    return 0;
}



void printsummary(const char* filename, int numwords, int* lengths)
{
    int n;
    printf("Counted a total of %d word(s) in file '%s'\n", numwords, filename);
    printf("Words <  4 chars: %d\n", lengths[0]);
    /* buckets 1-7 hold the lengths 4...10 */
    for (n=1; n<8; n++) {
        printf("Words of %d chars: %d\n", n+3, lengths[n]);
    }
    printf("Words > 10 chars: %d\n", lengths[8]);
}



int countwords(const char* filename, int* lengths)
{
    int numwords = 0;
    int length = 0;
    int character;
    FILE* file;

    file = fopen(filename, "r");
    if (!file) {
        printf("Error: couldn't open file '%s' for input\n", filename);
        exit(1);
    }

    while( (character = getc(file)) != EOF ) {
        if (isword(character)) {
            /* character is in a word so increment the length */
            length++;
        }
        else {
            /* if we just got here then update our counters */
            if (length>0) {
                numwords++;

                /* fill a range bucket based on the word length */
                if (length<4) {
                    /* for lengths < 4 fill bucket 0 */
                    lengths[0]++;
                }
                else if (length<11) {
                    /* for lengths 4...10 fill buckets 1-7 */
                    lengths[length-3]++;
                } else {
                    /* for lengths > 10 fill bucket 8 */
                    lengths[8]++;
                }
                length = 0;
            }
        }
    }

    fclose(file);
    return numwords;
}



void maketypetable(const char* charset)
{
    /*
        make own character type table
    */
    int c;

    /* first fill with isalnum() */
    for (c=0; c<256; c++) {
        if (isalnum(c)) {
            chartypetable[c]=1;
        }
    }

    /* try to add any user defined characters too - excluding spaces */
    if (charset) {
        FILE* file = fopen(charset, "r");
        if (file) {
            while ( (c = getc(file)) != EOF ) {
                if (!isspace(c)) {
                    chartypetable[c] = 1;
                }
            }
            fclose(file);

            /* show the recognised "isword()" character set */
            printf("Added character set definition from file '%s'\nThe following characters will be treated as valid within words:\n", charset);
            for (c=0; c<256; c++) {
                if (chartypetable[c]) {
                    printf("%c ", c);
                }
            }
            putchar('\n');
        }
        else {
            printf("Couldn't open character set definition file '%s'\n", charset);
        }
    }
    printf("\n");
}


I think that's enough for one evening!
 

Offline AmireX

  • Jr. Member
  • **
  • Join Date: May 2005
  • Posts: 72
Re: Mel's C homework
« Reply #41 on: February 27, 2007, 09:51:19 PM »
Quote

mel_zoom wrote:
Code: [Select]

/* fill a range bucket based on the word length*/
if (length<4) {
/* for lengths < 4 fill bucket 0 */
lengths[0]++;
}
else if (length<11) {
/* for lengths 4...10 fill buckets 1-7*/
lengths[length-3]++;
} else {
/* for lengths > 10 fill bucket 8*/
lengths[8]++;




Not even faster, but more compact :-)
Code: [Select]

/* fill a range bucket based on the word length*/
                                length = min(max(length-3,0),8);
                                lengths[length]++;


with
Code: [Select]

#define min(a, b)  (((a) < (b)) ? (a) : (b))
#define max(a, b)  (((a) > (b)) ? (a) : (b))
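
A small sketch to check that the clamp picks the same bucket as the if/else chain it replaces. bucket_clamped() and bucket_chained() are names invented for this comparison. One caveat with macros like these: each argument is evaluated more than once, so something like min(i++, 8) should be avoided.

```c
#define min(a, b)  (((a) < (b)) ? (a) : (b))
#define max(a, b)  (((a) > (b)) ? (a) : (b))

/* Bucket index via the clamp: length-3 forced into the range 0...8. */
int bucket_clamped(int length)
{
    return min(max(length - 3, 0), 8);
}

/* Bucket index via the original if/else chain. */
int bucket_chained(int length)
{
    if (length < 4) {
        return 0;
    }
    else if (length < 11) {
        return length - 3;
    }
    return 8;
}
```

For example, a length of 1 lands in bucket 0, a length of 10 in bucket 7, and a length of 11 or more in bucket 8 with either function.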

Regards, ReX
 

Offline falemagn

  • Sr. Member
  • ****
  • Join Date: May 2002
  • Posts: 269
    • http://www.aros.org/
Re: Mel's C homework
« Reply #42 on: February 28, 2007, 08:45:35 AM »
Quote

mel_zoom wrote:
Hmm.

I hit a snag. It seems isalnum() only returns non-zero for the characters 0...9 a...z A...Z


That's because the locale is set to the default "C". Libc's setlocale() function will change that behaviour, provided the underlying implementation supports it. AmigaOS' locale.library provides equivalent functionality, but then it's not ANSI C anymore.
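
A minimal sketch of how this could be checked, assuming a hosted implementation with <locale.h>. count_alnum_codes() is a name invented for this example. Passing "C" selects the default locale and "" asks for the user's environment locale. Which other locale names exist is platform-specific, so none is assumed here.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: switches the character-classification locale and
   counts how many of the 256 codes isalnum() accepts in it.
   Returns -1 if the requested locale is not available.
   Note that this changes LC_CTYPE for the whole program. */
int count_alnum_codes(const char *localename)
{
    int c;
    int count = 0;

    if (setlocale(LC_CTYPE, localename) == NULL) {
        return -1;
    }
    for (c = 0; c < 256; c++) {
        if (isalnum(c)) {
            count++;
        }
    }
    return count;
}
```

In the "C" locale this gives 62 (ten digits plus 52 letters). After setlocale(LC_CTYPE, "") the count may be higher if the environment selects a single-byte locale such as a Latin-1 one, but that depends entirely on the system.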
 

Offline falemagn

  • Sr. Member
  • ****
  • Join Date: May 2002
  • Posts: 269
    • http://www.aros.org/
Re: Mel's C homework
« Reply #43 on: February 28, 2007, 08:49:08 AM »
Quote

AmireX wrote:
Not even faster, but more compact :-)
Code: [Select]

    /* fill a range bucket based on the word length*/
    length = min(max(length-3,0),8);
    lengths[length]++;

with
Code: [Select]

    #define min(a, b)  (((a) < (b)) ? (a) : (b))
    #define max(a, b)  (((a) > (b)) ? (a) : (b))

Regards, ReX


If she's using gcc, then the following might be a better choice (case ranges are a GNU extension, not standard C):

Code: [Select]

    /* fill appropriate length bucket */
    switch (length)
    {
        case 1 ... 3:  lengths[0]++;        break;
        case 4 ... 10: lengths[length-3]++; break;
        default:       lengths[8]++;
    }
 

Offline motorollin

  • Hero Member
  • *****
  • Join Date: Nov 2005
  • Posts: 8669
Re: Mel's C homework
« Reply #44 on: February 28, 2007, 08:56:40 AM »
Every time I see the subject line of this thread I think it says "Mel C's homework" :lol:

Girl Power!

--
moto
Code: [Select]
10  IT'S THE FINAL COUNTDOWN
20  FOR C = 1 TO 2
30     DA-NA-NAAAA-NAAAA DA-NA-NA-NA-NAAAA
40     DA-NA-NAAAA-NAAAA DA-NA-NA-NA-NA-NA-NAAAAA
50  NEXT C
60  NA-NA-NAAAA
70  NA-NA NA-NA-NA-NA-NAAAA NAAA-NAAAAAAAAAAA
80  GOTO 10