Amiga.org
Amiga computer related discussion => Amiga/MorphOS/AROS Programmers Forum => Topic started by: asrael22 on May 25, 2017, 07:29:25 PM
-
Hi.
I've written a small cmd line utility on MorphOS.
It deals quite a lot with strings, strcpy, strcat, sprintf, etc.
I compile it with GCC on MorphOS. It runs fine there. Not only on quick tests, but running it for days.
On AmigaOS 3.9 however, the same sources, compiled with VBCC crash on a couple of places.
I could partly narrow it down to allocating dynamic memory instead of using stack data. But I'm not sure if it's that.
I've tried to create isolated tests for specific units. But I can't really figure out what's wrong.
Are there any debugging tools on AmigaOs I could give a try?
Are there any best practices for using stack memory vs. dynamic allocations (say for size of 1 to 32 kB)?
What about sprintf et al., is it safe to use?
Manfred
-
Are there any debugging tools on AmigaOs I could give a try?
MuForce for NULL-pointer references. MuGuardianAngel for out-of-bounds buffer accesses, PatchWork to detect illegal system calls, SegTracker to hunt down where the crashes are. And a serial terminal connected with a null-modem cable, configured at 9600-8-N-1. Or Sashimi to get the "hits" on the terminal.
Are there any best practices for using stack memory vs. dynamic allocations (say for size of 1 to 32 kB)?
An AmigaOs binary typically has only 4K of stack. Thus, whatever object requires a substantial amount of stack, put it on the heap. As a rule of thumb: If it's longer than 256 bytes, allocate it.
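A minimal sketch of that rule of thumb, using the portable malloc()/free() pair rather than any Amiga-specific call. The function name count_uppercased() and the 32 KB work buffer size are made up for illustration only:

```c
#include <stdlib.h>
#include <string.h>

#define WORK_BUFFER_SIZE (32 * 1024) /* far too large for a 4K stack */

/* Hypothetical helper: copies `text` into a 32 KB work buffer taken from
 * the heap, upper-cases ASCII letters in the copy and returns how many
 * characters were changed. Returns -1 if the allocation failed, the text
 * is NULL, or the text would not fit into the work buffer. */
long count_uppercased(const char *text)
{
    char *work;
    size_t len, i;
    long changed = 0;

    if (text == NULL)
        return -1;

    len = strlen(text);
    if (len >= WORK_BUFFER_SIZE)
        return -1;

    /* Instead of "char work[WORK_BUFFER_SIZE];" on the stack: */
    work = malloc(WORK_BUFFER_SIZE);
    if (work == NULL)
        return -1;

    memcpy(work, text, len + 1);
    for (i = 0; i < len; i++) {
        if (work[i] >= 'a' && work[i] <= 'z') {
            work[i] = (char)(work[i] - 'a' + 'A');
            changed++;
        }
    }

    free(work);
    return changed;
}
```

The price is that every exit path after the allocation has to free the buffer, and that the allocation itself can fail and must be checked.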
What about sprintf et al., is it safe to use?
Yes, why shouldn't it? If it is C, I would suggest snprintf, however, as it avoids buffer overruns.
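A small sketch of what that looks like in practice. It assumes a C library whose snprintf() follows C99 (returning the length the complete text would have had); whether your compiler's runtime library does is something to check. The helper name format_pair() is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: formats "name=value" into dst.
 * Returns 1 if the whole text fit into dst, 0 if it was truncated
 * or the arguments were unusable. */
int format_pair(char *dst, size_t dst_size, const char *name, long value)
{
    int needed;

    if (dst == NULL || dst_size == 0 || name == NULL)
        return 0;

    /* Never writes more than dst_size bytes, including the NUL. */
    needed = snprintf(dst, dst_size, "%s=%ld", name, value);

    /* A C99 snprintf() reports the length of the untruncated text. */
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With an 8 byte buffer, format_pair(buf, sizeof(buf), "ab", 12) stores "ab=12" and reports success, whereas a longer name is cut off at 7 characters and reported as truncated instead of overrunning the buffer.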
-
I compile it with GCC on MorphOS. It runs fine there. Not only on quick tests, but running it for days.
Are you sure it still doesn't do something illegal, even though it doesn't crash? I'd check it with the Wipeout program to see if it trashes memory or so.
I'd check it like this:
1) Launch SDK:Tools/logtool
2) Run SDK:Tools/Wipeout
3) Test your program and see if you get anything suspicious to the logtool window
-
An AmigaOs binary typically has only 4K of stack. Thus, whatever object requires a substantial amount of stack, put it on the heap. As a rule of thumb: If it's longer than 256 bytes, allocate it.
OK. What happens if there is not enough stack space available?
I was reading that memory management on AmigaOS doesn't defragment memory and thought that allocations, if there are many over time, will heavily fragment memory. But for small allocations of a few kB there should always be contiguous memory available.
Manfred
-
Are you sure it still doesn't do something illegal, even though it doesn't crash? I'd check it with the Wipeout program to see if it trashes memory or so.
I'd check it like this:
1) Launch SDK:Tools/logtool
2) Run SDK:Tools/Wipeout
3) Test your program and see if you get anything suspicious to the logtool window
Thanks, I'll try those tools.
Manfred
-
OK. What happens if there is not enough stack space available?
The stack will flow into some other object of some other program. That is, anything is possible, from silent data corruption to immediate crashes.
I was reading that memory management on AmigaOS doesn't defragment memory and thought that allocations, if there are many over time, will heavily fragment memory.
Yes. However, for generic C programs, the malloc() implementation of the standard C library will typically use some sort of memory pools to minimize the effect. If you do not want to use malloc(), consider using the memory pools exec provides.
-
Yes. However, for generic C programs, the malloc() implementation of the standard C library will typically use some sort of memory pools to minimize the effect. If you do not want to use malloc(), consider using the memory pools exec provides.
OK. I'm using AllocVec/FreeVec. So I should be using either AllocPooled or malloc (because it does it already).
Thanks,
Manfred
-
OK. What happens if there is not enough stack space available?
Neither Workbench nor the shell will warn you if the program/shell command which you created would use too much stack space. Anything can happen.
Overrunning the allotted stack space produces errors which are extremely hard to diagnose because there may be no directly visible effects. Your program may seem to work correctly, in spite of having had too little stack space to work with.
The side-effects of overwriting the contents of memory which your program did not allocate can be baffling, and you wouldn't know from the effect what exactly the cause was. The next time you start the compiler, it might crash (huh?). Your text editor might hang whilst trying to save a file to disk. The pull-down menus of your text editor may no longer show up. Your system might just crash and hang.
Some of the worst side-effects to diagnose result from trashing the local stack of a function, which includes local variables and the return address of the function. The contents of the local variables may be damaged, which can result in pointers referencing illegal addresses, for example. If the stack is trashed so far that the function's return address is overwritten, then that function will crash upon returning.
-
OK. I'm using AllocVec/FreeVec. So I should be using either AllocPooled or malloc (because it does it already).
The memory debugging tools (Wipeout, etc.) that have been mentioned before will directly hook into the exec.library management functions. You might get a clearer indication of where the memory was trashed than if you were using the 'C' runtime library malloc(), etc. functions which add some overhead and thereby obscure the problem. Note that Wipeout can (if you are lucky) only tell you which memory was overwritten, but not necessarily which piece of code caused it to be overwritten.
My advice would be not to put too much hope into tools such as Wipeout. Memory trashing bugs are extremely difficult to diagnose, not just under AmigaOS.
In my experience the only approach which works is plastering the suspicious code with debug output and tests and stepping through it over and over again until you are confident that it works correctly without misbehaving. That's a lot of work waiting to be done, I can tell you :(
In a situation such as this I would begin by writing new debug code which displays which memory range is affected by the operations to be performed. What's the first byte that will be copied, what's the last one. Next, add code which verifies that this range is correct and only changes memory which your program is permitted to use (e.g. by having allocated it).
If you want to be safe, print that information and add a call to getchar() so that you will have to hit the Return key for the code execution to continue. That way, you will at least get to see the output before the system crashes.
Make sure that you can trust your code to do what you assume it does. When you are copying NUL-terminated strings, you need to be confident that only as much data is being copied as the memory it is copied to will hold. If your code works well with short strings, but not so much with long strings, you'll have to work on your boundary checking.
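The kind of debug code described above could look roughly like this. It is only a sketch: the name checked_copy() and its parameter order are made up, and it assumes you know the size of the destination buffer at the point of the call:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical debug helper: copies `count` bytes from src to dst + offset,
 * but first reports the affected range and refuses to copy if that range
 * does not lie entirely inside the dst_size bytes of the buffer.
 * Returns 1 if the copy was performed, 0 if it was rejected. */
int checked_copy(char *dst, size_t dst_size, size_t offset,
                 const char *src, size_t count)
{
    if (dst == NULL || src == NULL)
        return 0;

    /* Compare by subtraction so that offset + count cannot wrap around. */
    if (offset > dst_size || count > dst_size - offset) {
        fprintf(stderr,
                "REJECTED: %lu bytes at offset %lu do not fit into %lu bytes\n",
                (unsigned long)count, (unsigned long)offset,
                (unsigned long)dst_size);
        return 0;
    }

    /* First byte written and end of the range (exclusive). */
    fprintf(stderr, "copy: %lu bytes to %p..%p (buffer starts at %p, %lu bytes)\n",
            (unsigned long)count, (void *)(dst + offset),
            (void *)(dst + offset + count), (void *)dst,
            (unsigned long)dst_size);

    memcpy(dst + offset, src, count);
    return 1;
}
```

If the machine tends to go down before the output becomes visible, a getchar() call right after the fprintf() gives you the pause mentioned above.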
-
Note that Wipeout can (if you are lucky) only tell you which memory was overwritten, but not necessarily which piece of code caused it to be overwritten.
In a certain other AOs clone where it's easy to modify OS source on demand for debugging purposes you can use some tricks:
- allow triggering of memory wall checks of all allocations at any time when the app calls some easily reachable function (e.g. hack this into AvailMem(MEMF_CLEAR)). Whenever anyone calls AvailMem(MEMF_CLEAR), the memory walls of all allocations are checked for mem trashes.
- use this all over the place in the code to debug.
- for tricky cases modify the exec memory allocation functions to be Disable/Enable protected instead of Forbid/Permit, and then trigger the mem trash check in the Exec task switching routines, to find out under which task the mem trashes happen.
- for other tricky cases use gdb (the OS as a whole runs under gdb, not single tasks or apps) hardware watch points (no slowdown) if you know what is trashed, but not when and by whom. With gdb it is possible to install/uninstall watch points dynamically. For example have a normal breakpoint somewhere at the start of a function and one at the end of a function. Then when hitting first breakpoint have gdb automatically install a hardware watch point on a certain local address and immediately continue running again. When hitting the second breakpoint, have gdb automatically remove hardware watch point, and immediately continue running again. So the thing keeps running and running and only when the hardware watch point triggers the debugger stops.
For AOS itself some of this stuff may be possible to do more easily with an emulator (maybe after enhancing debugging features of emulator some more).
-
- allow triggering of memory wall checks of all allocations at any time when the app calls some easily reachable function (e.g. hack this into AvailMem(MEMF_CLEAR)). Whenever anyone calls AvailMem(MEMF_CLEAR), the memory walls of all allocations are checked for mem trashes.
In certain existing legacy 68K binary versions of AmigaOs, tools like MuGuardianAngel exist that do exactly that. Amongst other things. (-:
- for other tricky cases use gdb (the OS as a whole runs under gdb, not single tasks or apps) hardware watch points (no slowdown) if you know what is trashed, but not when and by whom.
Hardware watchpoints are unfortunately not commonly supported under 68K. Or rather, none of the legacy Motorola processors offer such features. However, some of them come with a MMU which can detect certain out-of-bounds accesses into free memory space.
This is something you can combine with a live or post-mortem debugger. But beware, it's not an easy terrain and source-level debugging is then out of the question.
-
Isn't the Action Replay cartridge able to do HW watchpoint?
Kamelito
-
OK. Another question regarding AllocPooled.
AllocPooled doesn't track the allocated size.
So, what's your advice for how to handle that, say, if the allocation has taken place in some function and the caller is responsible for freeing.
Would I pass an output parameter to the function that should be filled with the allocated size, or?
Manfred
-
You could roll your own AllocVecPooled(), like so:
APTR
AllocVecPooled(void * memory_pool, LONG size)
{
    APTR result = NULL;

    if (size > 0)
    {
        ULONG allocation_size;
        ULONG * allocation;

        /* Reserve room for a size header in front of the caller's memory. */
        allocation_size = sizeof(*allocation) + (ULONG)size;

        allocation = AllocPooled(memory_pool, allocation_size);
        if (allocation != NULL)
        {
            /* Remember the total size so that FreeVecPooled() can pass it on. */
            allocation[0] = allocation_size;

            result = &allocation[1];
        }
    }

    return(result);
}

VOID
FreeVecPooled(void * memory_pool, APTR memory)
{
    if (memory != NULL)
    {
        ULONG * allocation = memory;

        /* Step back to the size header and release the whole block. */
        FreePooled(memory_pool, &allocation[-1], allocation[-1]);
    }
}
However, I would strongly recommend debugging the code which gives you trouble before you consider switching to a different memory management scheme.
-
before you consider switching to a different memory management scheme.
Btw, it sometimes does help to hack memory management functions ~"down" to lowest level, ie. AllocMem, FreeMem. For example AllocPooled -> AllocMem, FreePooled -> FreeMem (ignoring for a moment side effect like DeletePool no longer freeing all allocations still in the pool), but also malloc -> AllocMem or free -> FreeMem.
This helps in case the debugging tool (like the original Mungwall?) otherwise does not see and monitor all the individual allocations made at the higher level, but only the bigger chunks at the lower level, with multiple higher-level allocations embedded in each chunk.
-
You may be able to obtain more information about the problem in your program by replacing the memory allocation functions it uses. Every memory allocation and deallocation must be followed up by a consistency check, but that won't necessarily catch misbehaving code while it's trashing memory beyond an allocation boundary. You can get lucky, though. Figuring out what went wrong becomes a little bit easier, but not that much easier.
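As a rough illustration of such replacement functions, here is a sketch which puts a "wall" of known bytes in front of and behind each allocation and checks both walls on demand and when freeing. It is built on plain malloc()/free() to stay portable; all names (wall_alloc, wall_check, wall_free) and the wall size and fill byte are assumptions, not what MungWall or Wipeout actually use:

```c
#include <stdlib.h>
#include <string.h>

#define HEADER_SIZE 16   /* holds the user size; assumes sizeof(size_t) <= 16 */
#define WALL_SIZE   16
#define WALL_BYTE   0xDB

/* Layout of one block: [header][front wall][user data][back wall] */

void *wall_alloc(size_t size)
{
    unsigned char *raw;

    if (size == 0 || size > (size_t)-1 - (HEADER_SIZE + 2 * WALL_SIZE))
        return NULL;

    raw = malloc(HEADER_SIZE + WALL_SIZE + size + WALL_SIZE);
    if (raw == NULL)
        return NULL;

    memcpy(raw, &size, sizeof(size));
    memset(raw + HEADER_SIZE, WALL_BYTE, WALL_SIZE);
    memset(raw + HEADER_SIZE + WALL_SIZE + size, WALL_BYTE, WALL_SIZE);

    return raw + HEADER_SIZE + WALL_SIZE;
}

/* Returns 1 if both walls are intact, 0 if something wrote outside
 * the user data. */
int wall_check(const void *memory)
{
    const unsigned char *user = memory;
    const unsigned char *raw;
    size_t size, i;

    if (memory == NULL)
        return 1;

    raw = user - (HEADER_SIZE + WALL_SIZE);
    memcpy(&size, raw, sizeof(size));

    for (i = 0; i < WALL_SIZE; i++) {
        if (raw[HEADER_SIZE + i] != WALL_BYTE || user[size + i] != WALL_BYTE)
            return 0;
    }
    return 1;
}

/* Checks the walls, then releases the block. Returns the check result. */
int wall_free(void *memory)
{
    int intact;

    if (memory == NULL)
        return 1;

    intact = wall_check(memory);
    free((unsigned char *)memory - (HEADER_SIZE + WALL_SIZE));
    return intact;
}
```

This only notices damage after the fact, and only damage that lands in a wall, which is exactly the limitation described above.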
You can raise the level of this consistency and boundary checking to the operating system level through MungWall of old or Wipeout, but you will invariably get a lot of noise mixed in with the information you are after. The system will also become a lot slower due to the consistency checking and "wipe after allocate" and "wipe after free" processing that is being done, now that it's happening on the global level and not just in your own program.
One reason why MungWall was written was to catch cases in which memory was used after it was freed, or the contents of memory just allocated were used under the assumption that its contents were well-known. Discovering that memory contents were trashed was a side-effect of this.
Your compiler runtime library may feature built-in memory debugging tools. I specifically built those into my own clib2 (https://github.com/adtools/clib2). True story: not even these tools helped me to track down the one big/tiny bug which kept Samba 2.0.7 (which was the reason why clib2 was created in the first place) from working properly. The bug was in strtok(), and I only managed to fix it back in October 2004. It had been in there for more than two years, and the effects of it misbehaving were not getting caught by the memory debugging tools built into clib2.
-
I could partly narrow it down to allocating dynamic memory instead of using stack data.
It's likely that you've got a buffer overrun and you've changed from corrupting the stack to corrupting the heap & it just happens that it can tolerate the heap becoming corrupt.
You need to put it back so that it crashes and then dismantle the program so that it does less until it stops crashing. There are times when I've not been able to spot my dumb mistakes, so I just throw away the code and start again.
In the 90's I knew someone who went days trying to figure out why he was getting corrupt memory and in the end he had mixed up some strcmp and strcpy calls.
-
It's likely that you've got a buffer overrun and you've changed from corrupting the stack to corrupting the heap & it just happens that it can tolerate the heap becoming corrupt.
Some explanation: "heap" is a Unix term for the memory your program can manage through functions such as malloc() and free(). By comparison, "stack" is storage space available to your program which does not need to be managed by malloc() and free(). You define local variables in your program's functions, and those come from the "stack".
You need to put it back so that it crashes and then dismantle the program so that it does less until it stops crashing. There are times when I've not been able to spot my dumb mistakes, so I just throw away the code and start again.
Easier said than done ;) It's possible that you might just repeat the same mistake again.
There is some value in trying to understand what the problem actually is, although at the time it will feel like some of the worst spent minutes or hours of your life :(
In the 90's I knew someone who went days trying to figure out why he was getting corrupt memory and in the end he had mixed up some strcmp and strcpy calls.
Tell me about it... The 'C' runtime library design is as much a double-edged sword as the language itself. Scores of books have been written about how not to fall into the traps you unknowingly set for yourself by using 'C' (one of the first books being "'C' traps and pitfalls" by Andrew Koenig).
From my experience, the books do have value (e.g. Andrew Koenig's book details defensive programming measures which help you to cope with the side-effects of the language design which are unavoidable), but in the end remembering all the likely and some of the possible pitfalls you should avoid is a burden.
What helps?
Learn how the 'C' language design may trip you up, because while you may be able to build a better set of functions and data structures for managing strings (for example), you cannot do the same with the 'C' language.
The 'C' programming examples and literature you are using to learn 'C' programming might be referencing programming practices which are long obsolete. It makes for more compact example code which is easier to understand, but that comes with a cost. To use functions such as sprintf(), strcpy(), strcat() or memcpy() in your program may seem the obvious choice, but they are not. All of these functions are unsafe to use because you have to be acutely aware of how much data they write to the destination: they don't know when to stop, or why to stop.
I rewrote my "term" application (http://aminet.net/comm/term/term-main.lha) in around 1995 to use snprintf(), strlcpy(), strlcat(), memmove(), etc. It was quite the humbling experience to learn how these changes improved the overall stability of the program. Bugs came to light which were impossible to spot, because they were caused indirectly by buffer overruns and memory corruption.
Finally, it may make sense to write your own little library of functions which do about what the 'C' runtime library takes care of, but which give you more control and insight into how they are being used. For example, the 'C' runtime library performs little to no sanity checking on function parameters. You could write your own functions which do. The 'C' runtime library contains many functions whose parameter order is inconsistent (e.g. the first parameter may be the input, or it may be the last parameter, or the second). You could write your own functions which are more consistent.
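To give an idea of what such a little library might contain, here is a sketch of two string functions with a consistent parameter order (destination, destination size, source) and basic parameter checks. The names str_copy() and str_append() and the status codes are invented for this example:

```c
#include <stddef.h>
#include <string.h>

enum str_status { STR_OK = 0, STR_BAD_ARGS = -1, STR_TRUNCATED = -2 };

/* Copies src into dst, never writing more than dst_size bytes.
 * The result is always NUL-terminated if dst is usable at all. */
int str_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || dst_size == 0)
        return STR_BAD_ARGS;

    if (src == NULL) {
        dst[0] = '\0';
        return STR_BAD_ARGS;
    }

    len = strlen(src);
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return STR_TRUNCATED;
    }

    memcpy(dst, src, len + 1);
    return STR_OK;
}

/* Appends src to the string already stored in dst. */
int str_append(char *dst, size_t dst_size, const char *src)
{
    size_t used;

    if (dst == NULL || dst_size == 0)
        return STR_BAD_ARGS;

    /* Refuse a destination which is not NUL-terminated within bounds. */
    if (memchr(dst, '\0', dst_size) == NULL)
        return STR_BAD_ARGS;

    used = strlen(dst);
    return str_copy(dst + used, dst_size - used, src);
}
```

Because every call reports truncation, the caller can decide whether a cut-off string is acceptable instead of finding out later through corrupted memory.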
There are reasons for sticking with the 'C' runtime library, such as that it is likely to be very well-tested and almost free of bugs. But design decisions made in the 1970s, leading to lack of consistency and lack of sanity checking, are not the kind of "bugs" which a well-tested runtime library will resolve. You might be better off making your own choices.
-
So, thanks for your answers.
I'm now trying to add more fine-grained test options, like letting each .c file run its own main() function with test options.
Hope to have some more time on the WE.
Manfred
-
So I've narrowed it down to this one, which is weird:
#include <stdio.h>
#include <string.h>

char* ConvertString2HexString(char *string, long len, char *outHex) {
    printf("string len: %ld\n", len);
    return string;
}

int main(void) {
    char foo[] = "This is my way to say hello!";
    long len = strlen(foo);
    printf("len of foo: %ld\n", len);
    char hexOut[len*2+1];
    ConvertString2HexString(foo, len, hexOut);
    return 0;
}
This code when compiled on Amiga with VBCC 0.9f gives me this output:
len of foo: 28
string len: 7168
Compiling this on macOS (on some C compiler which is used by Xcode) I get:
len of foo: 28
string len: 28
So, what the heck?
Obviously, when this length is used for real, it goes far beyond the actual string length and overwrites some other memory.
Manfred
-
Compiling this with GCC 2.95 (which doesn't support C99, so I've moved the printf line further down in the source), I'm also getting:
len of foo: 28
string len: 28
So WTF, is there something wrong with VBCC?
-
char foo[] = "This is my way to say hello!";
While correct, it is probably worth noting what it does: a) it creates an array of characters, and b) it copies the string from the right into the array just created. While this might be just what you want, let me note that this is different from
const char *foo = "This is my way to say hello!";
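which omits the copy and just keeps a pointer to the string literal. Note that in C the literal itself has type "char []" (it is "const char []" only in C++), but writing to it is undefined behaviour, so the "const" on the pointer is what you want.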
char hexOut[len*2+1];
This creates a VLA (variable-length array), a language element introduced with C99. Not all compilers support VLAs; most notably SAS/C does not, as it only supports C90. You also place a declaration in the middle of a statement block, which is another C99 addition.
If you want to be compatible with C90, the only option is to allocate the object dynamically:
hexOut = malloc(len * 2 + 1);
and declare "hexOut" as "char *hexOut" along with "foo" above.
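Put together, a C90-style version might look like the sketch below. Note the assumptions: the function in the thread is only a stub which prints the length, so the hex conversion body here is made up, the input parameter has been made "const", the function returns the output buffer rather than the input, and the wrapper HexStringOf() is a hypothetical name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical implementation: writes two hex digits per input byte plus
 * a terminating NUL into outHex, which must hold len * 2 + 1 bytes. */
char *ConvertString2HexString(const char *string, long len, char *outHex)
{
    static const char digits[] = "0123456789ABCDEF";
    long i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)string[i];

        outHex[i * 2]     = digits[c >> 4];
        outHex[i * 2 + 1] = digits[c & 0x0F];
    }
    outHex[len * 2] = '\0';

    return outHex;
}

/* C90-style caller: all declarations at the top of the block, and the
 * output buffer comes from malloc() instead of a VLA.
 * Returns NULL if the allocation failed; the caller must free() the result. */
char *HexStringOf(const char *text)
{
    long len;
    char *hexOut;

    len = (long)strlen(text);

    hexOut = malloc((size_t)len * 2 + 1);
    if (hexOut != NULL)
        ConvertString2HexString(text, len, hexOut);

    return hexOut;
}
```

Unlike the VLA, the malloc() result has to be checked and released, but the code no longer depends on how well a particular compiler handles C99.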
So, what the heck?
Well, as you just found out, VBCC does not support VLAs correctly, even less so if they are defined in the middle of a function block. Yes, it's a violation of the C99 standard, so a compiler defect. However, as stated above, I would avoid C99 features in the first place in Amiga programs, as only few compilers support them, and even fewer support them correctly. (-:
What most likely happened is that VBCC got completely lost in its stack frame, i.e. the stack offset of "len" changed in the middle of the function, apparently an effect the code generator did not properly take care of.
Stuff happens if you depend on exotic language features.
-
While correct, it is probably worth noting what it does: a) it creates an array of characters, and b) it copies the string from the right into the array just created. While this might be just what you want, let me note that this is different from
const char *foo = "This is my way to say hello!";
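which omits the copy and just keeps a pointer to the string literal. Note that in C the literal itself has type "char []" (it is "const char []" only in C++), but writing to it is undefined behaviour, so the "const" on the pointer is what you want.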
I just want a string. Then I guess the pointer version is better suited.
This was just to create a snippet to reproduce the error.
In the real version that string comes from the serial device.
My last pure C programming was more than 10 years ago. Since then I have developed in higher-level languages.
So I have to adapt a bit to the C details.
Well, as you just found out, VBCC does not support VLAs correctly, even less so if they are defined in the middle of a function block. Yes, it's a violation of the C99 standard, so a compiler defect. However, as stated above, I would avoid C99 features in the first place in Amiga programs, as only few compilers support them, and even fewer support them correctly. (-:
What most likely happened is that VBCC got completely lost in its stack frame, i.e. the stack offset of "len" changed in the middle of the function, apparently an effect the code generator did not properly take care of.
Stuff happens if you depend on exotic language features.
I guess I'll report that to Frank Wille and/or Volker Barthelmann.
VBCC only supports a subset of C99, maybe I ran into something that isn't supported.
Since I'm now basically backporting this program from MorphOS, where you have GCC 4 and 5, it might be a good idea to stick to C90 to keep the code as compatible as possible between the AmigaOS flavours.
Manfred
-
I just want a string. Then I guess the pointer version is better suited.
Both are "strings" (C doesn't really have strings), though the first is a "char []", hence can be modified, whereas the second is a "const char *", and hence cannot be modified (note the "const"). Or rather, "should not be modified", and if you try, strange and wonderful things may happen. On the Amiga, most likely nothing will happen, though, and you probably will not notice the difference. It's more a "Mr. Language Lawyer" argument on this particular machine. On Linux, you'll get a core dump, though, as soon as you try to write to the const array (aka "string").
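A short sketch of the difference, with made-up function names (capitalise, literal_demo). The array is a private copy which may be changed; the pointer only refers to the literal, and the "const" lets the compiler object if you try to write through it:

```c
#include <stddef.h>
#include <string.h>

/* Upper-cases the first letter of a modifiable string in place. */
void capitalise(char *text)
{
    if (text != NULL && text[0] >= 'a' && text[0] <= 'z')
        text[0] = (char)(text[0] - 'a' + 'A');
}

/* Hypothetical demo: writes "<array>/<pointer>" into out.
 * Returns 1 on success, 0 if out is NULL or too small. */
int literal_demo(char *out, size_t out_size)
{
    char array_version[] = "hello";         /* writable copy, 6 bytes */
    const char *pointer_version = "hello";  /* points at the literal */
    size_t needed;

    capitalise(array_version);              /* fine: our own storage */

    /* capitalise(pointer_version) would discard the "const" qualifier,
     * which a compiler is expected to complain about. */

    needed = strlen(array_version) + 1 + strlen(pointer_version) + 1;
    if (out == NULL || out_size < needed)
        return 0;

    /* The size was verified above, so these cannot overrun out. */
    strcpy(out, array_version);
    strcat(out, "/");
    strcat(out, pointer_version);
    return 1;
}
```

Only the array copy is changed by capitalise(); the text reached through the pointer stays exactly as written in the source.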
I guess I'll report that to Frank Wille and/or Volker Barthelmann.
Please do. In this particular case, VLAs are probably a hassle to support since the stack offsets of the objects with function scope are no longer constant, so it's easy to understand why that fails.
-
char hexOut[len*2+1];
It looks like a compiler bug. One of the compilers I have to support didn't work with variable length arrays until very recently; I still avoid them.
I never used vbcc, I always stuck to gcc 2.95 and sas/c. But vbcc is still supported, so you may be able to get it fixed. It's a pity there isn't a modern gcc or clang build targeting 68k.
"Please support vbcc by contacting the authors if you find any bugs or problems. Supporting eight different architectures makes testing extremely time consuming, so this release is probably not free of bugs.
For problems with the compiler core contact Dr. Volker Barthelmann (vbemail), and for Amiga/Atari-specific problems, including assembler, linker, startup-codes and linker-libraries, contact Frank Wille (fwemail)."
-
Thanks, I'll try those tools.
Manfred
Hi all!
Me too :)
Also, I had compiled a list of other tools for debugging/preventing memory problems in C here (http://www.chingu.asia/wiki/index.php?title=Tracking+and+fixing+memory&from=Programming). In particular, I love Fortify (http://aminet.net/package/dev/c/fortify22) and use it in all my programs now, it is really excellent at catching (some) problems!