README.md

(Detailed blog post here)

Introduction

This utility computes the stack usage of each function. It is particularly useful in embedded systems programming, where memory is at a premium, and in safety-critical software (blowing up from a stack overflow while you operate medical equipment or fly in space is not exactly optimal).

In detail

The blog post linked above explains in detail how this script works; the executive summary is this:

We expect the source code to be compiled with GCC's -fstack-usage option. This generates .su files, one per compilation unit, that store the stack usage of each function (in isolation). Simply put, file.c compiled with -fstack-usage will create file.o and file.su.
The script can then be launched like so:

checkStackUsage.py binaryELF folderTreeContainingSUfiles

For example, if after the build we have a tree like this:

bin/
    someBinary
src/
    file1.c
    file1.o
    file1.su
    lib1_src/
            lib1.c
            lib1.o
            lib1.su
    lib2_src/
            lib2.c
            lib2.o
            lib2.su

...we run this:

checkStackUsage.py bin/someBinary src/

The script will scan all .su files in the src folder (recursively, at any depth) and collect the standalone use of stack for each function.
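As a rough illustration of this scanning step, here is a minimal sketch (not the script's actual code). It assumes each .su line has the tab-separated form `file.c:line:col:function<TAB>bytes<TAB>qualifier`, which is what GCC emits for plain C code; the helper name `collect_su_entries` is made up for this example.

```python
import os


def collect_su_entries(root):
    """Walk root recursively; return a list of (su_path, function, bytes) tuples.

    Hypothetical helper: assumes tab-separated .su lines of the form
    file.c:line:col:function<TAB>bytes<TAB>qualifier.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".su"):
                continue
            su_path = os.path.join(dirpath, name)
            with open(su_path, encoding="utf-8") as handle:
                for line in handle:
                    fields = line.rstrip("\n").split("\t")
                    if len(fields) < 2 or not fields[1].isdigit():
                        continue  # skip lines that do not fit the assumed format
                    # Split at most three times so the function field stays whole.
                    function = fields[0].split(":", 3)[-1]
                    entries.append((su_path, function, int(fields[1])))
    return entries
```

Keeping the path of the .su file next to each entry matters later on, because the same function name may appear in more than one compilation unit.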

It will then launch the appropriate objdump with option -d to disassemble the code and create the call graph. Simplistically, it detects patterns like this:

<foo>:
    ....
    call <func>

...and proceeds from there to create the entire call graph. It can then accumulate the use of all subordinate calls from each function, and therefore compute its total stack usage.

Output looks like this:

176: foo (foo(16),func(160))
288: func (func(288))
304: bar (bar(16),func(288))
320: main (main(16),bar(16),func(288))

...which means that function foo uses 176 bytes of stack; 16 because of itself, and 160 because it calls func. main uses 320 bytes, etc.
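The accumulation can be sketched as a depth-first walk over the call graph. This is a simplified illustration, not the script's implementation: the function `total_stack` and both dictionaries are hypothetical, repeated function names are left out, and I assume that the deepest call chain is what gets reported and that recursive cycles are simply cut.

```python
def total_stack(function, own_usage, calls, _active=None):
    """Return (total_bytes, chain) for the deepest call chain starting at function."""
    active = _active if _active is not None else set()
    if function in active:
        return 0, []  # recursion: cut here, its depth cannot be known statically
    active.add(function)
    best_total, best_chain = 0, []
    for callee in calls.get(function, ()):
        total, chain = total_stack(callee, own_usage, calls, active)
        if total > best_total:
            best_total, best_chain = total, chain
    active.discard(function)
    own = own_usage.get(function, 0)
    return own + best_total, [(function, own)] + best_chain


# Stand-alone usage per function (as read from .su files) and the call graph.
own_usage = {"main": 16, "bar": 16, "func": 288}
calls = {"main": ["bar"], "bar": ["func"]}

for name in sorted(own_usage, key=lambda n: total_stack(n, own_usage, calls)[0]):
    total, chain = total_stack(name, own_usage, calls)
    chain_text = ",".join(f"{callee}({size})" for callee, size in chain)
    print(f"{total}: {name} ({chain_text})")
```

With these inputs the loop prints the func, bar and main lines in the same format as the output shown above.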

Notice that bar also uses func - but reports a larger stack size for it in that call chain. See the "Repeated functions" section below for the reason; suffice to say, this is one of the few stack checkers that can cope with symbols defined more than once.

Platforms

The script needs to spawn the right objdump. It uses file to detect the ELF signature, and uses appropriate regexes to match disassembly call forms for:

SPARC/LEONs (used in the European Space Agency missions)
x86/amd64 (both 32 and 64 bits)
32-bit ARM

Adding more platforms is very easy - just tell the script what objdump flavor to use, and what regex to scan for to locate the call sequences; relevant code is here.
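To give an idea of what such a per-platform table could look like, here is a hedged sketch. The objdump executable names, the regexes and the helper `build_call_graph` are illustrative assumptions, not copied from the script; the sample text imitates typical `objdump -d` output for x86-64.

```python
import re

# Hypothetical table: platform -> (objdump executable, regex matching direct calls).
PLATFORMS = {
    "x86": ("objdump", re.compile(r"\scall[a-z]*\s+[0-9a-f]+\s+<([^>+]+)>")),
    "arm": ("arm-none-eabi-objdump", re.compile(r"\sblx?\s+[0-9a-f]+\s+<([^>+]+)>")),
    "sparc": ("sparc-rtems-objdump", re.compile(r"\scall\s+[0-9a-f]+\s+<([^>+]+)>")),
}

# Lines such as "0000000000401128 <foo>:" open a new function in the listing.
FUNCTION_HEADER = re.compile(r"^[0-9a-f]+\s+<([^>]+)>:")


def build_call_graph(disassembly, call_regex):
    """Map each function in the disassembly text to the set of functions it calls."""
    graph = {}
    current = None
    for line in disassembly.splitlines():
        header = FUNCTION_HEADER.match(line)
        if header:
            current = header.group(1)
            graph.setdefault(current, set())
            continue
        call = call_regex.search(line)
        if call and current is not None:
            graph[current].add(call.group(1))
    return graph


SAMPLE = """
0000000000401126 <func>:
  401126:\t55                   \tpush   %rbp
  401127:\tc3                   \tret
0000000000401128 <foo>:
  401128:\te8 f9 ff ff ff       \tcall   401126 <func>
  40112d:\tc3                   \tret
"""

print(build_call_graph(SAMPLE, PLATFORMS["x86"][1]))
```

In a layout like this, supporting another platform would amount to adding one more entry to the table.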

Repeated functions

Each function can only have a specific stack usage - right?

Sadly, no :-(

Feast your eyes on moronic - yet perfectly valid - C code like this:

// a.c
static int func() { ...}
void foo() { func(); }

// b.c
static int func() { ...}
void bar() { func(); }

"Houston, we have a problem". While scanning the .su files for a.c and b.c, we find func twice - and because both definitions are static, we want to use the right value for each call (from foo or bar) based on file scope. In effect, the content of the .su files needs to be read so that local static functions take priority when computing stack usage.

Hidden calls

The scanning of objdump output for call sequences is the best we can do; but it's not perfect. For example, any calls made via function pointer indirections are "invisible" to this process.

And since calls via function pointers can do all sorts of shenanigans - e.g. reading the call target from an array of functions via some algorithm - statically deducing which functions are actually called is tantamount to solving the halting problem.

I am open to suggestions on this.
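To make the limitation concrete, the sketch below shows why such calls are invisible: in x86-64 disassembly (AT&T syntax) an indirect call has a register or memory operand prefixed with `*` and no `<symbol>` to match. The regexes and the helper `functions_with_indirect_calls` are assumptions made for this example only; flagging the affected functions does not recover the call targets, it only tells you where the reported numbers may be too low.

```python
import re

FUNCTION_HEADER = re.compile(r"^[0-9a-f]+\s+<([^>]+)>:")
# Direct calls name their target: "call   401126 <func>".
DIRECT_CALL = re.compile(r"\scall[a-z]*\s+[0-9a-f]+\s+<([^>+]+)>")
# Indirect calls only name a register or memory operand: "call   *%rax".
INDIRECT_CALL = re.compile(r"\scall[a-z]*\s+\*")


def functions_with_indirect_calls(disassembly):
    """Return the names of functions that contain at least one indirect call."""
    flagged = set()
    current = None
    for line in disassembly.splitlines():
        header = FUNCTION_HEADER.match(line)
        if header:
            current = header.group(1)
        elif current is not None and INDIRECT_CALL.search(line):
            flagged.add(current)
    return flagged


SAMPLE = """\
0000000000401136 <dispatch>:
  401136:\t48 8b 04 c5 40 40 40 \tmov    0x404040(,%rax,8),%rax
  40113e:\tff d0                \tcall   *%rax
  401140:\tc3                   \tret
"""

print(functions_with_indirect_calls(SAMPLE))
```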

Static Analysis

The script is written in Python - make all will check it with:

flake8 (PEP8 compliance)
pylint (Static Analysis)
...and mypy (static type checking).

All dependencies needed for these checks are installed automatically via pip in a local virtual environment (i.e. under folder .venv) the first time you invoke make all.

Test example

The scenario of repeated functions is tested via make test; on my 64-bit Arch Linux machine I see this output:

$ make test
...
make[1]: Entering directory '/home/ttsiod/Github/checkStackUsage/tests'
==============================================
       176: foo (foo(16),func(160))
       288: func (func(288))
       304: bar (bar(16),func(288))
       320: main (main(16),bar(16),func(288))
==============================================
1. The output shown above must contain 4 lines
2. 'foo' and 'bar' must both be calling 'func'
   *but with different stack sizes in each*.
3. 'main' must be using the largest 'func'
   (i.e. be going through 'bar')
4. The reported sizes must properly accumulate
==============================================
make[1]: Leaving directory '/home/ttsiod/Github/checkStackUsage/tests'

Given the content of a.c, b.c and main.c, the output looks good. Notice that for func we report the maximum of the two (the one reported by GCC inside b.c, when it is called by bar).
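The file-scope lookup that this test exercises can be sketched as follows. This is an assumption-laden illustration rather than the script's code: `resolve_usage` and the `usage_by_unit` dictionary are hypothetical, and I assume we already know which compilation unit the caller belongs to. The sizes are the ones from the test output above.

```python
def resolve_usage(callee, caller_unit, usage_by_unit):
    """Pick the stack usage of callee as seen from the caller's compilation unit."""
    candidates = {
        unit: functions[callee]
        for unit, functions in usage_by_unit.items()
        if callee in functions
    }
    if not candidates:
        return None  # no .su entry at all for this function
    if caller_unit in candidates:
        return candidates[caller_unit]  # a definition in the caller's own unit wins
    return max(candidates.values())  # otherwise stay conservative: largest value


# Stand-alone sizes per .su file, taken from the test output shown above.
usage_by_unit = {
    "a.su": {"func": 160, "foo": 16},
    "b.su": {"func": 288, "bar": 16},
}

print(resolve_usage("func", "a.su", usage_by_unit))  # 160, the func that foo sees
print(resolve_usage("func", "b.su", usage_by_unit))  # 288, the func that bar sees
```

Falling back to the maximum when the caller's unit has no definition of its own is a deliberately pessimistic choice: overestimating stack usage is safer than underestimating it.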

Feedback

If you see something wrong in the script that isn't documented above, questions (and, even better, pull requests) are most welcome.

Thanassis Tsiodras, Dr.-Ing. ttsiodras_at-no-spam_thanks-gmail_dot-com