Skip to content

Conversation

@fjmolinas
Copy link
Contributor

@fjmolinas fjmolinas commented May 7, 2019

Contribution description

This PR modifies xtimer_usleep automated test to recognize negative offset with respect to target sleep time.

When reviewing #11044 I realized tests/xtimer_usleep wasn't passing because the python script was not recognizing negative integers as acceptable values for the timer offset, this is done now. was failing because of a pexpect timeout and not because of negative offsets (sleeping less than the specified time).

Testing procedure

With frdm-kw41z or usb-kw41z (or any board witch would produce negative offset with respect to the target usleep time)

→ make -C tests/xtimer_usleep BOARD=usb-kw41z flash test

Failed without this PR, succeeds with it. Failes correctly with this PR

Issues/PRs references

@fjmolinas fjmolinas added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: tests Area: tests and testing framework labels May 7, 2019
@fjmolinas fjmolinas requested review from aabadie and cladmi May 7, 2019 06:54
@kaspar030 kaspar030 added CI: run tests If set, CI server will run tests on hardware for the labeled PR CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels May 7, 2019
cladmi
cladmi previously requested changes May 7, 2019
Copy link
Contributor

@cladmi cladmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is is not the test, the test is correct.
When we say sleep 10us it should sleep at least 10s 10us.

There is an issue in sys/include/xtimer/tick_conversion.h where for some configuration if you say, give me the number of ticks for 1s it returns less than enough ticks for 1s so sleeps 0.9something which is wrong.

@fjmolinas
Copy link
Contributor Author

Aha!, thanks for the clarification @cladmi. When you say there is an issue with sys/include/xtimer/tick_conversion.h , are you talking about a specific opened issue? I couldn't find one.

I'll close this PR after pointing or opening an issue (if there isn't one already).

@kaspar030
Copy link
Contributor

When we say sleep 10us it should sleep at least 10s.

More like "as close to 10us as possible". 9.99us is more accurate than 10.02.

@cladmi cladmi requested a review from MichelRottleuthner May 7, 2019 12:32
@fjmolinas
Copy link
Contributor Author

fjmolinas commented May 7, 2019

More like "as close to 10us as possible". 9.99us is more accurate than 10.02.

@cladmi @kaspar030 So what is actually the expected behavior:
A) sleep as close as target time
B) sleep at least target time

This in not clear from the documentation., which ever the case I think we should make it more explicit.

@cladmi
Copy link
Contributor

cladmi commented May 7, 2019

When we say sleep 10us it should sleep at least 10s 10us.

More like "as close to 10us as possible". 9.99us is more accurate than 10.02.

So in a timesloted networking protocol if I say:

next_timeslot_start = now + convert(10ms);
sleep_until(timeslot_start); // or with delay, its for the idea
send_packet();

I may just send in the previous timeslot everytime.

I do not feel it more accurate to send a message before I am allowed to.
I do not feel it more accurate to write to a hardware before its ready.

That MUST be documented in the API and tell how to proceed if one wants to be sure that he can do:

t0 = now();
timer_set(10ms, callback, arg=t0);

and have an the callback:
t0 = arg;
assert(now() > t0 + 10ms);

From the timer libraries I remember, they were all more ensuring it is done at least after/waits longer than the time specified. Like posix guarantees.

@kaspar030
Copy link
Contributor

So in a timesloted networking protocol if I say:

I think we need to keep the timescales in context. I'd expect my timer subsystem to quantize any real time interval to the to the closest hardware timer tick instead of always rounding up. That would make it at most +/- half a tick wrong, being spot on on average. Always rounding up would mean it is between 0 and a whole tick late (half a tick on average).

Your argumentation for a timeslot would shift the problem to the end of the timeslot (always rounding up might make the protocol exceed its timeslot every time).

In practice, we're talking +- 0-15us vs +0-30us for a 32kHz timer quantized to 1000000us. The better option here is to either work with the low-level ticks or chose a faster timer. Or chose one that corrects in software via spinning, but that is inefficient.

I think we agree that the timer peripheral itself must not sleep too little. Is that the issue here?

@fjmolinas
Copy link
Contributor Author

I think we agree that the timer peripheral itself must not sleep too little. Is that the issue here?

What is too little? Is -15us too little?

@cladmi
Copy link
Contributor

cladmi commented May 7, 2019

Returning before, whatever the amount is, is not what I would expect as a user and it is not what the API says.

If my resolution is 30us, I know I cannot get better precision than a few multiple of 30us. Also I should take into account all the delays to know what I should have in the worst case and I must deal with the worst case as it can happen. If my precision is lower that what I need, I am screwed.

The usage currently assumes there is a way to have a timer after an expired time.
The sema module does this which used by posix_semaphore.

tests/posix_semaphore on frdm-kw41z.

2019-05-07 16:20:51,089 - INFO # ######################### TEST4:
2019-05-07 16:20:51,090 - INFO # first: sem_init s1
2019-05-07 16:20:51,094 - INFO # first: wait 1 sec for s1
2019-05-07 16:20:52,093 - INFO # first: timed out
2019-05-07 16:20:52,098 - INFO # first: waited only 999970 usec => FAILED
2019-05-07 16:20:52,099 - INFO # ######################### DONE

And here it also happened after 30us

2019-05-07 16:37:30,761 - INFO # ######################### TEST4:
2019-05-07 16:37:30,761 - INFO # first: sem_init s1
2019-05-07 16:37:30,765 - INFO # first: wait 1 sec for s1
2019-05-07 16:37:31,764 - INFO # first: timed out
2019-05-07 16:37:31,769 - INFO # first: waited 1000031 usec
2019-05-07 16:37:31,770 - INFO # ######################### DONE

So here code using xtimer must assume tick difference in the future and the past, not only for sleep but also for xtimer_set so the callback done in interrupt context.

And for the precision, you can only assume the guarantee that can happen in the worst case. If one high priority thread can do small sleeps, a lower priority thread should already assume it can be descheduled for XTIMER_BACKOFF ticks at any time as the other thread could have xtimer_usleep spin up to XTIMER_BACKOFF ticks.
And if this happens just before calling the xtimer_usleep/xtimer_set function, it would be at least 5 ticks after what is expected as it is a relative time.

With timing, I would only consider the worst case.

@miri64 miri64 added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR labels May 7, 2019
@miri64
Copy link
Member

miri64 commented May 7, 2019

Murdock hangs on this PR for over 8 hours so I requeued it.

@MrKevinWeiss
Copy link
Contributor

Just to chime in, I think the time should wait at least the time specified (both for sleep functions and for timer periph functions). A case is if you expect something to be triggered after your sleep but you wake up too early you take the value of a previous event.

If everyone agrees on that then we probably shouldn't merge the test change or, if everyone disagrees (at least on the sleep perspective), we should update the docs to say this is the behaviour of sleep as my assumption for sleep is at least not about... Maybe update the docs either way...

@MichelRottleuthner
Copy link
Contributor

MichelRottleuthner commented May 8, 2019

(...) a lower priority thread should already assume it can be descheduled for XTIMER_BACKOFF ticks at any time (...)

This illustrates why trying to hit the point as precisely as possible (maybe ahead of time) doesn't really gain anything.

That would make it at most +/- half a tick wrong, being spot on on average.

This would only be true if there were no delays introduced such as with the above effect of (de)scheduling.

As "spot on" can not be guaranteed anyway I'd also vote for "wait at least for the provided duration".
That way we have at least a well defined guarantee in one direction. When very precise timing is required, sleep functions also wouldn't be the goto solution.

@jcarrano jcarrano added the Discussion: contested The item of discussion was contested label May 8, 2019
@kaspar030
Copy link
Contributor

As "spot on" can not be guaranteed anyway I'd also vote for "wait at least for the provided duration".
That way we have at least a well defined guarantee in one direction.

Ok, you guys have convinced me. +1 for never sleeping too little, as in, Tafter >= Tbefore + interval.

@fjmolinas fjmolinas force-pushed the pr_xtimer_usleep_negative branch from f2e6661 to b124a02 Compare May 13, 2019 09:11
@fjmolinas
Copy link
Contributor Author

Ok so we have an agreement over the expected behaviour for xtimer_usleep(TARGET_TIME). I still think the test is wrong, it should recognize the negative offset but fail according to this. Right now the test is defining an upper and lower 5% error margin for the sleep time (INTERNAL_JITTER).

Nonetheless since the expect REGEX is not capturing negative values it is failing whatever negative value it is getting but not in the right way. I have changed the test so the lower bound is equal to TARGET_TIME and the upper bound keeps the 5% margin.

The result is having the test fail as:

2019-05-13 11:02:14,078 - INFO # main(): This is RIOT! (Version: 2019.07-devel-162-gcd0ab-pr_xtimer_doc)
2019-05-13 11:02:14,081 - INFO # Running test 5 times with 7 distinct sleep times
2019-05-13 11:02:14,083 - INFO # Please hit any key and then ENTER to continue
a
2019-05-13 11:02:14,155 - INFO # Slept for 9979 us (expected: 10000 us) Offset: -21 us
Invalid timeout 9979 ,expected 10000 < timeout < 10500
Host max error  500
error           -21

Instead of:

2019-05-13 11:03:02,812 - INFO # Slept for 12116 us (expected: 12122 us) Offset: -6 us
2019-05-13 11:03:02,922 - INFO # Slept for 98754 us (expected: 98765 us) Offset: -11 us
2019-05-13 11:03:02,997 - INFO # Slept for 74981 us (expected: 75000 us) Offset: -19 us
2019-05-13 11:03:02,998 - INFO # Test ran for 1732575 us
Timeout in expect script at "child.expect(u"Slept for (\\d+) us \\(expected: (\\d+) us\\) Offset: (\\d+) us")" (tests/xtimer_usleep/tests/01-run.py:36)

@cladmi @kaspar030 Does this change make sense?

@fjmolinas fjmolinas changed the title tests/xtimer_usleep: recognize negative offset tests/xtimer_usleep: fail with negative offsets May 13, 2019
@cladmi
Copy link
Contributor

cladmi commented May 13, 2019

I am running the test on many boards right now.

I would like to have the explanation that the time for usleep should not be in the past emphasized in the commit description too.

Otherwise I agree with the change.

@cladmi
Copy link
Contributor

cladmi commented May 13, 2019

I ran the test successfully on:

arduino-mega2560 frdm-k64f iotlab-m3 msba2 mulle native nrf52dk nucleo-f103rb pba-d-01-kw2x sltb001a stm32f3discovery

And it failed as expected on frdm-kw41z.

@cladmi cladmi dismissed their stale review May 13, 2019 16:20

Only need to update commit message, otherwise I agree.

- xtimer_usleep(timeout) should sleep for at least timeout us.
  Negative offset, i.e. sleeping less than the specified time
  is incorrect.
@fjmolinas fjmolinas force-pushed the pr_xtimer_usleep_negative branch from b124a02 to a687a26 Compare May 14, 2019 07:03
@fjmolinas
Copy link
Contributor Author

I would like to have the explanation that the time for usleep should not be in the past emphasized in the commit description too.

@cladmi not sure what you mean by not beeing in the past, in xtimer_usleep(timeout) timeout is always in the future. I did add more detail in the commit stating that sleeping less that the specified time is incorrect behavior.

@cladmi
Copy link
Contributor

cladmi commented May 14, 2019

Sorry, I meant in the past from the expected end time, your comment said it correctly.

Copy link
Contributor

@cladmi cladmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, review and tested. It now fails with an appropriate message on the frdm-kw41w.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: tests Area: tests and testing framework CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR CI: run tests If set, CI server will run tests on hardware for the labeled PR Discussion: contested The item of discussion was contested Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants