-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always type erase drivers #2572
Comments
I'd like to point out that there is only a single place where this makes any difference:
|
One place where optional type retention makes sense is GPIOs, where the extra instance dispatch is probably comparable to the IO speed. Are we okay with treating GPIOs as an exception here? |
Thirdly, we can opt back into type checking DMA channels if we approach constructors differently. Instead of This would make it harder to implement |
I would be okay with that. It would be nice to maybe do a small benchmark on this to see the difference. |
Following on from the discussion here: #2573 (comment): I was looking at our drivers, and future chips, namely the P4. Fortunately I don't think we'll have any issues with our current driver set, with the possible exception of USB, the P4 has one full speed usb and one high speed usb, and the driver is currently not instanced. It's not clear to me how much the two will share, so maybe it's not an issue in this case. I think this leads us towards type erase by default though, because it only takes a new chip to ruin this. We could work around this later down the line with cfgs etc but I'd rather avoid the issue entirely. |
My LA isn't the best, but on an ESP32: Typed
#![no_std]
#![no_main]
use esp_backtrace as _;
use esp_hal::{
delay::Delay,
gpio::{GpioPin, Input, Level, Output, Pin, Pull},
prelude::*,
};
#[entry]
fn main() -> ! {
let peripherals = esp_hal::init(esp_hal::Config::default());
// Set LED GPIOs as an output:
let led = Output::new_typed(peripherals.GPIO2, Level::Low);
toggle_pins(led)
}
fn toggle_pins(mut pin: Output<GpioPin<2>>) -> ! {
loop {
pin.toggle();
}
} Erased
#![no_std]
#![no_main]
use esp_backtrace as _;
use esp_hal::{
delay::Delay,
gpio::{Input, Level, Output, Pin, Pull},
prelude::*,
};
#[entry]
fn main() -> ! {
let peripherals = esp_hal::init(esp_hal::Config::default());
// Set LED GPIOs as an output:
let led = Output::new(peripherals.GPIO2, Level::Low);
toggle_pins(led)
}
fn toggle_pins(mut pin: Output) -> ! {
loop {
pin.toggle();
}
} There seems to be a ~3x difference in this one experiment. |
So I'm happy with GPIO being the exception. I'm now just wondering what happens if we want to make this exception again in the future, or maybe there are some other drivers that exist now that might benefit from it (RMT maybe? latency is quite important there). |
GPIO is an exception in another matter: we currently must use enums to represent the AnyPin, because not everything can be neatly indexed. With other peripherals this isn't the case - the Any peripheral can be a pointer to the Instnance (or the register block if critical) (or two pointers with State), which may be easier propagated by the compiler. |
Just wanted to sprinkle my 2 cents. For example SPI2 can do things that SPI3 can't, how do you want to rectify this discrepancy without type safety. I2S1 can do things that I2S0 can't on the base ESP32, albeit not very well documented. |
Whilst these are good points, I think they are a bit orthogonal to this issue, as we'll have to solve this regardless because erasure is already implemented. The short answer is a runtime check, but depending on the functionality disparity we may be able to keep some type safety checks. For instance if a driver doesn't support some large'ish mode, like slave mode we can probably type check that on the constructor. If some config/submode on a specific driver isn't supported it will probably be a runtime error of some kind. I think these need to be assessed on a case-by-case basis, there won't be a silver bullet. |
Wildly different hardware features may also be best implemented as different drivers, which aligns well with @MabezDev's typecheck-in-constructor point. |
Is this really the case by the way? I understand that we have different banks of GPIOs, and that might make things a bit harder, but I think it should still be doable? With adequate perf from the type erased GPIO driver (doesn't have to be stellar, just not 3 times slower :D) it might be able to make the decision for us. |
I guess we are already switching on the value for stuff so we could just as well switch on the pin number, but at least with an enum we don't have unreachable branches... |
Just out of curiosity I looked at the generated assembly from the GPIO test (for C6) Erased 420001b4: c38c sw a1,0(a5)
})
}
fn is_set_high(&self, _: private::Internal) -> bool {
handle_gpio_output!(&self.0, target, {
OutputPin::is_set_high(target, private::Internal)
420001b6: 00482783 lw a5,4(a6)
420001ba: 00d7f533 and a0,a5,a3
420001be: 87ba mv a5,a4
OutputPin::set_output_high(target, high, private::Internal)
420001c0: d975 beqz a0,420001b4 <_ZN3foo11toggle_pins17haabe4394cfcb062aE+0x34>
420001c2: 87b2 mv a5,a2
420001c4: bfc5 j 420001b4 <_ZN3foo11toggle_pins17haabe4394cfcb062aE+0x34> Non-erased 420001d0: c314 sw a3,0(a4)
intrinsics::volatile_load(src)
420001d2: 4158 lw a4,4(a0)
self.gpio_bank(private::Internal).read_output() & self.mask() != 0
420001d4: 00477793 and a5,a4,4
420001d8: 8732 mv a4,a2
420001da: dbfd beqz a5,420001d0 <main+0x52>
420001dc: 872e mv a4,a1
420001de: bfcd j 420001d0 <main+0x52> Which is surprising! And also the LA shows pretty much the same for both tests |
On ESP32 Not-erased 1.67MHz Asssembly looks almost like C6 400d0665: 0020c0 memw
400d0668: 0c99 s32i.n a9, a12, 0
400d066a: 0020c0 memw
400d066d: 08c8 l32i.n a12, a8, 0
self.gpio_bank(private::Internal).read_output() & self.mask() != 0
400d066f: 10dc90 and a13, a12, a9
400d0672: 0bcd mov.n a12, a11
if set {
400d0674: ed1d77 beq a13, a7, 400d0665 <main+0x35>
400d0677: 0acd mov.n a12, a10
400d0679: fffa06 j 400d0665 <main+0x35> Erased 0.63MHz The assembly looks very confusing with lots of |
To confirm that this might be a Xtensa backend issue, would you be able to do the test on the least weird Xtensa chip (esp32s3)? The esp32 has some panicking code paths that might affect optimisation in general, so I just want to rule that out. |
Sure .... Erased: 0.69MHz (And pretty much same assembly code) Same thing 😮 |
Erased is faster??? |
Ooops - fixed |
It should be interesting to see what happens to the C2. My current working hypothesis is that the "hole" in the GPIO range prevents optimizing the enum dispatch away. |
It makes some difference in terms of optimizations in other places but the actual code which toggles the GPIO looks the same C2 Erased 1.90 MHz 4200017c: c314 sw a3,0(a4)
4200017e: 4158 lw a4,4(a0)
42000180: 00477793 and a5,a4,4
42000184: 8732 mv a4,a2
42000186: dbfd beqz a5,4200017c <main+0x4a>
42000188: 872e mv a4,a1
4200018a: bfcd j 4200017c <main+0x4a> Not-Erased 1.90 MHz 42000168: c38c sw a1,0(a5)
4200016a: 00482783 lw a5,4(a6)
4200016e: 00d7f533 and a0,a5,a3
42000172: 87ba mv a5,a4
42000174: d975 beqz a0,42000168 <_ZN10non_erased11toggle_pins17h2ebd1791ed089c58E+0x34>
42000176: 87b2 mv a5,a2
42000178: bfc5 j 42000168 <_ZN10non_erased11toggle_pins17h2ebd1791ed089c58E+0x34> |
Ok mystery solved (Daniel was right about inlining being the key) ESP32 and S3 have a lot more pins so the enum-dispatching functions generate more code which makes the code-gen not inline the function and the general dispatching function obviously cannot get optimized In the end it doesn't change the conclusion here - GPIOs benefit from being typed. In other places we don't have many variants to dispatch to, so they should get optimized fine |
If this is just inlining, but otherwise setting/reading pins is equally fast in the two versions, I will have to disagree. We should try and refactor the driver to optimize better. It is possible we can get away with The time it takes to configure a pin is less important, but if actually working with the pins is the same, the performance argument to maintain the optional typed API is not a real one. |
True - just for fun I sprinkled the code with some |
The pathological device is ESP32, where some pins are not output pins - we have a big generated Removing this match, and just yolo-calling raises the achievable frequency from ~660kHz to 1MHz for me. Better (but incorrect), but still far from the 1.8 in my previous test. |
With the refactor of GPIO in #2854, I think we're closer to erase by default, and I'm inclined to make that decision. I'm looking for any final comments as to why we shouldn't always type erase the drivers. |
Let's erase by default. To complete this work we need to:
|
new_typed
vs always type erasing drivers
For our drivers we have the option of opting into the
new_typed
method to create a driver without type erasure, i.eSpi<SPI2, Blocking>
.I'm wondering what use cases this opens up (if any) and whether we should always type erase.
I believe having the concrete types helps at least with DMA channels on PDMA devices, as we can (at compile time) check a user isn't passing an invalid DMA channel for the peripheral. We still have a runtime check for the case where they pass
AnyDmaChannel
, but there is merit to a compile time check.One downside is that our compiler allocated "inference slot" (the last generic param can be inferred by rustc automatically, but any others can't) it taken by the instance type. This means if we ever want the compiler to infer a generic parameter we can't.
The text was updated successfully, but these errors were encountered: