Props to OP for using the term 'superstitions'. Wi-fi (and radio in general) is poorly understood even by people with engineering degrees, let alone the average Joe. This often leads to trying random stuff until an improvement is perceived (or believed to be...), then spreading this experience as proven and applicable in any context with no further research and confirmation of the results (let alone filing bug reports).
The prominence and accessibility (to laymen) of wifi does not help the situation either... I mean what else CAN they do except try stuff out, how are they going to determine that it worked and why it did?...
OP, yes, poorly implemented power saving has in my experience often been a culprit behind network reliability issues. That said, please consider adding a disclaimer that just disabling power saving in client devices without pinpointing the root cause of the instability or at least reporting it is exactly why we are in a situation where it is still buggy.
I love the term "superstition". I like to refer to lab "rituals" - if something works, repeat it, and never change things around "because it ought to work just as well". Life is too short to figure out all the unknown variables that might be affecting things, so seize luck when you find it.
Having a protocol for working with a black box is fine. The problem is how bad a lot of people are at differentiating between superstition, a black box, and a process with a well understood mechanism.
I’ve been referring to this kind of belief as a ‘pigeon religion.’ This one time an operator has a problem (it’s an e-stop that hasn’t been reset) and they ‘fixed it’ by switching the system into test mode and back, then pressing the e-stop reset three times. So they tell all their friends.
Next time that doesn’t work (the e-stop button is still pressed in) but someone tries switching the system into test mode and back twice, slamming the control cabinet door, pressing and releasing the e-stop button three times, then pressing the e-stop reset button three times. They report their findings on the new standard procedure…
I caught a friend's mother going through a ritual like this with the dishwasher. She had to press the heat mode or something on the dishwasher four times in a row before starting it (so it toggled it on and off twice). Who knows how this got started, but she was adamant that it had to be done before starting it.
If you have an ESP32C3 board with poor wifi performance, check if it's a board with a poor design that places the ceramic antenna too close to the oscillator: https://roryhay.es/blog/esp32-c3-super-mini-flaw
YMMV but adding a loop of wire in parallel with the oscillator improved wifi performance on mine.
(I first tried moving the ceramic antenna a millimeter outward, but that made no difference.)
I was having multiple ESP devices from different brands running totally different firmwares all drop out randomly when I switched to a new Asus wifi router.
I came across even more 'workarounds': no spaces in the SSID, disable IPv6 for the whole network even if the ESP ignores it. The thing is, changing any of these settings would reboot the router and reconnect everything, so it would seemingly work until the next dropout.
I found that limiting them to 802.11g instead of connecting with 'n' stopped the dropouts for good. Even now I wouldn't say that this is a cure-all or that the other recommendations don't work; I'd guess that each AP's firmware might have different conflicts with different devices.
This one I've seen caused by N+ burst transmissions tanking the power rail, which causes acknowledgements to be dropped, which causes the ESP to lower its TX rate, so the TX takes longer...
> If your network hardware allows it, you should pin the device to the closest one.
In Wi-Fi it's always the client's choice on where to connect to at the end of the day and any hacks the APs try to do to steer clients are "suggestions" at best and "signal ruiners for everyone" at worst.
Alternatively, if you REALLY want to force it, the sanest way is to use a uniquely named IoT SSID on each AP so there would be no other option for those clients to choose to latch on to (and you can leave the other SSIDs shared for normal clients). E.g. "IoT-1", "IoT-2", "IoT-3" on 3 separate APs. It may clutter up device screens more when you list available networks but it's just visual because, as far as airtime, they all beacon just as often if the names are the same or not anyways.
> From what people and the internet tell me you should set the band width on the 2.4 Ghz network that your boards use to 20 Mhz, not 40, not 60, and definitely not automatic.
This is spot on in that 20 MHz is the ideal channel width on 2.4 GHz, doubly so for just an ESP32. I'd add that it should apply to any of your 2.4 GHz networks unless you live out in the sticks and really want to eke a few extra Mbps out at the fringe of your AP coverage (and even then the wider channel width is probably going to make your SNR worse). Also, I don't believe the ESP32 supports 60 MHz in the 2.4 GHz space at all (it's certainly not an option in the Wi-Fi standard).
I'll also tack on that the ESP32-C6 can be worth springing for if these kinds of things are a particular concern, as it supports Wi-Fi 6, which has a few enhancements for connecting lots of IoT devices without so much noise.
But the Arduino ecosystem is full of superstition and bizarre hacks. It's cargo cult electronics. They will do anything to avoid reading documentation or writing robust code.
Even the power saving recommendation here reeks of it. There is no effort to understand it. Someone on an Arduino forum recommends it, others start to echo it to try to appear like they know what they're talking about, it becomes lore in the Arduino world and you out yourself as a clueless newbie if you don't know to do esp_wifi_set_ps(WIFI_PS_NONE) without questioning anything because that's just the way it's done. It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
I used to say things like that, but come on: Arduino is targeted at hobbyists. More specifically, it's targeted at hobbyists who don't want to spend too much time learning hardware. If they did, they would be using a "bare" microcontroller better suited for their needs and costing one tenth the price. But they're not interested in microcontroller programming, they just want to get their art project done.
It's the same thing that happened with computers. Billions of people use them, but most just want to access Facebook or use MS Word, not learn OS internals. It's a different world from where we used to be 30-40 years ago, and that's fine. We design simpler, more intuitive products for them.
If a product meant for that group can't be used effectively by the target audience, I think the fault is with the designer, not with the user.
> If they did, they would be using a "bare" microcontroller better suited for their needs and costing one tenth the price.
Where do you get something like an ESP that's one tenth the price? ESPs are cheap and you can run Arduino, ESP-IDF directly, or fringe environments (I had some ESP8266 running NodeMCU because Lua made more sense to me than Arduino).
You can run Arduino code on anything, since it's mostly just a bit of syntactic sugar around C. But I'm sure you know what I mean.
My point is that people who are attracted to Arduino are, by and large, not the kind of people who want to geek out about the inner workings of the MCU, and there's nothing wrong with that.
I'm pretty familiar with the microprocessor architecture of the 8-bit era that I grew up in, and have done a fair amount of hardware hacking. As things have gotten more complex, I've let some things slide, such as the complexity of pipelined architectures.
Arduino is not even syntactic sugar any more. All it retains of its origins, that I'm aware of, is the weird setup() and loop() schtick. And you have limited control over what happens before your code starts. But with most Arduino compatible boards, you have full access to the vendor supplied libraries, and can go as deep as you want. These days my preferred platform at work is Teensy 4, and at home, the wireless enabled boards. I think Paul Stoffregen is some kind of 100x engineer.
But life is short. Over my 61 years, I've carefully rationed the brain cells that I devote to innards of technologies that will soon be obsolete. I read the Turbo Pascal manuals cover to cover, and The Art Of Electronics, but I never cracked Inside Macintosh. I've decided that I will simply not learn anything about any OS that is not Linux, and superficially at that.
I program desktop computers in high level languages, despite total abstraction of the innards.
I think the relative portability of Arduino code has been a huge boon for hobbyists because it encourages the formation of a community of people who can share code and knowledge, even if they're not all using the same processors, and despite sometimes needing to tweak code when porting it from one platform to another. This was also the case with early FORTRAN. Portability across processors revolutionized scientific computing.
The problem isn't with the artist doing a one-off project involving a microcontroller. It's the Arduino "experts" who write blogs, create videos, and dominate forums with their accumulated nonsense. They position themselves as authorities in the space, newbies adopt and echo whatever rubbish they make up, and the cycle continues. They get very defensive if you try to correct them, even when you link directly to documentation supporting the correction.
If you're going to write a blog about how the ESP32 doesn't connect to the strongest AP so you need to pin it to a specific BSSID in your router settings, maybe you shouldn't be writing that blog if you haven't taken at least a moment to check the documentation and see that the behaviour you want is already an option that can be selected by changing literally one line in your ESP32's WiFi config. Instead, this pseudoscience proliferates.
Instead of spending x2 the initial effort to fix the root cause, you spend x1 the initial effort to implement jank and then spend x10 the effort down the line maintaining the jank.
Deal with what? I would argue that if you're going to the effort of writing a blog post on the topic then you should at least go to the effort of skimming the docs to make sure there isn't already a solution for the common problem you're experiencing.
It's literally one word to change in his WiFi config to get the behaviour he wants. It's already implemented. Who can't "deal" with that?
Personally, I don't use multiple APs with overlapping SSIDs, but if I did then I can see how it would be easier to deal with the logic from the AP management side rather than the client. It's also nice to not have to re-connect IoT things if/when you add or change your APs.
I think I understand you. That functionality doesn't exist in ESP32 Arduino tool chain without more work/more code. Their hobby level perspective is valuable to other hobby level engineers who want a solution.
> It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
Seems safe, but it probably depends on the clock being accurate, so it can wake up on time for the next beacon, and the clock frequency is likely sensitive to temperature and therefore power usage.
If you're plugged into a wall wart, chances are the power savings aren't going to be too much; if it helps reliability (which should be easy to confirm), then it's likely worth paying a cent or two more a month. It's different if you're running from battery, though.
To be fair, the API people typically use in hobbyist contexts is literally a single call to 'WiFi.begin(ssid, password)'. There's not exactly any obvious room for error here, and any details which may or may not have been implemented incorrectly are so deep inside abstraction layers as to be inaccessible. There's little apparent room for making the code more robust (other than "workarounds" like application level health checks + reboot on error), because everything is supposed to have been taken care of by the abstraction.
If I can disable PM and then my ESP stops disconnecting from WiFi, I'm happy. There's not much more I can do without re-implementing what 'WiFi.begin()' does myself, and I usually have better things to do with my time.
> It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
A) this increases ripple voltage which eventually impacts RX noise floor. As long as you have enough headroom at the input to your regulator power saving is great, but eventually having a more consistent load becomes the limiting factor for many devices.
B) It drastically increases typical latency. That's not an issue for all applications, but the ESP-IDF network stack uses Nagle's algorithm, which can't always be cleanly disabled, and tends to write each little bit of the next layer to the TCP socket separately.
A) The timing for this is deliberately set to be very conservative in terms of the wakeup window (at the cost of higher power), so the radio is probably powered up for a good 5ms before the beacon arrives. I don't know if you could unintentionally design a 3V3 supply so poor that it takes in the order of milliseconds to adjust to an output current of about 30mA -> 80mA.
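To put rough numbers on that, here is a back-of-the-envelope sketch in C++. It assumes a 102.4 ms beacon interval (100 TU, a common AP default) with the station waking for every beacon, and reuses the ~5 ms wake window and the 30 mA / 80 mA figures mentioned above. The struct and function names are made up for illustration, and none of this is measured ESP32 data.

```cpp
// Average supply current of a station that sleeps between beacons.
// Every input is an illustrative assumption, not a datasheet value.
struct PowerSaveModel {
    double beacon_interval_ms;  // assumed: 102.4 ms (100 TU)
    double awake_ms;            // assumed: radio powered ~5 ms per beacon
    double sleep_ma;            // assumed: ~30 mA with the radio off
    double awake_ma;            // assumed: ~80 mA with the radio listening
};

// Time-weighted average: the sleep current plus the extra awake current
// scaled by the fraction of each beacon interval spent awake.
double averageCurrentMa(const PowerSaveModel& m) {
    const double duty = m.awake_ms / m.beacon_interval_ms;
    return m.sleep_ma + (m.awake_ma - m.sleep_ma) * duty;
}
```

With those assumed inputs, `averageCurrentMa({102.4, 5.0, 30.0, 80.0})` comes out at roughly 32.4 mA, versus 80 mA with the radio always listening. That is the saving at stake for a quiet station, and also the size of the load step the regulator sees roughly ten times a second.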
B) Yes, this is a fair point, and why I was careful to specify a "quiet" station above. If actively transmitting then there is likely a benefit to disabling power saving, but unlike Arduino bros I will admit at this point that I don't understand the WiFi spec well enough to comment further with any confidence.
If it's mainly (largely) static IoT devices connecting to an SSID, why not make the SSID hidden so it doesn't fill device screens (controlled by humans who probably don't want the IoT network)? Most commercial IoT products I've used have an option to type a hidden SSID, and your own C++ code definitely can do that.
Good highlight. In commercial deployments absolutely, I'd even recommend going as far as hiding any SSID you don't explicitly want people to be manually clicking on with their own devices (i.e. only guest and/or byod should be visible). Not because of security (as the conversation often tangents into) but because the support tickets for "I can't connect to <wrong SSID>" just go away and it clutters screens less as you say.
For home it has its ups and downs, depending on how much the user cares about understanding Wi-Fi as part of their project vs just doing the minimum to make their project work. If you already have a Wi-Fi scanner app on one of your devices that you're well acquainted with reading (or you're willing to spend a short bit of time learning how to use one to troubleshoot an issue or select the best-covering AP at a location), then you're probably the right crowd to hide the SSID at home as well.
>Good highlight. In commercial deployments absolutely, I'd even recommend going as far as hiding any SSID you don't explicitly want people to be manually clicking on with their own devices (i.e. only guest and/or byod should be visible). Not because of security (as the conversation often tangents into) but because the support tickets for "I can't connect to <wrong SSID>" just go away and it clutters screens less as you say.
That's bad for the anonymity of all your devices, though. Having a "hidden" network saved and on auto-connect means the device will be constantly broadcasting probe requests for that hidden network.
Do you know of something that could get my Android tablet to switch between two APs in my house? When I change locations in the house, it will never change to the stronger one if it has even the weakest signal from the one it was connected to. I can't find any Android app that tells the device to "change to a new AP if it is stronger than the current one".
Android tends to hang on to APs even if they're at a completely unusable signal level with no connection. Doesn't matter if fast roaming is enabled, doesn't matter if bss transition is enabled.
The only solution I've found is enabling the minimum RSSI feature on my APs, this forcefully disconnects any clients with a low signal.
Devices usually take 5-10 seconds to reconnect to the new AP after this happens, but will sometimes fail outright for a minute or more while Android insists on trying to connect to the old AP with a worse signal and keeps getting kicked off, before it finally gives up and connects to the stronger signal.
Raising the minimum data rate can also help, as long as none of your devices are really old and need the lower rates.
I haven't had an Android device for a number of years now but my guess is "no" from the device side unless you're going to go make your own custom firmware with Wi-Fi settings tweaked or newer Wi-Fi drivers which handle roaming better.
If your router and device support 802.11k/v/r, that can help. The AP most likely does unless it's particularly ancient, and Android has supported all of these since 8.0. Ironically, lowering the AP's power can help too (it'll cause the client to see the farther AP as weaker sooner) but obviously that lowers overall coverage at the same time... so it's a tradeoff unless you're willing to deploy more APs to make up for the lower power. Make sure you're restricting advertised rates from your AP to not include the slowest/oldest standards as well. That'll make the client less able to hold on as the connection gets weaker and weaker. Like the power recommendation, this means at the fringes of your coverage you'll get "no connection" rather than "a bad connection" but it'll make the overall airspace healthier.
If nothing else works and you can't fix the client because you lack full control (or a reasonable way to update it), you can still try falling back to steering clients via hints from the AP (if it supports it). Just keep in mind that it may not work and may also cause problems with other devices which were working fine. Or you may get lucky; it's worth a shot if you've tried everything else first.
As a note: My experience comes from designing and fixing enterprise wireless deployments so if there are any tricks specific to low AP count environments I would be somewhat ignorant of them. The same could be said if there are more easily accessible wireless controls in Android than I am familiar with as, if there are, I still couldn't use them as different guests walk in each day and they all need to work.
Yes. The proven method is to (a) restrict data rates to just the high ones and (b) lower the transmit power of your AP, so that your devices can no longer maintain high enough data rates past a given point and are forced to disconnect. I am almost certain that run-of-the-mill stock firmware does not offer you this option. Look into installing OpenWrt.
Apart from that, OpenWrt now allows you to install and use usteer which offers a plethora of (802.11 standards-based) tools to manage client roaming from the AP side, including the APs exchanging information between themselves.
I use Fresh Tomato as the firmware on my router, but I don't see anything in the documentation about "usteer". I'll search for a different term. I'm sure they have something, because they have a lot of active development.
I do actually see the problem that the ESP32 doesn't automatically reconnect to the stronger AP. I think this gets triggered when the stronger AP is briefly unavailable (reboot or radar scan or whatever) and it switches to the weaker AP, but then once the stronger AP is back it stays connected to the weak AP. (This is with multiple APs in a mesh configuration.)
> ESP32 doesn't automatically reconnect to the stronger AP
How would it know to reconnect to stronger AP?
You can order it to do a background scan and reconnect to stronger AP if you want, but you have to figure out how often to do this and how to interleave other data during the scan.
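One way to frame the "how often" question is a small policy object: only scan when the current link is already weak, rate-limit the scans, and only move if the candidate is better by a clear margin so the device does not flap between two similar APs. This is a hedged sketch in plain C++; `RoamPolicy`, `scanDue` and `shouldRoam` are hypothetical names, the thresholds are guesses, and actually triggering the scan and reconnect is left to whatever WiFi API you use.

```cpp
#include <cstdint>

// Hypothetical roaming policy; all thresholds are illustrative guesses.
struct RoamPolicy {
    int scan_below_dbm;                  // only scan when the current link is weaker than this
    int hysteresis_db;                   // candidate must beat the current AP by this margin
    std::uint32_t min_scan_interval_ms;  // never scan more often than this
};

// True when a background scan is worth the airtime right now.
// Unsigned subtraction keeps the comparison correct across millis() wraparound.
bool scanDue(const RoamPolicy& p, int current_rssi_dbm,
             std::uint32_t now_ms, std::uint32_t last_scan_ms) {
    if (current_rssi_dbm >= p.scan_below_dbm) return false;  // link is fine, stay put
    return (now_ms - last_scan_ms) >= p.min_scan_interval_ms;
}

// True when the best AP seen in the scan justifies dropping the current one.
bool shouldRoam(const RoamPolicy& p, int current_rssi_dbm, int candidate_rssi_dbm) {
    return candidate_rssi_dbm >= current_rssi_dbm + p.hysteresis_db;
}
```

For example, with `RoamPolicy{-70, 8, 60000}` a station sitting at -75 dBm would scan at most once a minute and would only move to an AP it hears at -67 dBm or better. Interleaving the scan with other traffic is still your problem, since the radio is off-channel while it scans.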
Funny how tweaking modern tech can still feel like adjusting rabbit ears on grandma's old TV - I swear my ESP32 runs better when it's cloudy outside or the microwave’s off.
I wonder if the attempts to replace parts of the network stack with FOSS ( https://news.ycombinator.com/item?id=38550026 ) would help with this? At the very least it would let you get more visibility into what's going on when things don't work, and in the best case it'd let you replace the firmware that's creating problems with firmware that doesn't have those problems. (I think.) Of course this depends on the bits you replace being the parts in play.
This is not a superstition for 2.4GHz. Other devices sharing 2.4ghz tend to malfunction when 40MHz channels are in use. This is a well known phenomenon. I have been able to reproduce it in my household by enabling 40MHz channel support for 2.4GHz on my ruckus APs while listening to audio from a Bluetooth speaker. The Bluetooth speaker will start having discontinuities in the audio stream that disappear when 40MHz support in the APs is disabled.
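A rough way to see why Bluetooth suffers: classic Bluetooth hops over 79 channels of 1 MHz each, centred at 2402 to 2480 MHz, and its adaptive hopping can only dodge Wi-Fi if enough clean channels remain. The sketch below simply counts how many of those centres fall inside a Wi-Fi channel. It is a simplified model (hard channel edges, no spectral mask, one AP), and the function name is made up for illustration.

```cpp
// Count classic Bluetooth channels (centres 2402..2480 MHz, 1 MHz apart)
// whose centre frequency lies inside a Wi-Fi channel of the given width.
// Simplified model: the Wi-Fi channel is treated as a hard-edged block.
int bluetoothChannelsCovered(double wifi_centre_mhz, double wifi_width_mhz) {
    const double lo = wifi_centre_mhz - wifi_width_mhz / 2.0;
    const double hi = wifi_centre_mhz + wifi_width_mhz / 2.0;
    int covered = 0;
    for (int k = 0; k < 79; ++k) {
        const double centre = 2402.0 + k;
        if (centre >= lo && centre <= hi) ++covered;
    }
    return covered;
}
```

In this model a 20 MHz network on channel 6 (centre 2437 MHz) covers 21 of the 79 Bluetooth channels, while a 40 MHz network bonding channels 6 and 10 (centre 2447 MHz) covers 41, so a single AP takes out about half of the hop set before any neighbours are counted.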
2.4 GHz WiFi networks should generally use 20 MHz channels. No idea about ESP32s specifically, but this is good guidance for not congesting the 2.4 GHz space.
Other things using the 2.4GHz spectrum might have malfunctioned because of that and the users would have had no recourse. Even if you can, you should not do this unless there are no neighbors within range and you can ensure that you will never put anything else on 2.4GHz near it.
Also, I do not believe iOS ever added support for 40MHz channels on 2.4GHz. That feature never should have been added to WiFi. It causes headaches practically everywhere it is used.
I second that. In my experience forcing a 40 MHz channel on a 2.4 GHz network leads to abysmal performance (never tried it with ESP32, however I can imagine it would be even worse...).
I've had pretty good luck with the ESP line and WiFi, even have 2 of the 32s that have the port for external antenna working in sheds about 150' from the nearest AP.
I've always turned off power saving and given them hard coded IP settings (no DHCP). The only time I saw anything wonky is a short-lived experiment with the 32's deep sleep not reliably connecting on wakeup.
Mostly ESPHome now, two of the basic Ubiquiti pucks from a few years back, for reference.
This isn’t my area of expertise but I’m surprised there’s no mention of errata. Is that just not a factor at this point because the firmware is assumed to be mature enough?
One of the culprits I encountered is that some of the IoT WiFi chips cannot do proper authentication if the AP is using WPA2/WPA3 mixed mode (very common for WiFi6 to increase security with modern devices while keeping backwards-compatibility). The flawed chips can initially connect successfully but disconnect after a few days.
Solved it by creating a dedicated IoT SSID advertising just WPA2 w/ AES and only on 2.4GHz w/ 20MHz (you shouldn't use 40MHz on 2.4GHz anyway for god's sake).
I did a little dabbling in code for the ESP32 so I can't speak authoritatively at all, but I found that if your project does not require Wi-Fi, it was kind of annoying that the ESP32 would rob me of background cycles that, for example, the Teensy did not.
Maybe there was a way to turn off the Wi-Fi stack. But I suppose the general point still stands — that there are plenty of other small devices without Wi-Fi if that is not a requirement.
I created some sensor boards based on the ESP8266, they worked well for a while but lately they've been getting flaky. I'm a mediocre hardware engineer at best, so if anyone has any tips on how to make my board more stable, they would be appreciated.
Here's the board, it's a bit of a generic light/motion/presence sensor with an IR LED:
Do you know which aspect is getting flaky? I.e. is the wifi dropping out temporarily, or permanently, or the whole MCU locking up, or the 3.3V browning out, or…?
I'm not entirely sure, I can't debug because I don't have a serial interface to it. The board just stops responding to MQTT commands. I do see a lot of "connecting to MQTT" debug messages over the network, so I assume it does have WiFi. I should try pinging and see if it disconnects.
There’s your first lesson: Always have a debug header with serial pins on your board. ;) It should be easy enough to solder some wires onto the ESP8266’s serial port… although wait, how are you programming it?
It’s also good to have a couple of status LEDs and a power indicator on any custom board. Then you can add blink codes to indicate errors etc.
For people who are just trying to get a sensor to work, or control some lights, most esp32 frameworks surface maximum functionality, but minimum system state and next to no detailed debugging capabilities.
Combine that with very few “knobs” to turn on the esp32 framework, a router/wifi AP that is likely hostile to its owners, and well, superstition is pretty much all that is left.
It’s a demon haunted world.
Take note, those who dream of LLMs operating everything, including each other: “prompt engineering” is just a fancy euphemism for grimoire, and when it’s all massive, inscrutable matrices deciding everything, well, you too might just feel it’s not a bad idea to build a shrine.
>It seems that when an ESP32 connects it goes straight for the first access point it sees. No matter if that access point is not the one you’ve taped it to. This can lead to bad connectivity, especially since I’ve not really observed ESP32’s moving around to other access points.
This is not a characteristic of "ESP32", this is how the software running on the ESP32 is programmed to work, whatever specific program that is. And it is not my experience at all with "ESP32s". In my ESP-IDF based program, the ESP32 device connects to access points exactly how I tell it to connect with the software I wrote that deals with wifi connections. YMMV.
I encountered this a while back and it led me to dig into the ESP-IDF documentation to try to understand the behavior of a device I did not write the code for. Yes, it's software, but it's a footgun from the manufacturer. This in particular I find to violate the principle of least astonishment:
> It is a possible situation that there are multiple APs that match the target AP info, e.g., two APs with the SSID of "ap" are scanned. In this case, if the scan is WIFI_FAST_SCAN, then only the first scanned "ap" will be found.
The default, if esp_wifi_set_config() is not called or (as I see in almost all the sample code that comes up with a quick web search) the scan method is left untouched in the wifi_config_t struct, appears to be WIFI_FAST_SCAN. When you're looking directly at the relevant manual section during a HN discussion of ESP WiFi behavior, it may be obvious, but for a developer focused on the main product functionality who just copies and pastes an example and moves on when it appears to work, I'm not surprised if this always-incorrect-by-default behavior makes it into the vast majority of shipped ESP-based products.
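To make the difference concrete, here is a small host-side model in C++ of the two selection strategies. It is not ESP-IDF code: `ScanEntry`, `pickFirstMatch` and `pickStrongestMatch` are hypothetical names, and the real driver's scan order depends on channel and timing. On the device, my understanding is that the corresponding knobs are `scan_method = WIFI_ALL_CHANNEL_SCAN` and `sort_method = WIFI_CONNECT_AP_BY_SIGNAL` in the `sta` part of `wifi_config_t`, but check the documentation for your IDF version.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One row of a scan result, in the order the radio happened to hear it.
struct ScanEntry {
    std::string ssid;
    std::string bssid;
    int rssi_dbm;
};

// Models fast-scan behaviour: stop at the first AP whose SSID matches,
// regardless of signal strength. Returns an index into scan, or -1.
int pickFirstMatch(const std::vector<ScanEntry>& scan, const std::string& ssid) {
    for (std::size_t i = 0; i < scan.size(); ++i) {
        if (scan[i].ssid == ssid) return static_cast<int>(i);
    }
    return -1;
}

// Models an all-channel scan sorted by signal: consider every matching AP
// and keep the one with the highest RSSI. Returns an index into scan, or -1.
int pickStrongestMatch(const std::vector<ScanEntry>& scan, const std::string& ssid) {
    int best = -1;
    for (std::size_t i = 0; i < scan.size(); ++i) {
        if (scan[i].ssid != ssid) continue;
        if (best < 0 || scan[i].rssi_dbm > scan[static_cast<std::size_t>(best)].rssi_dbm) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

If the far AP at -82 dBm happens to be heard before the one in the same room at -48 dBm, the first strategy latches onto the far one and the second picks the near one, which matches the symptom people describe.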
I could have sworn there used to be a "sort by BSSID" default, as in it would do the full scan and then connect to the AP with the alphabetically earlier hardware address. In any case, the symptom I was plagued with was that this particular device would consistently connect to the furthest-away access point rather than the one in the same room, resulting in unusable dropouts.
They wouldn't help at all here. Bluetooth is not even mentioned and there's no HCI so they're not even accessible, and he clearly controls the device anyway so can much more simply modify the source and reflash.
Same as Cargo Cult programming: https://en.wikipedia.org/wiki/Cargo_cult_programming
And so on.
> then spreading this experience as proven and applicable in any context
It's worse. People don't even need to have personal experience to spread misinformation. They just read it online and then spread it like gospel.
Misinformation along the lines of "Wood glue is stronger than wood" or "Burning diesel/kerosene can't reach temperatures required to melt steel".
Is wood glue really not stronger than wood?
You may be better off specifying which specific AP you want to connect to by specifying the BSSID argument in the WiFi.begin() call on the ESP32 side https://github.com/espressif/arduino-esp32/blob/master/libra...
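A minimal sketch of the BSSID approach. `WiFi.begin()` takes the BSSID as six raw bytes, so a text MAC address has to be converted first. `parseBssid` is a hypothetical helper written for this example (not part of any library), and the `WiFi.begin(ssid, passphrase, channel, bssid)` call in the trailing comment assumes the arduino-esp32 overload linked above; the SSID, password, and MAC are placeholders.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Parse "AA:BB:CC:DD:EE:FF" into the 6 raw bytes a BSSID argument expects.
// Returns false if the text is not exactly six colon-separated hex bytes.
bool parseBssid(const char* text, uint8_t out[6]) {
    unsigned int b[6];
    char trailing;  // only filled if junk follows the sixth byte
    if (std::strlen(text) != 17) return false;
    int n = std::sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
                        &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing);
    if (n != 6) return false;
    for (int i = 0; i < 6; ++i) out[i] = static_cast<uint8_t>(b[i]);
    return true;
}

// On the device the parsed bytes would be handed to the Arduino core, roughly:
//   uint8_t bssid[6];
//   if (parseBssid("AA:BB:CC:DD:EE:FF", bssid)) {
//       WiFi.begin("IoT", "password", 0 /* any channel */, bssid);
//   }
```

The obvious downside is that the sketch is now tied to one AP's hardware address, so swapping that AP means reflashing or reconfiguring the board.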
Alternatively, if you REALLY want to force it, the sanest way is to use a uniquely named IoT SSID on each AP so there would be no other option for those clients to choose to latch on to (and you can leave the other SSIDs shared for normal clients). E.g. "IoT-1", "IoT-2", "IoT-3" on 3 separate APs. It may clutter up device screens more when you list available networks but it's just visual because, as far as airtime, they all beacon just as often if the names are the same or not anyways.
> From what people and the internet tell me you should set the band width on the 2.4 Ghz network that your boards use to 20 Mhz, not 40, not 60, and definitely not automatic.
This is spot on in that 20 MHz is the ideal channel width on 2.4, doubly so for just an ESP32. Some things to add: I'd say it should apply to any of your 2.4 GHz networks unless you live out in the sticks and really want to eke a few extra Mbps out at the fringe of your AP coverage (and even then the wider channel width is probably going to make your SNR worse even in the sticks). Also, I don't believe the ESP32 supports 60 MHz in the 2.4 GHz space at all (it's certainly not an option in the Wi-Fi standard).
I'll also tack on that the ESP32-C6 can be worth springing for if these kinds of things are a particular concern, as it supports Wi-Fi 6, which has a few enhancements for connecting lots of IoT devices without so much noise.
Yes this is a skill issue.
But the Arduino ecosystem is full of superstition and bizarre hacks. It's cargo cult electronics. They will do anything to avoid reading documentation or writing robust code.
Even the power saving recommendation here reeks of it. There is no effort to understand it. Someone on an Arduino forum recommends it, others start to echo it to try to appear like they know what they're talking about, it becomes lore in the Arduino world and you out yourself as a clueless newbie if you don't know to do esp_wifi_set_ps(WIFI_PS_NONE) without questioning anything because that's just the way it's done. It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
I used to say things like that, but come on: Arduino is targeted at hobbyists. More specifically, it's targeted at hobbyists who don't want to spend too much time learning hardware. If they did, they would be using a "bare" microcontroller better suited for their needs and costing one tenth the price. But they're not interested in microcontroller programming, they just want to get their art project done.
It's the same thing that happened with computers. Billions of people use them, but most just want to access Facebook or use MS Word, not learn OS internals. It's a different world from where we used to be 30-40 years ago, and that's fine. We design simpler, more intuitive products for them.
If a product meant for that group can't be used effectively by the target audience, I think the fault is with the designer, not with the user.
> If they did, they would be using a "bare" microcontroller better suited for their needs and costing one tenth the price.
Where do you get something like an ESP that's one tenth the price? ESPs are cheap and you can run Arduino, ESP-IDF directly, or fringe environments (I had some ESP8266 running NodeMCU because Lua made more sense to me than Arduino).
You can run Arduino code on anything, since it's mostly just a bit of syntactic sugar around C. But I'm sure you know what I mean.
My point is that people who are attracted to Arduino are, by and large, not the kind of people who want to geek out about the inner workings of the MCU, and there's nothing wrong with that.
There's always a few of us. ;-)
I'm pretty familiar with the microprocessor architecture of the 8-bit era that I grew up in, and have done a fair amount of hardware hacking. As things have gotten more complex, I've let some things slide, such as the complexity of pipelined architectures.
Arduino is not even syntactic sugar any more. All it retains of its origins, that I'm aware of, is the weird setup() and loop() schtick. And you have limited control over what happens before your code starts. But with most Arduino compatible boards, you have full access to the vendor supplied libraries, and can go as deep as you want. These days my preferred platform at work is Teensy 4, and at home, the wireless enabled boards. I think Paul Stoffregen is some kind of 100x engineer.
But life is short. Over my 61 years, I've carefully rationed the brain cells that I devote to innards of technologies that will soon be obsolete. I read the Turbo Pascal manuals cover to cover, and The Art Of Electronics, but I never cracked Inside Macintosh. I've decided that I will simply not learn anything about any OS that is not Linux, and superficially at that.
I program desktop computers in high level languages, despite total abstraction of the innards.
I think the relative portability of Arduino code has been a huge boon for hobbyists because it encourages the formation of a community of people who can share code and knowledge, even if they're not all using the same processors, and despite sometimes needing to tweak code when porting it from one platform to another. This was also the case with early FORTRAN. Portability across processors revolutionized scientific computing.
The problem isn't with the artist doing a one-off project involving a microcontroller. It's the Arduino "experts" who write blogs, create videos, and dominate forums with their accumulated nonsense. They posit themselves as authorities in the space, newbies adopt and echo whatever rubbish they make up, and the cycle continues. They get very defensive if you try to correct them, even linking directly to documentation supporting it.
If you're going to write a blog about how the ESP32 doesn't connect to the strongest AP so you need to pin it to a specific BSSID in your router settings... Maybe you shouldn't be writing that blog if you haven't taken at least a moment to check the documentation and see that the behaviour you want is already an option that can be selected by changing literally one line in your ESP32's WiFi config. Instead this pseudoscience proliferates.
I know what you mean haha.
Instead of spending x2 the initial effort to fix the root cause, you spend x1 the initial effort to implement jank and then spend x10 the effort down the line maintaining the jank.
Who wants to deal with writing that logic for a hobby project? That doesn't sound fun at all.
Sounds like a "good enough" shitty solution to me, which is kind of the whole point of DIY.
Deal with what? I would argue that if you're going to the effort of writing a blog post on the topic then you should at least go to the effort of skimming the docs to make sure there isn't already a solution for the common problem you're experiencing.
It's literally one word to change in his WiFi config to get the behaviour he wants. It's already implemented. Who can't "deal" with that?
Personally, I don't use multiple APs with overlapping SSIDs, but if I did then I can see how it would be easier to deal with the logic from the AP management side rather than the client. It's also nice to not have to re-connect IoT things if/when you add or change your APs.
I'm not sure we're understanding each other so just to be clear, my suggestion is to change from this (pseudo) C:
to
It's that simple.
I think I understand you. That functionality doesn't exist in ESP32 Arduino tool chain without more work/more code. Their hobby level perspective is valuable to other hobby level engineers who want a solution.
> It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
Seems safe, but it probably depends on the clock being accurate, so it can wake up on time for the next beacon, and the clock frequency is likely sensitive to temperature and therefore power usage.
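As a rough worked example of the clock concern: the timing error built up during one sleep period is just the sleep time multiplied by the clock tolerance. Every number used here is an assumption for illustration, not a measured ESP32 figure: a beacon interval of 102.4 ms (the common 100 TU default), a hypothetical 500 ppm clock error, and a guard window of a few milliseconds.

```cpp
// Worst-case timing error accumulated while sleeping, in microseconds.
// sleepUs: time between wake-ups; ppm: clock tolerance in parts per million.
double driftUs(double sleepUs, double ppm) {
    return sleepUs * ppm / 1e6;
}

// True if the accumulated drift still fits inside the early-wake guard window.
bool wakesInTime(double sleepUs, double ppm, double guardUs) {
    return driftUs(sleepUs, ppm) <= guardUs;
}

// Example with the assumed numbers:
//   driftUs(102400.0, 500.0) is about 51 microseconds per beacon interval,
//   so wakesInTime(102400.0, 500.0, 5000.0) holds with a 5 ms guard window.
```

Under those assumptions the drift per interval is tens of microseconds, so the clock would need to be far worse than this before it missed a beacon with a millisecond-scale guard window.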
If you're plugged into a wall wart, chances are the power savings aren't going to be too much; if it helps reliability (which should be easy to confirm), then it's likely worth paying a cent or two more a month. It's different if you're running from battery, though.
To be fair, the API people typically use in hobbyist contexts is literally a single call to 'WiFi.begin(ssid, password)'. There's not exactly any obvious room for error here, and any details which may or may not have been implemented incorrectly are so deep inside abstraction layers as to be inaccessible. There's little apparent room for making the code more robust (other than "workarounds" like application level health checks + reboot on error), because everything is supposed to have been taken care of by the abstraction.
If I can disable PM and then my ESP stops disconnecting from WiFi, I'm happy. There's not much more I can do without re-implementing what 'WiFi.begin()' does myself, and I usually have better things to do with my time.
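For what the "health check + reboot" workaround mentioned above can look like, here is a minimal sketch. `LinkWatchdog` is a hypothetical helper invented for this example; it only does the bookkeeping and leaves the actual restart to the caller. The Arduino calls in the trailing comment (`WiFi.status()`, `millis()`, `ESP.restart()`) and the 60 second timeout are assumptions about how it would be wired up.

```cpp
#include <cstdint>

// Tracks how long the link has been down and reports when a restart is due.
// It does not touch any hardware itself.
class LinkWatchdog {
public:
    explicit LinkWatchdog(uint32_t maxDownMs) : maxDownMs_(maxDownMs) {}

    // Call periodically with the current link state and a millisecond tick.
    // Returns true once the link has been down for at least maxDownMs.
    bool update(bool connected, uint32_t nowMs) {
        if (connected) {
            down_ = false;
            return false;
        }
        if (!down_) {  // first sample of this outage
            down_ = true;
            downSinceMs_ = nowMs;
        }
        // Unsigned subtraction stays correct when the tick counter wraps.
        return static_cast<uint32_t>(nowMs - downSinceMs_) >= maxDownMs_;
    }

private:
    uint32_t maxDownMs_;
    uint32_t downSinceMs_ = 0;
    bool down_ = false;
};

// In an Arduino sketch this might be used as:
//   LinkWatchdog wd(60000);  // restart after 60 s without WiFi
//   void loop() {
//       if (wd.update(WiFi.status() == WL_CONNECTED, millis())) ESP.restart();
//   }
```

This hides the symptom rather than explaining it, which is exactly the trade-off being argued about in this thread.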
> It disables the radio in between AP beacons, so unless there's a bug in the implementation it should have no noticeable impact to a quiet WiFi station other than saving a lot of power.
A) this increases ripple voltage which eventually impacts RX noise floor. As long as you have enough headroom at the input to your regulator power saving is great, but eventually having a more consistent load becomes the limiting factor for many devices.
B) drastically increases typical latency - not an issue for all applications, but the ESP-IDF network stack has a Nagle implementation that can't always be cleanly disabled and tends to write each little bit of the next layer to the TCP socket.
A) The timing for this is deliberately set to be very conservative in terms of the wakeup window (at the cost of higher power), so the radio is probably powered up for a good 5ms before the beacon arrives. I don't know if you could unintentionally design a 3V3 supply so poor that it takes in the order of milliseconds to adjust to an output current of about 30mA -> 80mA.
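To put rough numbers on the load swing being discussed, the average current of a supply that alternates between two levels is a simple duty-cycle calculation. The 30 mA and 80 mA levels and the 5 ms awake time are the ballpark figures from the comment above; the 102.4 ms beacon interval is an assumed common default, and treating 5 ms as the whole awake time per beacon understates it.

```cpp
// Average current for a load that alternates between an idle and an awake level.
// awakeMs is the time spent at awakeMa within each period of periodMs.
double averageCurrentMa(double idleMa, double awakeMa, double awakeMs, double periodMs) {
    double duty = awakeMs / periodMs;  // fraction of time spent awake
    return idleMa + (awakeMa - idleMa) * duty;
}

// Example with the assumed numbers:
//   averageCurrentMa(30.0, 80.0, 5.0, 102.4) is roughly 32.4 mA,
//   versus a constant 80 mA with the radio held on.
```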
B) Yes, this is a fair point, and why I was careful to specify a "quiet" station above. If actively transmitting then there is likely a benefit to disabling power saving, but unlike Arduino bros I will admit at this point that I don't understand the WiFi spec well enough to comment further with any confidence.
Not to mention the TX power ramps up in microseconds to quite a lot more than 80 mA.
If your supply can't handle the modem sleep mode it definitely isn't going to transmit reliably either.
If it's mainly (largely) static IoT devices connecting to an SSID, why not make the SSID hidden so it doesn't fill device screens (controlled by humans who probably don't want the IoT network)? Most commercial IoT products I've used have an option to type a hidden SSID, and your own C++ code definitely can do that.
Good highlight. In commercial deployments absolutely, I'd even recommend going as far as hiding any SSID you don't explicitly want people to be manually clicking on with their own devices (i.e. only guest and/or byod should be visible). Not because of security (as the conversation often tangents into) but because the support tickets for "I can't connect to <wrong SSID>" just go away and it clutters screens less as you say.
For home it has its ups and downs depending on how much the user cares about understanding Wi-Fi as part of their project vs just doing the minimum to make their project work. If you already have a Wi-Fi scanner app on one of your devices that you're comfortable reading (or you're willing to spend a little time learning how to use one to troubleshoot an issue or pick the best covering AP at a location), then you're probably the right crowd to hide the SSID at home as well.
>Good highlight. In commercial deployments absolutely, I'd even recommend going as far as hiding any SSID you don't explicitly want people to be manually clicking on with their own devices (i.e. only guest and/or byod should be visible). Not because of security (as the conversation often tangents into) but because the support tickets for "I can't connect to <wrong SSID>" just go away and it clutters screens less as you say.
That's bad for the anonymity of all your devices, though. Having a "hidden" network saved and on auto-connect means the device will be constantly broadcasting probe packets for those hidden networks.
Do you know of something that could get my Android tablet to switch between two APs in my house? When I change locations in the house, it will never change to the stronger one if it has even the weakest signal from the one it was connected to. I can't find any Android app that tells the device to "change to a new AP if it is stronger than the current one".
Android tends to hang on to APs even if they're at a completely unusable signal level with no connection. Doesn't matter if fast roaming is enabled, doesn't matter if bss transition is enabled.
The only solution I've found is enabling the minimum RSSI feature on my APs, this forcefully disconnects any clients with a low signal.
Devices usually take 5-10 seconds to reconnect to the new AP after this happens, but will also sometimes fail outright for a minute or more while Android insists on trying to connect to the old AP with a worse signal and keeps getting kicked off, before it finally gives up and connects to the stronger signal.
Raising the minimum data rate can also help, as long as none of your devices are really old and need the lower rates.
I haven't had an Android device for a number of years now but my guess is "no" from the device side unless you're going to go make your own custom firmware with Wi-Fi settings tweaked or newer Wi-Fi drivers which handle roaming better.
If your router and device support 802.11 k/v/r it can help, most likely the AP does unless it's particularly ancient and Android has supported all of these since 8.0. Ironically, lowering the AP's power can help too (it'll cause the client to see the farther AP as weaker sooner) but obviously that lowers overall coverage at the same time... so it's a tradeoff unless you're willing to deploy more APs to make up for the lower power. Make sure you're restricting advertised rates from your AP to not include the slowest/oldest standards as well. That'll make the client less able to hold on as the connection gets weaker and weaker. Like the power recommendation, this means at the fringes of your coverage you'll get "no connection" rather than "a bad connection" but it'll make the overall airspace healthier.
If nothing else works and you can't fix the client because you lack full control (or a reasonable way to update it), then you can still try falling back to steering clients via hints from the AP (if it supports it). Just keep in mind that it may not work and may also cause problems with other devices which were working fine. Or you may get lucky; it's worth a shot if you've tried everything else first.
As a note: My experience comes from designing and fixing enterprise wireless deployments so if there are any tricks specific to low AP count environments I would be somewhat ignorant of them. The same could be said if there are more easily accessible wireless controls in Android than I am familiar with as, if there are, I still couldn't use them as different guests walk in each day and they all need to work.
Yes. The proven method is to A/restrict data rates to just high ones and B/lower the transmit power of your AP so that your devices can no longer maintain high enough data rates after a given point and are forced to disconnect. I am almost certain the run of the mill stock firmware does not offer you this option. Look into installing OpenWrt.
Apart from that, OpenWrt now allows you to install and use usteer which offers a plethora of (802.11 standards-based) tools to manage client roaming from the AP side, including the APs exchanging information between themselves.
I use Fresh Tomato as the firmware on my router, but I don't see anything in the documentation about "usteer". I'll search for a different term. I'm sure they have something, because they have a lot of active development.
> dedicated for routers with Broadcom chipset
You see, the first thing I look for when I want to buy a new AP is for it to NOT have a Broadcom chipset (https://openwrt.org/meta/infobox/broadcom_wifi)
Search for Wifi roaming. To support Wifi roaming you need a wifi mesh though.
Mesh is not required at all to use 802.11r
>It seems that when an ESP32 connects it goes straight for the first access point it sees.
No! You as programmer control that. You can configure to connect to any AP you want.
My code does a scan and saves the closest AP. If it can’t connect it does another scan and saves a new AP.
I think they're claiming that it won't roam between APs with identical SSIDs.
It’s up to you as the programmer! The library literally gives you the power to roam/connect to whatever you want whenever you want.
Yea, it's not super clear but that's indeed what I meant to say :)
Then change your code.
Every AP has a bssid (MAC address) that you can use to connect to specific AP.
It’s up to the code to figure out which one to connect to. The libraries have all the options.
When you do a scan you get bssid of the AP and strength of each signal. You can make a determination of when to rescan and reconnect.
I do actually see the problem that the ESP32 doesn't automatically reconnect to the stronger AP. I think this gets triggered when the stronger AP is briefly unavailable (reboot or radar scan or whatever) and it switches to the weaker AP, but then once the stronger AP is back it stays connected to the weak AP. (This is with multiple APs in a mesh configuration)
> ESP32 doesn't automatically reconnect to the stronger AP
How would it know to reconnect to stronger AP?
You can order it to do a background scan and reconnect to stronger AP if you want, but you have to figure out how often to do this and how to interleave other data during the scan.
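One way to frame the "when to reconnect" decision is a small hysteresis rule: leave the current AP alone while its signal is acceptable, and only switch when a candidate is better by a clear margin, so the device does not flap between two similar APs. `shouldRoam` and its thresholds are hypothetical, chosen only for illustration; on an Arduino core the inputs would presumably come from `WiFi.RSSI()` and a background scan.

```cpp
// Decide whether switching APs is worthwhile. RSSI values are in dBm
// (more negative = weaker). marginDb and goodEnoughRssi are tuning knobs.
bool shouldRoam(int currentRssi, int candidateRssi, int marginDb, int goodEnoughRssi) {
    if (currentRssi >= goodEnoughRssi) return false;  // current link is fine, stay put
    return candidateRssi - currentRssi >= marginDb;   // require a clear improvement
}

// Example with assumed thresholds (10 dB margin, -65 dBm counted as good enough):
//   shouldRoam(-80, -60, 10, -65)  -> roam, the candidate is 20 dB better
//   shouldRoam(-60, -40, 10, -65)  -> stay, the current link is already fine
```

How often to rescan is still a separate trade-off, since each scan takes the radio away from normal traffic for a while.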
Funny how tweaking modern tech can still feel like adjusting rabbit ears on grandma's old TV - I swear my ESP32 runs better when it's cloudy outside or the microwave’s off.
I wonder if the attempts to replace parts of the network stack with FOSS ( https://news.ycombinator.com/item?id=38550026 ) would help with this? At the very least it would let you get more visibility into what's going on when things don't work, and in the best case it'd let you replace the firmware that's creating problems with firmware that doesn't have those problems. (I think.) Of course this depends on the bits you replace being the parts in play.
> Set your APs to use 20 Mhz wide channels
This is not a superstition for 2.4GHz. Other devices sharing 2.4GHz tend to malfunction when 40MHz channels are in use. This is a well known phenomenon. I have been able to reproduce it in my household by enabling 40MHz channel support for 2.4GHz on my Ruckus APs while listening to audio from a Bluetooth speaker. The Bluetooth speaker will start having discontinuities in the audio stream that disappear when 40MHz support in the APs is disabled.
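The arithmetic behind this is easy to check. 2.4 GHz channel centres sit 5 MHz apart (channel 1 at 2412 MHz), and the usual non-overlapping 20 MHz plan is channels 1, 6 and 11. The sketch below counts how many of those three a transmission collides with; the helper names are made up for this example, and the nominal 20/40 MHz widths ignore real spectral masks.

```cpp
// Centre frequency in MHz of 2.4 GHz Wi-Fi channel ch (valid for 1..13).
int centreMhz(int ch) { return 2407 + 5 * ch; }

// True if two bands [centre - width/2, centre + width/2] overlap.
bool overlaps(int centreA, int widthA, int centreB, int widthB) {
    int loA = centreA - widthA / 2, hiA = centreA + widthA / 2;
    int loB = centreB - widthB / 2, hiB = centreB + widthB / 2;
    return loA < hiB && loB < hiA;
}

// How many of the usual non-overlapping 20 MHz channels (1, 6, 11) a
// transmission with the given centre and width collides with.
int channelsHit(int centre, int widthMhz) {
    int hit = 0;
    const int usual[] = {1, 6, 11};
    for (int ch : usual)
        if (overlaps(centre, widthMhz, centreMhz(ch), 20)) ++hit;
    return hit;
}

// channelsHit(centreMhz(1), 20) == 1: a 20 MHz AP on channel 1 only touches channel 1.
// channelsHit(2422, 40) == 2: a 40 MHz AP bonding channels 1 and 5 (centre 2422 MHz)
// lands on both channel 1 and channel 6.
```

So a single 40 MHz network takes out two of the three usable slots, which also leaves less clear spectrum for everything else in the band, Bluetooth included.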
2.4Ghz WiFi networks should use 20Mhz bands generally. No idea about specifically with ESP32s, but this is good guidance for not congesting the 2.4Ghz space.
I can think of only a single time where 40 megahertz on 2.4 gigahertz was the right choice for a product (designed for outdoor use, battery powered).
File download time decreased by about 40% on Android but not at all on iOS - turned out at the time they only supported 40MHz+ on 5.8GHz.
Other things using the 2.4GHz spectrum might have malfunctioned because of that and the users would have had no recourse. Even if you can, you should not do this unless there are no neighbors within range and you can ensure that you will never put anything else on 2.4GHz near it.
Also, I do not believe iOS ever added support for 40MHz channels on 2.4GHz. That feature never should have been added to WiFi. It causes headaches practically everywhere it is used.
I second that. In my experience forcing a 40MHz channel on a 2.4g network leads to abysmal performance (never tried it with ESP32, however I can imagine it would be even worse...).
I've had pretty good luck with the ESP line and WiFi, even have 2 of the 32s that have the port for external antenna working in sheds about 150' from the nearest AP.
I've always turned off power saving and given them hard coded IP settings (no DHCP). The only time I saw anything wonky is a short-lived experiment with the 32's deep sleep not reliably connecting on wakeup.
Mostly ESPHome now, two of the basic Ubiquiti pucks from a few years back, for reference.
This isn’t my area of expertise but I’m surprised there’s no mention of errata. Is that just not a factor at this point because the firmware is assumed to be mature enough?
One of the culprits I encountered is that some of the IoT WiFi chips cannot do proper authentication if the AP is using WPA2/WPA3 mixed mode (very common for WiFi6 to increase security with modern devices while keeping backwards-compatibility). The flawed chips can initially connect successfully but disconnect after a few days.
Solved it by creating a dedicated IoT SSID advertising just WPA2 w/ AES and only on 2.4GHz w/ 20MHz (you shouldn't use 40MHz on 2.4GHz anyway for god's sake).
Here're some useful tips to tune WiFi for IoT https://www.wiisfi.com/#iotconnect
Thanks, I've written that one down in my notes as well. It definitely fits the superstitions because it sounds like it should work well.
I did a little dabbling in code for the ESP32 so I can't speak authoritatively at all, but I found that if your project does not require Wi-Fi, it was kind of annoying that the ESP32 would rob me of background cycles that, for example, the Teensy did not.
Maybe there was a way to turn off the Wi-Fi stack. But I suppose the general point still stands — that there are plenty of other small devices without Wi-Fi if that is not a requirement.
I note there is also an open source WiFi firmware for ESP32:
https://esp32-open-mac.be/
I created some sensor boards based on the ESP8266, they worked well for a while but lately they've been getting flaky. I'm a mediocre hardware engineer at best, so if anyone has any tips on how to make my board more stable, they would be appreciated.
Here's the board, it's a bit of a generic light/motion/presence sensor with an IR LED:
https://gitlab.com/stavros/sensor-board
Do you know which aspect is getting flaky? Ie. Is the wifi dropping out temporarily, or permanently, or the whole MCU locking up, or the 3.3V browning out, or…?
I'm not entirely sure, I can't debug because I don't have a serial interface to it. The board just stops responding to MQTT commands. I do see a lot of "connecting to MQTT" debug messages over the network, so I assume it does have WiFi. I should try pinging and see if it disconnects.
There’s your first lesson: Always have a debug header with serial pins on your board. ;) It should be easy enough to solder some wires onto the ESP32’s serial port… although wait, how are you programming it?
It’s also good to have a couple of status LEDs and a power indicator on any custom board. Then you can add blink codes to indicate errors etc.
I programmed it once and then flash OTA. I couldn't add LEDs because they messed with the photodiode's readings.
Use osc instead
What's that?
@supakeen found a spelling mistake at the end of the first para
> seriously, open up a device and changes are relatively large that you’ll find one
s/changes/chances
Thanks, fixed!
For people who are just trying to get a sensor to work, or control some lights, most esp32 frameworks surface maximum functionality, but minimum system state and next to no detailed debugging capabilities.
Combine that with very few “knobs” to turn on the esp32 framework, a router/wifi AP that is likely hostile to its owners, and well, superstition is pretty much all that is left.
It’s a demon haunted world.
Take note, those who dream of LLMs operating everything, including each other: “prompt engineering” is just a fancy euphemism for grimoire, and when it’s all massive, inscrutable matrices deciding everything, well, you too might just feel it’s not a bad idea to build a shrine.
>It seems that when an ESP32 connects it goes straight for the first access point it sees. No matter if that access point is not the one you’ve taped it to. This can lead to bad connectivity, especially since I’ve not really observed ESP32’s moving around to other access points.
This is not a characteristic of "ESP32", this is how the software running on the ESP32 is programmed to work, whatever specific program that is. And it is not my experience at all with "ESP32s". In my ESP-IDF based program, the ESP32 device connects to access points exactly how I tell it to connect with the software I wrote that deals with wifi connections. YMMV.
I encountered this a while back and it led me to dig into the ESP-IDF documentation to try to understand the behavior of a device I did not write the code for. Yes, it's software, but it's a footgun from the manufacturer. This in particular I find to violate the principle of least astonishment:
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/...
> It is a possible situation that there are multiple APs that match the target AP info, e.g., two APs with the SSID of "ap" are scanned. In this case, if the scan is WIFI_FAST_SCAN, then only the first scanned "ap" will be found.
The default if esp_wifi_set_config() is not called or, as I see in almost all the sample code that comes up with a quick web search, it is left untouched in the wifi_config_t struct, appears to be WIFI_FAST_SCAN. When you're looking directly at the relevant manual section during a HN discussion of ESP WiFi behavior, it may be obvious, but for a developer focused on the main product functionality who just copies and pastes an example and moves on when it appears to work, I'm not surprised if this always-incorrect-by-default behavior makes it into the vast majority of shipped ESP-based products.
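A small host-side model makes the difference concrete. It is not ESP-IDF code: `ApSeen`, `pickFirstMatch` and `pickStrongest` are hypothetical stand-ins for what the quoted documentation describes, namely a fast scan that stops at the first AP with a matching SSID versus a full scan that considers every match. In ESP-IDF itself the relevant knobs are, as far as I know, the `scan_method` and `sort_method` fields of the station config (`WIFI_ALL_CHANNEL_SCAN` and `WIFI_CONNECT_AP_BY_SIGNAL`).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one scan hit; not an ESP-IDF type.
struct ApSeen {
    std::string ssid;
    int channel;
    int rssi;  // dBm, more negative = weaker
};

// Model of a "stop at the first match" scan: APs are visited in the order
// they were seen, and the first one with the right SSID wins.
const ApSeen* pickFirstMatch(const std::vector<ApSeen>& seen, const std::string& ssid) {
    for (const ApSeen& ap : seen)
        if (ap.ssid == ssid) return &ap;
    return nullptr;
}

// Model of "scan everything, then sort by signal": the strongest match wins.
const ApSeen* pickStrongest(const std::vector<ApSeen>& seen, const std::string& ssid) {
    const ApSeen* best = nullptr;
    for (const ApSeen& ap : seen)
        if (ap.ssid == ssid && (best == nullptr || ap.rssi > best->rssi)) best = &ap;
    return best;
}
```

With a weak "ap" heard first on channel 1 and a strong one on channel 11, the first function returns the weak AP and the second returns the strong one, which matches the "connects to the furthest AP" symptom people report.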
And there's also
https://docs.espressif.com/projects/esp-idf/en/latest/esp32/...
I could have sworn there used to be a "sort by SSID" default, as in it would do the full scan and then connect to the AP with the alphabetically earlier hardware address. In any case, the symptom I was plagued with was that this particular device would consistently connect to the furthest-away access point rather than the one in the same room, resulting in unusable dropouts.
I'm sure some of these newly discovered undocumented commands could help.
https://www.tarlogic.com/news/backdoor-esp32-chip-infect-ot-...
They wouldn't help at all here. Bluetooth is not even mentioned and there's no HCI so they're not even accessible, and he clearly controls the device anyway so can much more simply modify the source and reflash.