Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Rewrite sntp in Lua with only a little C #2819

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

nwf
Copy link
Member

@nwf nwf commented Jul 5, 2019

  • This PR is for the dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well (if not, please describe why).
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/*.

This PR contains a new sntp module with almost all the state machine done in Lua and only the packet processing and fiddly large-integer math done in C. It is intended to supplant #2700 and the in-C in-progress rewrite I mentioned therein.

At present, it is exceptionally naive about time itself, even if its state machine endeavors to be more robustly implemented; @jmattsson's existing C displays a much more thorough understanding of time keeping than I possess. It seems likely that we will need, and I am not opposed to adding, additional userdata structures for persistent state in the sntppkt C half.

@jmattsson
Copy link
Member

I'll try to have a look late this week / next week!

@TerryE
Copy link
Collaborator

TerryE commented Jul 9, 2019

Nathaniel, If you don't mind, I'll give a general review rather than one anchored to PR lines, Though I will add a few specific anchored comments.

  • I find the overall concept of using hybrid modules an extremely interesting one and one that is really built around utilising LFS. This allows us to keep the C part minimalist and really private to the public Lua module which keeps the bulk of the login in a form that is accessible and extensible to the application programmer. With LFS this can incur to RAM penalty and also allow the developer to move decisions about whether SNTP is needed to LFS image load. I think that this is well worth exploring.

  • Is it worth making the C module sntppkt truly private (or not easily accessible from general Lua) because if it is only used by the "trusted" SNTP module then we can slim down on some of the validation and error checking. One way of doing this would not to publicly document this C helper and also give it a non-compliant Lua variable name, e.g. sntp-helper,c. The SNTP constructor could still look up this up using the ROM['sntp-helper,c'] access path and stash this in a shared upval, and also throw an explicit error "You must include the SNTP_HELPER module" in your build to use SNTP" if this is nil.

  • BTW, I strongly recommend recording all upvals used in each Lua function in its declaration e.g. with -- upvals: x, y, z as I do in my code, especially when said upvals are used RW within the routine and can therefore create side-effects. You can find these by mining the -l listing.

  • Is the SNTP object intended to be a singleton or not? (I see that it is currently written as mutli-instance -- that is the application can create multiple SNTP servers). The reason that I ask this is that if it makes more sense to make it a singleton then the locals used as upvals / internal functions can be hoisted one level rather than including them inside new().

lua_modules/sntp/sntp.lua Outdated Show resolved Hide resolved
lua_modules/sntp/sntp.lua Outdated Show resolved Hide resolved
app/modules/sntppkt.c Outdated Show resolved Hide resolved
@nwf
Copy link
Member Author

nwf commented Jul 10, 2019

Nathaniel, If you don't mind, I'll give a general review rather than one anchored to PR lines, Though I will add a few specific anchored comments.

Any and all commentary is more than welcome. :)

  • I find the overall concept of using hybrid modules an extremely interesting one and one that is really built around utilising LFS. This allows us to keep the C part minimalist and really private to the public Lua module which keeps the bulk of the login in a form that is accessible and extensible to the application programmer. With LFS this can incur to RAM penalty and also allow the developer to move decisions about whether SNTP is needed to LFS image load. I think that this is well worth exploring.

I confess the intent was not so much to take advantage of LFS as it was to get rid of some hairy C, but yes, LFS certainly makes this kind of design much more attractive. LFS really does change the programming landscape for the project.

  • Is it worth making the C module sntppkt truly private (or not easily accessible from general Lua) because if it is only used by the "trusted" SNTP module then we can slim down on some of the validation and error checking. One way of doing this would not to publicly document this C helper and also give it a non-compliant Lua variable name, e.g. sntp-helper,c. The SNTP constructor could still look up this up using the ROM['sntp-helper,c'] access path and stash this in a shared upval, and also throw an explicit error "You must include the SNTP_HELPER module" in your build to use SNTP" if this is nil.

Oh, that's an intriguing idea. The sntppkt module is, at the moment, stateless, so there's presently no harm in exposing it, but I'm not planning on documenting it (heavily). I'm glad to know that private-ish/name-mangled is possible; that might well be the right answer here.

  • BTW, I strongly recommend recording all upvals used in each Lua function in its declaration e.g. with -- upvals: x, y, z as I do in my code, especially when said upvals are used RW within the routine and can therefore create side-effects. You can find these by mining the -l listing.

Good plan; will do.

  • Is the SNTP object intended to be a singleton or not? (I see that it is currently written as mutli-instance -- that is the application can create multiple SNTP servers). The reason that I ask this is that if it makes more sense to make it a singleton then the locals used as upvals / internal functions can be hoisted one level rather than including them inside new().

I think it is slightly simpler to have the application construct a new one and discard the old one if servers change, so I had not intended it to be a singleton. That said, maybe there's not that much gain in simplicity and nobody's ever going to want more than one sntp client instance anyway. It can go back to being a singleton like the existing one if there's desire.

@marcelstoer marcelstoer reopened this Jul 26, 2019
@marcelstoer marcelstoer mentioned this pull request Jul 26, 2019
4 tasks
nwf added a commit to nwf/nodemcu-firmware that referenced this pull request Jul 27, 2019
This is a generalization of `wifi.sta`'s and `wifi.ap`'s `getip` and
`getmac` calls.  I don't propose to deprecate those, but perhaps we
should, in the documentation, point users at this function instead.

The direct motivation is to permit continued use of DHCP-provided NTP
servers in a future where
nodemcu#2819 has landed, now
that nodemcu#2709 is in the
tree.  But rather than exposing just that information, a more general
interface seems useful.
nwf added a commit to nwf/nodemcu-firmware that referenced this pull request Jul 27, 2019
This is a generalization of `wifi.sta`'s and `wifi.ap`'s `getip` and
`getmac` calls.  I don't propose to deprecate those, but perhaps we
should, in the documentation, point users at this function instead.

The direct motivation is to permit continued use of DHCP-provided NTP
servers in a future where
nodemcu#2819 has landed, now
that nodemcu#2709 is in the
tree.  But rather than exposing just that information, a more general
interface seems useful.
@HHHartmann
Copy link
Member

  • Is it worth making the C module sntppkt truly private (or not easily accessible from general Lua) because if it is only used by the "trusted" SNTP module then we can slim down on some of the validation and error checking. ...

If we want to allow the Lua part to be extensible by the application programmer, we should not slim down on error checking too much. Also there is no guarantee that the user packs the matching version of the Lua module. So this should be able fail gracefully with a meaningful error message and not just "not work".

@nwf
Copy link
Member Author

nwf commented Aug 16, 2019

The intent was to expose callbacks from the Lua layer much as is done now but to limit the direct extensibility of sntp.lua itself, again, just as we currently discourage modification of (or at least, require acknowlegement of being on one's own when modifying) sntp.c.

@nwf nwf mentioned this pull request Sep 12, 2019
4 tasks
marcelstoer pushed a commit that referenced this pull request Dec 9, 2019
* Remove app/include/netif/wlan_lwip_if.h

This file appears to be unused in our tree.

* New `net.if.info` call to show LwIP information

This is a generalization of `wifi.sta`'s and `wifi.ap`'s `getip` and
`getmac` calls.  I don't propose to deprecate those, but perhaps we
should, in the documentation, point users at this function instead.

The direct motivation is to permit continued use of DHCP-provided NTP
servers in a future where
#2819 has landed, now
that #2709 is in the
tree.  But rather than exposing just that information, a more general
interface seems useful.
vsky279 pushed a commit to vsky279/nodemcu-firmware that referenced this pull request Dec 27, 2019
* Remove app/include/netif/wlan_lwip_if.h

This file appears to be unused in our tree.

* New `net.if.info` call to show LwIP information

This is a generalization of `wifi.sta`'s and `wifi.ap`'s `getip` and
`getmac` calls.  I don't propose to deprecate those, but perhaps we
should, in the documentation, point users at this function instead.

The direct motivation is to permit continued use of DHCP-provided NTP
servers in a future where
nodemcu#2819 has landed, now
that nodemcu#2709 is in the
tree.  But rather than exposing just that information, a more general
interface seems useful.
@nwf nwf mentioned this pull request Feb 21, 2020
@nwf nwf changed the title DNM: WIP: Rewrite sntp in Lua with only a little C WIP: Rewrite sntp in Lua with only a little C Feb 23, 2020
@nwf
Copy link
Member Author

nwf commented Feb 23, 2020

I've spent a while more hacking on this and gotten it passing some basic functionality tests. It would be good if someone (esp. @jmattsson) could review it, but I think I continue to like the feel of "hard bits in C, easy bits in Lua" development rather than "self-contained, complete C modules".

@marcelstoer
Copy link
Member

I don't feel I can add anything meaningful to the discussion here so maybe I better keep my mouth shut but I'm still tempted...sorry.

I continue to like the feel of "hard bits in C, easy bits in Lua" development rather than "self-contained, complete C modules".

Is that the perspective of the firmware/module maintainer or that of the module user? As an end user having a self-contained C module available feels more convenient to me personally (unless one uses LFS maybe). However, I accept that there are different kinds of "end-users". Terry stated

...private to the public Lua module which keeps the bulk of the login in a form that is accessible and extensible to the application programmer

and that's certainly a valid point. How many would want/need to extend that module, though?

@nwf
Copy link
Member Author

nwf commented Feb 23, 2020

From the LFS-capable end-user perspective, the two are interchangeable. Without LFS, it's hard to imagine NodeMCU as anything but a toy, and so I am not particularly worried about the heap bloat from the "easy bits in Lua" code.

I'm sorry to be harsh, but the quality of existing C modules suggests that very few developers, and I exclude myself, are capable of producing production-quality code.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

@nwf, out of interest, if there any specific reason for us deciding not to use the SDK's built-in SNTP functions (SDK/include/sntp.h) and instead doing our own implementation?

BTW, the SNTP interface is documented in the non-OS SDK API reference and is (I believe) a backport of the ESP32 RTOS equivalent.

@nwf
Copy link
Member Author

nwf commented Jun 4, 2020

[Previous incorrect comment deleted]

It looks like I removed the LwIP sntp.c in c972d86 since we had the existing one, which was more aware of the native hardware RTC. And I've replicated that RTC awareness here. We could try adding Lua callbacks to the LwIP sntp?

Copy link
Collaborator

@TerryE TerryE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also my comment about why we are reimplementing built-in SDK functionality.

app/modules/sntppkt.c Show resolved Hide resolved
app/modules/sntppkt.c Outdated Show resolved Hide resolved
lua_modules/sntp/sntp.lua Outdated Show resolved Hide resolved
local self = {}
self.servers = serv

local _tmr -- contains the currently running timer, if any
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You clearly are following some convention so prefixing some -- but not all -- local names with "_". What is this convention?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not particularly meaningful any more; the code's been through a lot of refactoring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should now be that all new()-local upvals are _-prefixed identifiers.

-- Shut down the state machine
--
-- upvals: _tmr, _udp, _six, _sat, _res, _best
local function _stop()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The destructor in general the cleanup needed. Also list assignments are useful. This routine body could have been encoded

    _tmr = _tmr and _tmr:unregister()
    _udp = _udp and _udp:close()
    _six,_sat,_res, _best = nil
    for k,v in pairs(_kod) do _kod[k] = (v > 0) and (v - 1) or nil end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_udp:close() does not unregister callbacks. While I don't intentionally create knots through the registry, it felt defensive to do that in the destructor.

Copy link
Collaborator

@TerryE TerryE Jun 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_udp:close() does not unregister callbacks

No but _udp:__gc() does, and as _udp:close() returns nil this derefences the socket which then triggers the proper GC including clearing down the registry entries.

But there is really a separate net tidy up issue that closing a socket should automatically unref any CBs in it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but we won't get to _udp:__gc() if there are knots tied through the registry. Again, I don't mean for there to be, here, but it seemed generally like a belt&suspenders kind of approach.

And yes, as per #3062, I think net needs some tidying. If you think the contract is that :close() should rip all callbacks out, it would be good to a note to that effect that there. I have generally shied away from that particular option because the object outlives :close() and can, in principle, be pressed back into service. At least, the TCP objects can; UDP is a little harder to justify.

now = now or (rtctime and rtctime.get)
if now == nil then error "Need clock function" end

local self = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally why not local self = {servers = serv} ?

I more general point given that self is a module local, this is a singleton class when I am not sure this is the intent. Shouldn't self be local to new() so

local sntp = require'sntp'.open(a,b,c,d) works as expected

Also in general the object should be the first param so the ":" form of method invocation works as expected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self is local to new? The only module locals are MAX_SERVER_ATTEMPTS, SNTP_TIMEOUT, new, update_rtc, and go. Or am I missing something?

While I generally understand the : calling convention, I don't see its relevance here? Neither of the exported functions on the "SNTP object" created by new (i.e., sync() and stop()) take arguments, as all their associated state is in their upvals. The callbacks are held as upvals not as table members.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, yes self is as you say, but I got a bit lost in the depth of semantic nesting. For others, a call v:name(args) is syntactic sugar for v.name(v,args), except that this takes less VM instructions and v is evaluated only once.

Here is the size / upvalue analysis of the module (hope you find this useful).

main     <L0,0>     (15 ins, 5 const)     
function <L6,201>   (145 ins, 25 const) -- upvals:  SNTP_TIMEOUT, MAX_SERVER_ATTEMPTS
function <L45,63>   (50 ins, 9 const)   -- upvals:  _tmr, _six, _sat, _res, _best, _udp, _kod
function <L71,143>  (42 ins, 10 const)  -- upvals: _tmr, SNTP_TIMEOUT, _udp, doserver, now, _kod, fc, serv, _six, _self, nextServer,
                                                    _pbest, _res, _best
function <L72,76>   (15 ins, 4 const)   -- upvals: _udp, doserver
function <L82,137>  (142 ins, 21 const) -- upvals: ip, txts, now, _kod, fc, serv, _six, _self, nextServer, _pbest, _res, _best, _tmr
function <L146,157> (31 ins, 2 const)   -- upvals: _sat, MAX_SERVER_ATTEMPTS, fc, serv, _six, _self, nextServer, _udp, doserver, hail
function <L153,156> (16 ins, 3 const)   -- upvals: doserver, hail
function <L162,179> (45 ins, 4 const)   -- upvals: _six, serv, _res, _pbest, _best, _stop, sc, _self, fc, _sat, doserver
function <L184,191> (18 ins, 6 const)   -- upvals: _stop, _udp, _tmr, _six, nextServer
function <L193,197> (11 ins, 1 const)   -- upvals: _res, _best, _stop
function <L204,246> (58 ins, 16 const)             
function <L251,273> (26 ins, 6 const)   -- upvals: new, update_rtc
function <L252,252> (1 ins, 0 const) 
function <L254,266> (40 ins, 9 const)   -- upvals: update_rtc, freq, sc
function <L270,270> (6 ins, 2 const)    -- upvals: sntpobj

This is a pretty chunky piece of code that takes 10kB RAM to load if you are using LFS. That's a big enough chunk to think about how we could optimise this. Have a look at how I've done this for the FTP server.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that's supposed to be "10kB RAM if you're not using LFS"? If this is 10K even with LFS then I really am in trouble. I'll see what I can do to drive it down all the same. (I continue to vaguely wish that Lua had a const keyword.)

Again, I know what : does. I know it's syntactic sugar and I understand the VM instructions it expands into. What I don't understand is why you bring it up. Am I foo.bar(foo,...)-ing somewhere in here? Am I failing to provide an interface that looks like it should support that?

Copy link
Collaborator

@TerryE TerryE Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that's supposed to be "10kB RAM if you're not using LFS"?

Correct. The static RAM usage if loaded into LFS is zero, though executing code does create runtime variables. Lua constants are just that and are stored as TValues in the code space (and which is why I listed them above). Each instruction takes 4 bytes, each TValue takes 12 bytes in Lua51 and 8 bytes in Lua53. Proto structures and (debug) lineinfo vectors also take code space, though the NodeMCU lineinfo is typically 10-20× more dense than in standard Lua). String constants are a TValue pointing to a ROstrt entry.

Unlike PHP or Javascript, Lua has no concept of an array constant; arrays are constructed in RAM at runtime, though the opcodes which do this construction are instruction that can be in LFS. The only exception to this was introduced by eLua and built upon in NodeMCU; this is the ROTable construct, but these are declared statically in C code as const structures and mapped into Flash memory.

The reason that I refer to potential RAM footprint is in the case of non-LFS usecases. If we move to @HHHartmann's concept in #3128 of separate sys LFS and app LFS partitions with the sys LFS being non-optional and built into the firmware at make (or your variant), then this is a non-issue, but as I discuss in this issue this really limits Flash-only C/Lua hybrid module to using the Lua 5.3 engine. A separate discussion, I think.

@TerryE
Copy link
Collaborator

TerryE commented Jun 4, 2020

It looks like I removed the LwIP sntp.c in c972d86 since we had the existing one, which was more aware of the native hardware RTC. And I've replicated that RTC awareness here. We could try adding Lua callbacks to the LwIP sntp?

We might have a perfectly good reason, but it was still worth asking the Q, I think. I think that Johny's original SMTP module predated Espressif's addition of its own.

@nwf
Copy link
Member Author

nwf commented Jun 4, 2020

Thanks for useful feedback. Here's a revised commit.

marcelstoer pushed a commit that referenced this pull request Jun 9, 2020
* Remove app/include/netif/wlan_lwip_if.h

This file appears to be unused in our tree.

* New `net.if.info` call to show LwIP information

This is a generalization of `wifi.sta`'s and `wifi.ap`'s `getip` and
`getmac` calls.  I don't propose to deprecate those, but perhaps we
should, in the documentation, point users at this function instead.

The direct motivation is to permit continued use of DHCP-provided NTP
servers in a future where
#2819 has landed, now
that #2709 is in the
tree.  But rather than exposing just that information, a more general
interface seems useful.
@HHHartmann HHHartmann closed this Jan 2, 2021
@HHHartmann HHHartmann reopened this Jan 2, 2021
@HHHartmann HHHartmann closed this Jan 3, 2021
@HHHartmann
Copy link
Member

Sorry, trying to trigger a CI build

@HHHartmann HHHartmann reopened this Jan 3, 2021
@nwf
Copy link
Member Author

nwf commented Jan 3, 2021

CI builds are great, but could you please use a new PR that won't send me email? :)

@HHHartmann
Copy link
Member

Sure, sorry to annoy you, I was just on the way to give several PRs the chance to run against the new tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants