As is well known, nginx modules are statically linked.
Of course, you could also use dynamic modules, but they can only use the dynamic symbols exported by the nginx executable.
What if I need to use static variables and functions from the nginx executable in my code?
For the sake of extreme efficiency,
I need to inject the task directly into the global thread pool done queue,
which involves some static variables within nginx.
The traditional way is, of course, to patch nginx and recompile it.
But the shortcomings are obvious:
- you need to integrate lua-resty-ffi into your build, which is cumbersome
- you cannot use lua-resty-ffi in old releases of your product, because the binary should not be replaced
- each time lua-resty-ffi changes, you need to recompile it with your product
All in all, could we use a non-intrusive way instead, i.e. dynamic loading?
Resolve static variables
Let’s check what a static variable looks like in the ELF binary.
# nm /usr/local/openresty-debug/nginx/sbin/nginx
000000000027f890 b ngx_thread_pool_done
000000000027f8a0 b ngx_thread_pool_done_lock
000000000028c690 B ngx_cycle
00000000000841a0 t ngx_thread_pool_handler
000000000007b920 T ngx_calloc
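|Name|ELF Type|ELF Section|C semantic|
|---|---|---|---|
|ngx_thread_pool_done|b|.bss|static variable|
|ngx_thread_pool_done_lock|b|.bss|static variable|
|ngx_cycle|B|.bss|global variable|
|ngx_thread_pool_handler|t|.text|static function|
|ngx_calloc|T|.text|global function|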
C static variables are visible only inside the same compilation unit, i.e. source file. You have no way to access them from elsewhere in the C world. Without patching the source file, how could I use them in my dynamically loaded shared library?
In fact, all defined symbols in the same final binary output (executable or library) are referenced
with relative addresses. All undefined symbols are referenced via the GOT (global offset table),
whose entries are resolved by the dynamic linker at runtime
(function calls are a bit special: they use
PLT stubs to support lazy binding).
Let’s confirm it in assembly code.
objdump -D /usr/local/openresty-debug/nginx/sbin/nginx
You can see that it uses %rip-relative addressing to locate defined symbols, no matter whether they are global or not,
e.g. 27f8a0 <ngx_thread_pool_done_lock>. Here
27f8a0 is the symbol's offset within the binary image, which is
0x1fb83a away from the instruction pointer at the point of reference.
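To make the arithmetic concrete, here is a minimal sketch of how a %rip-relative operand is resolved on x86-64: the signed 32-bit displacement is added to the address of the next instruction. The function name is hypothetical, and the instruction address 0x84066 is not taken from the disassembly; it is simply derived as 0x27f8a0 - 0x1fb83a.

```c
#include <stdint.h>

/*
 * Effective address of a %rip-relative operand on x86-64:
 * displacement (signed 32-bit) + address of the *next* instruction.
 * Hypothetical helper, for illustration only.
 */
static uint64_t rip_relative_target(uint64_t next_insn_addr, int32_t disp)
{
    /* Sign-extend the displacement before adding it. */
    return next_insn_addr + (uint64_t)(int64_t)disp;
}
```

With the numbers above, `rip_relative_target(0x84066, 0x1fb83a)` yields 0x27f8a0, the link-time address of ngx_thread_pool_done_lock. Since both operands shift by the same load base at runtime, the displacement stays valid wherever the executable is mapped.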
So, if we use a global defined symbol in the nginx executable as an anchor, plus a fixed offset, we can calculate the absolute address of the static variable!
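The anchor trick can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper name is hypothetical, and the link-time addresses in the usage note are the ones from the nm output quoted earlier (ngx_cycle at 0x28c690, ngx_thread_pool_done at 0x27f890).

```c
#include <stdint.h>

/*
 * Given the runtime address of a global "anchor" symbol and the link-time
 * addresses (as printed by nm) of both the anchor and the target, recover
 * the runtime address of the target. Hypothetical helper name.
 */
static uintptr_t resolve_from_anchor(uintptr_t anchor_runtime_addr,
                                     uintptr_t anchor_link_addr,
                                     uintptr_t target_link_addr)
{
    /* The load base is whatever the loader added to every link-time address. */
    uintptr_t load_base = anchor_runtime_addr - anchor_link_addr;

    return load_base + target_link_addr;
}
```

In a real shared library, `anchor_runtime_addr` would be something like `(uintptr_t) &ngx_cycle`, which the dynamic linker resolves for us. For example, if ngx_cycle ends up at 0x1028c690, then `resolve_from_anchor(0x1028c690, 0x28c690, 0x27f890)` gives 0x1027f890 for ngx_thread_pool_done.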
However, we should be aware that the offsets are strictly bound to a specific nginx executable. If the executable changes, the offsets may become invalid.
This raises another question: how do we guarantee that the nginx executable at runtime is identical to the one used at compile time?
In fact, besides the offsets, the C type (e.g. structure) definitions and ABI/API compatibility also require that we match the nginx executable exactly. Nginx does not guarantee compatibility among different versions and builds, so even nginx dynamic modules need to take care of this.
The build id is what gcc uses to identify a specific build. When you change the source code or compile options, the build id changes.
By default, gcc puts the generated build id into the .note.gnu.build-id section.
binutils’ ld has supported the --build-id=… option since version 2.18 (released 2007). When used with a sha1 or md5 argument, it directs ld to insert an ELF section .note.gnu.build-id into the binary containing a hash of the normative parts of the output, that is, an identifier that uniquely identifies the output file.
Let’s check our nginx executable:
From "ELF 64-bit LSB shared object" in the file output, we know it’s a position-independent executable,
and the same output also reports the build id.
So, by retrieving the build id of the nginx executable at runtime and comparing it with the build id gathered at compile time, we can ensure our static variable references are correct and safe!
We can simply use an API provided by
libc6 to retrieve the build id.
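As a rough illustration of how this could look, the sketch below reads the build id of the running main program with glibc's dl_iterate_phdr(). Choosing dl_iterate_phdr() is an assumption on my side, not necessarily what lua-resty-ffi does, and the struct and function names are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <string.h>

#define BUILD_ID_MAX 64

struct build_id {               /* hypothetical container type */
    size_t  len;
    uint8_t bytes[BUILD_ID_MAX];
};

/* Scan the notes of one PT_NOTE segment for NT_GNU_BUILD_ID owned by "GNU".
 * Returns 1 and fills *out if found, 0 otherwise. */
static int find_build_id(const uint8_t *p, size_t size, struct build_id *out)
{
    while (size >= sizeof(ElfW(Nhdr))) {
        ElfW(Nhdr) nh;
        memcpy(&nh, p, sizeof(nh));

        /* Name and descriptor are each padded to a 4-byte boundary. */
        size_t namesz = ((size_t) nh.n_namesz + 3u) & ~(size_t) 3u;
        size_t descsz = ((size_t) nh.n_descsz + 3u) & ~(size_t) 3u;
        size_t total = sizeof(nh) + namesz + descsz;
        if (total > size) {
            break;              /* truncated or malformed note */
        }

        const uint8_t *name = p + sizeof(nh);
        if (nh.n_type == NT_GNU_BUILD_ID && nh.n_namesz == 4
            && memcmp(name, "GNU", 4) == 0 && nh.n_descsz <= BUILD_ID_MAX)
        {
            out->len = nh.n_descsz;
            memcpy(out->bytes, name + namesz, nh.n_descsz);
            return 1;
        }

        p += total;
        size -= total;
    }
    return 0;
}

/* dl_iterate_phdr() visits the main program first; inspect it and stop. */
static int phdr_cb(struct dl_phdr_info *info, size_t sz, void *data)
{
    (void) sz;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_NOTE) {
            continue;
        }
        const uint8_t *notes =
            (const uint8_t *) (uintptr_t) (info->dlpi_addr + ph->p_vaddr);
        if (find_build_id(notes, ph->p_memsz, data)) {
            break;
        }
    }
    return 1;                   /* non-zero stops the iteration */
}

/* Returns 1 if the main program carries a build id, 0 otherwise. */
static int main_program_build_id(struct build_id *out)
{
    out->len = 0;
    dl_iterate_phdr(phdr_cb, out);
    return out->len > 0;
}
```

When such code runs inside a library loaded by nginx, the "main program" is the nginx executable itself, so the bytes returned can be compared against the build id recorded at compile time.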
Build shared library
Gather offsets and build-id at compile time
In the build script, we use the
file command, among others, to gather the necessary information into a C header file.
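To give an idea of what such a generated header could contain and how it might be checked at runtime, here is a minimal sketch. All macro and function names are hypothetical, the addresses are the ones from the nm output quoted earlier, and the build id string is a placeholder rather than a real value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* --- what a generated header might look like (hypothetical names) --- */
#define FFI_NGX_CYCLE_ADDR                  0x28c690u  /* B ngx_cycle */
#define FFI_NGX_THREAD_POOL_DONE_ADDR       0x27f890u  /* b ngx_thread_pool_done */
#define FFI_NGX_THREAD_POOL_DONE_LOCK_ADDR  0x27f8a0u  /* b ngx_thread_pool_done_lock */
/* Placeholder only: the real value would be the hex build id of nginx. */
#define FFI_NGX_BUILD_ID_HEX  "0123456789abcdef0123456789abcdef01234567"

/* Compare a raw build id read at runtime with the lowercase hex string
 * recorded at compile time. Returns 1 on match, 0 otherwise. */
static int build_id_matches(const uint8_t *id, size_t len,
                            const char *expected_hex)
{
    static const char digits[] = "0123456789abcdef";

    if (strlen(expected_hex) != len * 2) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (expected_hex[2 * i] != digits[id[i] >> 4]
            || expected_hex[2 * i + 1] != digits[id[i] & 0x0f])
        {
            return 0;
        }
    }
    return 1;
}
```

The library would refuse to resolve any static variable unless `build_id_matches()` succeeds, because a mismatch means the recorded addresses belong to a different nginx binary.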
Somebody may ask: why not inspect the nginx executable at runtime to resolve the addresses? After all, the offsets
are always in the
.symtab section, so we could then fit any version of nginx.
No, because besides symbol offsets, we also depend on the type definition and ABI/API compatibility of nginx.
So the build of lua-resty-ffi shared library is one-to-one bound to the specific version of nginx executable.
Moreover, inspecting ELF information at runtime is not a trivial job and requires an additional ELF-parsing API.
Resolve symbols at runtime
Make within the openresty build context
We should enter the context of your openresty build and build lua-resty-ffi there, which ensures that we use the same compile options as your product build, i.e. that lua-resty-ffi uses the correct type definitions and API prototypes.
In fact, this is the same as when you develop nginx dynamic modules.
- specify your openresty source path in variable
- ensure the openresty source is already configured and built according to your product release
Review the output
ngx_cycle is an undefined dynamic symbol in
libresty_ffi.so, so it is referenced through the .got section,
whose entry is resolved by the dynamic linker to the absolute address in the nginx executable.
Adding the correct offset to that address gives the absolute address of the static variable.
You can confirm the
.got and .bss section address ranges via readelf:
# readelf --sections libresty_ffi.so
  .got    PROGBITS    0000000000004fb8    00003fb8
          0000000000000048    0000000000000008    WA    0    0    8
  .bss    NOBITS      0000000000005120    00004120
          0000000000000020    0000000000000000    WA    0    0    8
Adjust resty_ffi.lua to fit both ways
If resty_ffi.lua cannot find the lua-resty-ffi symbols in the nginx executable,
it tries to load libresty_ffi.so instead.
Note that it must load
libresty_ffi.so into the global namespace, because the APIs of lua-resty-ffi must
be available later to the runtimes powered by lua-resty-ffi.
Package it via luarocks
Now you can install lua-resty-ffi via luarocks with ease.
Use lua-resty-ffi shared library
Since we do not patch lua-resty-core, we need to
require lua-resty-ffi before using its API.
With some “black magic”, we can build lua-resty-ffi as a shared library, so there is no need to patch and rebuild your openresty/nginx anymore.
lua-resty-ffi enables you to use your favorite mainstream programming language, e.g. Go, Java, Python, Rust, or Node.js, to do development in Openresty/Nginx, so that you can enjoy their rich ecosystems directly.
In the old days, executables, and even shared libraries, used absolute addressing. Nowadays,
PIC is the default option, as is
ASLR. In brief, the start address at which an executable or library is mapped is chosen randomly at startup or load time, so all symbol references use relative addressing instead.