As is well known, nginx modules are normally statically linked.
You can also build dynamic modules, but they can only use the dynamic symbols exported by the nginx executable.
What if I need to use static variables and functions from the nginx executable in my code?
In lua-resty-ffi, for the sake of extreme efficiency, I need to inject tasks directly into the done queue of the global thread pool, which involves some static variables within ngx_thread_pool.c.
Avoid patching?
The traditional way is of course to patch nginx and recompile it.
But the shortcomings are obvious:
- you need to integrate lua-resty-ffi into your build, which is cumbersome
- you cannot use lua-resty-ffi in old releases of your product, because the binary should not be replaced
- each time lua-resty-ffi changes, you need to recompile it with your product
All in all, could we use a non-intrusive way instead, i.e. dynamic loading?
Resolve static variables
Let’s check what a static variable looks like in the ELF binary.
# nm /usr/local/openresty-debug/nginx/sbin/nginx
000000000027f890 b ngx_thread_pool_done
000000000027f8a0 b ngx_thread_pool_done_lock
000000000028c690 B ngx_cycle
00000000000841a0 t ngx_thread_pool_handler
000000000007b920 T ngx_calloc
Name | ELF Type | ELF Section | C semantic |
---|---|---|---|
ngx_thread_pool_done_lock | local symbol | .bss/.symtab | static variable |
ngx_cycle | global symbol | .bss/.symtab/.dynsym | external variable |
ngx_thread_pool_handler | local symbol | .text/.symtab | static function |
ngx_calloc | global symbol | .text/.symtab/.dynsym | external function |
C static variables are visible only inside the same compilation unit, i.e. source file. You have no way to access them elsewhere in the C world. Without patching the source file, how could I use them in my dynamically loaded shared library?
In fact, all defined symbols in the same final binary output (executable or library) are referenced with relative addresses[1]. All undefined symbols are referenced via the GOT, which is resolved by the dynamic linker at runtime (function calls are a bit special: they use PLT stubs to support lazy binding).
Let’s confirm it in assembly code.
objdump -D /usr/local/openresty-debug/nginx/sbin/nginx
You can see that it uses %rip-relative addressing to locate defined symbols, no matter whether they are global or not, e.g. 27f8a0 <ngx_thread_pool_done_lock>. Here 27f8a0 is the symbol's offset from the start of the executable image (the symbol itself lives in .bss, as the nm output shows), which is 0x1fb83a away from %rip.
So, if we use a global defined symbol in the nginx executable as an anchor, plus a fixed offset, we can calculate the absolute address of the static variable!
However, we should be aware that the offsets are strictly bound to a specific nginx executable. If the executable changes, the offsets may become invalid.
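To make the arithmetic concrete, here is a minimal sketch in C. The helper name `addr_from_anchor` and the macro names are made up for illustration; the two offsets are the ones printed by `nm` above. It only assumes that the anchor and the target live in the same binary, so their distance is fixed at link time.

```c
#include <stdint.h>

/* Link-time offsets as printed by `nm` in the listing above. */
#define OFF_NGX_CYCLE                 0x28c690u
#define OFF_NGX_THREAD_POOL_DONE_LOCK 0x27f8a0u

/* Given the runtime address of an exported anchor symbol and the link-time
 * offsets of the anchor and of a target symbol in the same binary, return
 * the runtime address of the target. (Hypothetical helper.) */
void *addr_from_anchor(const void *anchor_addr, uintptr_t anchor_off,
                       uintptr_t target_off)
{
    /* anchor_addr - anchor_off is the load base chosen at startup;
     * adding target_off moves from the base to the target symbol. */
    return (void *) ((uintptr_t) anchor_addr - anchor_off + target_off);
}
```

With these values the lock sits 0xcdf0 bytes below `ngx_cycle`, wherever the executable happens to be mapped; a call would look like `addr_from_anchor(&ngx_cycle, OFF_NGX_CYCLE, OFF_NGX_THREAD_POOL_DONE_LOCK)`.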
This raises another question: how do we guarantee that the nginx executable at runtime is identical to the one we inspected at compile time?
In fact, besides offsets, the C type (e.g. structure) definitions and ABI/API compatibility also require that we match the nginx executable version exactly. Nginx does not guarantee compatibility among different versions and builds, so even nginx dynamic modules need to take care of this.
Retrieve build-id
The build id is what the gcc toolchain uses to reflect the identity of a specific build. When you change the source code or compile options, the build id changes.
On most Linux distributions, gcc asks the linker to generate a build id by default, and it is stored in the .note.gnu.build-id section.
binutils’ ld has supported the --build-id=… option since version 2.18 (released 2007). When used with a sha1 or md5 argument, it directs ld to insert an ELF section .note.gnu.build-id into the binary containing a hash of the normative parts of the output, that is, an identifier that uniquely identifies the output file.
Let’s check our nginx executable:
From ELF 64-bit LSB shared object, we know it’s a PIE executable, and the build id is bda5fd746456c2453605499e4d4372c90bde73eb.
So, if we retrieve the build id of the nginx executable at runtime and compare it with the build id we gathered at compile time, we can ensure that our static variable references are correct and safe!
We can simply use an API from libdl.so (part of libc6) to retrieve the build id.
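A minimal sketch of such a lookup, assuming `dl_iterate_phdr()` as the API (declared in `<link.h>` on glibc; this may differ from what lua-resty-ffi actually calls). The first object it reports is the main program, so code running inside a shared library loaded by nginx would see nginx's own notes. The struct and function names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes dl_iterate_phdr() only with this */
#endif
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

/* Raw build id bytes of one ELF object; len == 0 means "not found". */
struct build_id {
    unsigned char bytes[64];
    size_t        len;
};

/* Note name and descriptor fields are padded to 4 bytes. */
static size_t note_align(size_t n) { return (n + 3) & ~(size_t) 3; }

/* Callback: scan the PT_NOTE segments of the first object reported
 * (the main program), then stop the iteration by returning non-zero. */
static int scan_main_program(struct dl_phdr_info *info, size_t size, void *data)
{
    struct build_id *out = data;
    (void) size;

    for (size_t i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_NOTE) continue;

        const unsigned char *p =
            (const unsigned char *) (info->dlpi_addr + ph->p_vaddr);
        const unsigned char *end = p + ph->p_memsz;

        while ((size_t) (end - p) >= sizeof(ElfW(Nhdr))) {
            const ElfW(Nhdr) *nh = (const ElfW(Nhdr) *) p;
            size_t namesz = note_align(nh->n_namesz);
            size_t descsz = note_align(nh->n_descsz);
            size_t left = (size_t) (end - p) - sizeof(*nh);
            if (namesz > left || descsz > left - namesz) break; /* truncated */

            const unsigned char *name = p + sizeof(*nh);
            const unsigned char *desc = name + namesz;
            if (nh->n_type == NT_GNU_BUILD_ID && nh->n_namesz == 4
                && memcmp(name, "GNU", 4) == 0
                && nh->n_descsz <= sizeof(out->bytes)) {
                memcpy(out->bytes, desc, nh->n_descsz);
                out->len = nh->n_descsz;
                return 1;
            }
            p = desc + descsz;
        }
    }
    return 1;
}

/* Writes the main program's build id as lowercase hex into `hex`.
 * Returns the number of hex characters written, or -1 if no build id
 * was found or the buffer is too small. */
int main_program_build_id_hex(char *hex, size_t cap)
{
    static const char digits[] = "0123456789abcdef";
    struct build_id id = { .len = 0 };

    dl_iterate_phdr(scan_main_program, &id);
    if (id.len == 0 || cap < id.len * 2 + 1) return -1;

    for (size_t i = 0; i < id.len; i++) {
        hex[2 * i]     = digits[id.bytes[i] >> 4];
        hex[2 * i + 1] = digits[id.bytes[i] & 0x0f];
    }
    hex[id.len * 2] = '\0';
    return (int) (id.len * 2);
}
```

The resulting string can then be compared against the value recorded at compile time with a plain `strcmp()`.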
Build shared library
Gather offsets and build-id at compile time
In the build script, we use the nm and file commands to gather the necessary information into a C header file:
Somebody may ask: why not inspect the nginx executable at runtime to resolve the addresses? After all, the offsets are always in the .symtab section, so we could fit any version of nginx.
No, because besides symbol offsets, we also depend on the type definitions and ABI/API compatibility of nginx.
So a build of the lua-resty-ffi shared library is bound one-to-one to a specific version of the nginx executable.
Moreover, inspecting ELF information at runtime is not a trivial job: it requires an API from libbfd or libelf.
Resolve symbols at runtime
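A minimal sketch of this step, under stated assumptions: the macros stand in for what a generated header could contain (their names are hypothetical, the values are the ones quoted in this article), and the caller passes in the runtime build id string and the address of the exported anchor `ngx_cycle`. Nothing is resolved unless the build id matches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for a generated header; names are hypothetical. */
#define FFI_NGX_BUILD_ID         "bda5fd746456c2453605499e4d4372c90bde73eb"
#define FFI_OFF_NGX_CYCLE        0x28c690u
#define FFI_OFF_THREAD_POOL_DONE 0x27f890u
#define FFI_OFF_THREAD_POOL_LOCK 0x27f8a0u

/* Resolved addresses of the static variables in the nginx executable. */
struct ffi_statics {
    void *thread_pool_done;
    void *thread_pool_done_lock;
};

/* Returns 0 and fills `out` when the running executable has the expected
 * build id; returns -1 and leaves `out` untouched otherwise. */
int ffi_resolve_statics(const char *runtime_build_id,
                        const void *ngx_cycle_addr,
                        struct ffi_statics *out)
{
    if (runtime_build_id == NULL
        || strcmp(runtime_build_id, FFI_NGX_BUILD_ID) != 0) {
        return -1;  /* different binary: the offsets cannot be trusted */
    }

    /* The anchor's runtime address minus its link-time offset gives the
     * load base; every other offset is relative to that base. */
    uintptr_t base = (uintptr_t) ngx_cycle_addr - FFI_OFF_NGX_CYCLE;

    out->thread_pool_done      = (void *) (base + FFI_OFF_THREAD_POOL_DONE);
    out->thread_pool_done_lock = (void *) (base + FFI_OFF_THREAD_POOL_LOCK);
    return 0;
}
```

Refusing to resolve on a mismatch keeps a stale library from writing through pointers that no longer point at the thread pool queue.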
Make within the openresty build context
We should enter the context of your openresty build and build lua-resty-ffi there, which ensures that we use the same compile options as your product build, i.e. that lua-resty-ffi uses the correct type definitions and API prototypes.
In fact, it’s the same as how you build nginx dynamic modules.
- specify your openresty source path in the variable $OR_SRC
- ensure the openresty source is already configured and built according to your product release
Review the output
ngx_cycle is an undefined dynamic symbol in libresty_ffi.so, and it’s placed in the .got section, which will be resolved by the dynamic linker to the absolute address in the nginx executable.
Adding the correct offset, we obtain the absolute address of ngx_thread_pool_done_lock, which is saved in a variable in the .bss section of libresty_ffi.so.
You can confirm the address ranges of the .bss and .got sections via readelf:
# readelf --sections libresty_ffi.so
[22] .got PROGBITS 0000000000004fb8 00003fb8
0000000000000048 0000000000000008 WA 0 0 8
[25] .bss NOBITS 0000000000005120 00004120
0000000000000020 0000000000000000 WA 0 0 8
Adjust resty_ffi.lua to fit both ways
If it cannot find the lua-resty-ffi symbols in the nginx executable, e.g. ngx_http_lua_ffi_create_task_queue, it tries to load libresty_ffi.so.
Note that it must load libresty_ffi.so into the global namespace, because the APIs of lua-resty-ffi must be available to the runtimes powered by lua-resty-ffi that are loaded later.
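The same fallback, expressed in C as a rough sketch (resty_ffi.lua does this from Lua; in LuaJIT, passing `true` as the second argument of `ffi.load` requests the global namespace). The function name `ffi_ensure_symbol` is hypothetical. It first looks for the symbol in the process, and only if that fails loads the library with `RTLD_GLOBAL` so that libraries loaded afterwards can bind to its symbols.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 0 if `symbol` is already visible in the process, 1 if it became
 * visible after loading `lib_path` globally, and -1 otherwise. */
int ffi_ensure_symbol(const char *symbol, const char *lib_path)
{
    /* A NULL filename yields a handle that searches the main program,
     * its startup dependencies and everything loaded with RTLD_GLOBAL. */
    void *self = dlopen(NULL, RTLD_NOW);

    if (self != NULL && dlsym(self, symbol) != NULL) {
        return 0;   /* e.g. nginx was built with lua-resty-ffi compiled in */
    }

    /* RTLD_GLOBAL publishes the library's symbols to later dlopen() calls. */
    if (dlopen(lib_path, RTLD_NOW | RTLD_GLOBAL) == NULL) {
        return -1;
    }

    return (self != NULL && dlsym(self, symbol) != NULL) ? 1 : -1;
}
```

On older glibc versions this needs `-ldl` at link time.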
Package it via luarocks
Now you can install lua-resty-ffi via luarocks with ease:
Use lua-resty-ffi shared library
Since we do not patch lua-resty-core, we need to require lua-resty-ffi before we use ngx.load_ffi().
Conclusion
With some “black magic”, we can build lua-resty-ffi as a shared library, so there is no need to patch your openresty/nginx and rebuild it anymore.
lua-resty-ffi enables you to use your favorite mainstream programming language, e.g. Go, Java, Python, Rust, or Node.js, to do development in OpenResty/Nginx, so that you can enjoy their rich ecosystems directly.
https://github.com/kingluo/lua-resty-ffi
[1] In ancient times, executables, and even shared libraries, used absolute addressing. But nowadays PIE and PIC are the default options, as is ASLR. In brief, the start address of the mapping of an executable or library is determined randomly at startup or loading. All symbol references use relative addressing instead.