Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to ABI handling in preparation for semver #409

Closed
wants to merge 26 commits into from

Conversation

jgriffiths
Copy link
Contributor

@jgriffiths jgriffiths commented Aug 25, 2023

Motivating issue: #408

Initially wally was intended to always be included in the source tree along with the app that used it, or at least to be statically linked with the using app. Notably, wally can be built with or without Elements support, and whether it is enabled changed the ABI in two ways: (1) the exposed library structures have Elements-specific fields for Elements builds and (2) Elements-specific functions and internal functionality were only defined if the caller defined BUILD_ELEMENTS when compiling and including the headers from the library.

Note that wally itself is always statically linked with the libsecp256k1 that it uses; by default this is the Blockstream zkp branch which exposes further APIs used by Elements.

For the Python and Javascript packages available through PyPI and npm respectively, wally is always released with Elements support enabled. This allows users to know that all the functionality they may need will be available out of the box.

However, at least one distro (Gentoo) is now installing wally as a global shared library that is re-used by several components. Users may customize the library build and may decide to e.g. disable Elements support when installing. This poses a problem for apps that expect this functionality to run or can optionally run with it present; Because the ABI changes for each build type such apps may either corrupt memory (because expected structure sizes differ between the app and library) or will fail to start (because the linker cannot resolve the Elements calls the app expects).

This PR addresses these issues as follows:

  1. The library version number is now exposed at compile time via WALLY_MAJOR_VER/WALLY_MINOR_VER/WALLY_PATCH_VER/WALLY_BUILD_VER and at runtime by calling wally_get_build_version().
  2. Elements is now enabled by default when building. Users do not need to pass --enable-elements to configure or define BUILD_ELEMENTS when including library headers.
  3. Users can build without Elements support explicitly by passing --disable-elements to configure. This configuration is ABI compatible with non-Elements builds; all structures are identical and all Elements APIs are present (they return WALLY_ERROR when called, and their actual implementation is compiled out of the library). When built this way, wally can be installed as a global shared library and apps can gracefully fail if the version does not match their expectations or if they expect Elements functionality to be present and wally_is_elements_build() returns 0.
  4. For static linking and/or embedded development where Elements functionality is not required and memory/code size is critical, it can be completely compiled out of the library by passing --disable-elements --disable-elements-abi to configure, and defining WALLY_ABI_NO_ELEMENTS when including wally headers. Such builds are ABI incompatible with standard builds under (2) or (3) and must not be installed as global shared libraries in the general case except at the users risk (for example, a BTC-only dedicated device, O/S or docker image where all software is pre-installed).
  5. With this PR wally moves to release 1.0.0 and begins following semantic versioning as per https://semver.org/.

cc: @rustyrussell @k-matsuzawa @Sjors @whitslack - Please review and if possible test these changes and report any issues here.

Fixes #408
Fixes #388

@jgriffiths jgriffiths changed the title WIP: Build updates Updates to ABI handling Aug 29, 2023
@jgriffiths jgriffiths changed the title Updates to ABI handling Updates to ABI handling in preperation for semver Aug 29, 2023
@jgriffiths jgriffiths changed the title Updates to ABI handling in preperation for semver Updates to ABI handling in preparation for semver Aug 29, 2023
@jgriffiths
Copy link
Contributor Author

@whitslack note this change also exposes the transaction accessor functions which were the cause of many build failures with various configure options. I would really appreciate it if you could test the new options, thanks!

@whitslack
Copy link
Contributor

@jgriffiths: Thanks for your work on this. I greatly appreciate it, both stabilizing the ABI w.r.t. build options and moving to semantic versioning.

I'll start working on a Gentoo ebuild for 1.0.0 today and will report back with my findings.

… ABI compliance

As wally is now being installed as a global shared library by at least
one distro (Gentoo), update how the ABI is handled to ensure that
elements and non-elements builds are compatible at the C structure
level:

1. By default, Elements support is now enabled.
2. By default, if Elements support is disabled, the C structs defining
   library types retain the extra Elements members.
3. Allow removing the extra members by configuring with
   --disable-elements --disable-elements-abi. The caller must define
   WALLY_ABI_NO_ELEMENTS before including any wally headers. This
   configuration can be used for Bitcoin-only embedded environments,
   but must not be used when installing wally as a system shared
   library.

Currently elements specific functions remain undefined in the wally
headers unless BUILD_ELEMENTS is defined by the caller. Note that the
Python and Javascript wrappers are unaffected as these releases always
enabled elements.

Callers can determine if the library was compiled with elements support
by calling wally_is_elements_build().
Future error handling will support extended error codes via these
functions which are user-overridable in order to support thread safety.
Make the structure changes now without using them so we can avoid an ABI
version bump once implemented.
…s (1)

Part 1, header changes.

As with the previous c-struct change, by default Elements is enabled and
elements functions are exported from the library. Using
--disable-elements now leaves the Elements functions available, but
calling them will always return WALLY_ERROR.

This behaviour allows installing a system-wide wally built without
Elements support which applications can gracefully detect at runtime via
wally_is_elements_build() and handle by degrading functionality or
failing to start.

To compile the Elements functions out completely, the user must configure
with --disable-elements-abi and define WALLY_ABI_NO_ELEMENTS when including
library headers. This allows e.g. embeddeded/static builds to eliminate
all Elements code entirely.

As before, WALLY_ABI_NO_ELEMENTS builds must not be installed as
system-wide shared libraries. Doing so may result in either memory
corruption at runtime (if no Elements code is used) or linker errors on
startup due to missing Elements calls.
@jgriffiths
Copy link
Contributor Author

jgriffiths commented Aug 30, 2023

Updated and rebased over #410 which has been merged first.

@whitslack
Copy link
Contributor

whitslack commented Aug 31, 2023

@jgriffiths: I'm seeing some test failures (at 0a915e9)…

PYTHONDONTWRITEBYTECODE=1 python3.12 test/test_map.py
.E.
======================================================================
ERROR: test_map (__main__.MapTests.test_map)
Test map functions
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_map.py", line 85, in test_map
    self.assertEqual(m.contents.items[n-1].value_len, vl)
                     ~~~~~~~~~~~~~~~~^^^^^
ValueError: NULL pointer access

----------------------------------------------------------------------
Ran 3 tests in 0.004s

FAILED (errors=1)
PYTHONDONTWRITEBYTECODE=1 python3.12 test/test_psbt.py
......EE..F
======================================================================
ERROR: test_psbt (__main__.PSBTTests.test_psbt)
Test creating and modifying various PSBT fields
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_psbt.py", line 266, in test_psbt
    ret = wally_psbt_input_set_required_lockheight(psbt.contents.inputs[0], 499999999)
                                                   ~~~~~~~~~~~~~~~~~~~~^^^
ValueError: NULL pointer access

======================================================================
ERROR: test_redundant (__main__.PSBTTests.test_redundant)
Test serializing redundant finalized input information
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_psbt.py", line 360, in test_redundant
    ret = wally_psbt_input_set_redeem_script(psbt.contents.inputs[0], redeem, redeem_len)
                                             ~~~~~~~~~~~~~~~~~~~~^^^
ValueError: NULL pointer access

======================================================================
FAIL: test_valid (__main__.PSBTTests.test_valid)
Test deserializing and roundtripping valid PSBTs
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_psbt.py", line 100, in test_valid
    self.assertEqual(pre_hex, post_hex)
AssertionError: '0100000002710ea76ab45c5cb6438e607e59cc0376[268 chars]0000' != '0200000002710ea76ab45c5cb6438e607e59cc0376[268 chars]0000'
Diff is 645 characters long. Set self.maxDiff to None to see it.

----------------------------------------------------------------------
Ran 11 tests in 0.042s

FAILED (failures=1, errors=2)
PYTHONDONTWRITEBYTECODE=1 python3.12 test/test_transaction.py
.......F.
======================================================================
FAIL: test_serialization (__main__.TransactionTests.test_serialization)
Testing serialization and deserialization
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_transaction.py", line 85, in test_serialization
    self.assertEqual(utf8(self.tx_serialize_hex(tx_copy)),
AssertionError: b'010[74 chars]000008b483045022100da43201760bda697222002f5626[321 chars]0000' != b'010[74 chars]0000000ffffffff0123ce0100000000001976a9142bc89[43 chars]0000'

----------------------------------------------------------------------
Ran 9 tests in 0.008s

FAILED (failures=1)

Interestingly, all tests pass when run on Python 3.11.5. The failures occur on Python 3.12.0rc1(p6). Releases 0.8.8, 0.9.0, 0.9.1, and 0.9.2 all exhibit the same test failures on Python 3.12. (I have not tested other releases.)

gentoo-repo-qa-bot pushed a commit to gentoo-mirror/bitcoin that referenced this pull request Aug 31, 2023
Multiple unit tests are failing with dev-lang/python-3.12_rc1_p6.

See: ElementsProject/libwally-core#409 (comment)
@jgriffiths
Copy link
Contributor Author

@whitslack very strange! I assume this is your first time testing v3.12?

what does test map print if run with:

diff --git a/src/test/test_map.py b/src/test/test_map.py
index f74e721e..d8a00a58 100644
--- a/src/test/test_map.py
+++ b/src/test/test_map.py
@@ -81,6 +81,8 @@ class MapTests(unittest.TestCase):
             self.assertEqual(wally_map_get_item_length(m, i - 1), (WALLY_OK, vl - 1))
             # Adding an existing key ignores the new value without error.
             vl = vl if case == cases[-1] else vl - 1
+            print(n-1, m.contents)
+            print(m.contents.items, m.contents.num_items)
             self.assertEqual(m.contents.items[n-1].value_len, vl)
 
         # Find an integer key with no integers in the map

?

@whitslack
Copy link
Contributor

I assume this is your first time testing v3.12?

It's possible. On my system my ebuild happens to test libwally-core using Python 3.11 when the doc USE flag is enabled because then the Python interpreter used at build time has to be one that has Sphinx, and I only install Sphinx for Python 3.11, which is currently Gentoo's default Python interpreter. If the doc USE is disabled, then there is no such requirement imposed, and the ebuild uses the newest installed Python interpreter, which is Python 3.12. I thought I had tested my libwally-core ebuild under all permutations of USE flags after I had installed a Python 3.12 release, but it is possible that I had not.

what does test map print if run with:

Under Python 3.11, I get:

PYTHONDONTWRITEBYTECODE=1 python3.11 test/test_map.py
...
----------------------------------------------------------------------
Ran 3 tests in 0.004s

OK
0 <util.wally_map object at 0x7fe6c220f5c0>
<util.LP_wally_map_item object at 0x7fe6c220f530> 1
1 <util.wally_map object at 0x7fe6c220f6e0>
<util.LP_wally_map_item object at 0x7fe6c220f5c0> 2
2 <util.wally_map object at 0x7fe6c220f530>
<util.LP_wally_map_item object at 0x7fe6c220f6e0> 3
2 <util.wally_map object at 0x7fe6c220f5c0>
<util.LP_wally_map_item object at 0x7fe6c220f530> 3
2 <util.wally_map object at 0x7fe6c220f6e0>
<util.LP_wally_map_item object at 0x7fe6c220f5c0> 3

Under Python 3.12, I get:

PYTHONDONTWRITEBYTECODE=1 python3.12 test/test_map.py
.E.
======================================================================
ERROR: test_map (__main__.MapTests.test_map)
Test map functions
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/net-libs/libwally-core-1.0.0/work/libwally-core-release_1.0.0/src/test/test_map.py", line 86, in test_map
    self.assertEqual(m.contents.items[n-1].value_len, vl)
                     ~~~~~~~~~~~~~~~~^^^^^
ValueError: NULL pointer access

----------------------------------------------------------------------
Ran 3 tests in 0.004s

FAILED (errors=1)
0 <util.wally_map object at 0x7f2133afe250>
<util.LP_wally_map_item object at 0x7f2133afe4d0> 0

@jgriffiths
Copy link
Contributor Author

@whitslack m.contents.num_items is not being updated under python 3.12, I think this needs debugging, I'll look into testing with that version.

@whitslack
Copy link
Contributor

Incidentally, I exhaustively tested all allowed combinations of USE flags at 30c744c (plus #414) using Python 3.11, and all builds were successful, so the only hurdle that I see remaining on Gentoo is this issue of test failures with Python 3.12. It's not a blocker, though, as I have already dropped python3_12 from the ebuilds' PYTHON_COMPAT lists, so users will not be able to install the wallycore Python module for Python 3.12, and the unit tests will not be run under Python 3.12, even if the user has a Python 3.12 interpreter installed.

@jgriffiths
Copy link
Contributor Author

Thanks, I should have time tomorrow to review your remaining changes and will look into 3.12 asap after that.

The POSIX Shell 'test' built-in does not define a '==' operator. It
works if /bin/sh happens to be Bash, but that is non-standard.
SWIG generates code with unused parameters that trigger hundreds of warnings.
version-info is what determines the dotted triplet of numbers trailing
the shared library filename. Meticulous maintenance of version-info
allows multiple ABI versions of the library to be installed on a system
concurrently and also makes obvious when library clients need to be
recompiled for a newer ABI (and when no recompilation is necessary).

See: https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html
Quoting jgriffiths:
> --enable-openssl-tests is no longer supported in secp and should be removed

Suggested-by: Jon Griffiths <[email protected]> (@jgriffiths)
See: #414 (comment)
libsecp256k1 dropped support for bignum in October 2020.

See: bitcoin-core/secp256k1@1f233b3
Starting up a JVM is not a lightweight affair. For this reason,
typically all Java source files are compiled by a single invocation of
javac. Use GNU Make's "grouped targets" feature to specify that all Java
class files are produced by a single execution of the recipe. Further,
avoid compiling the test classes if tests are disabled.
@jgriffiths jgriffiths force-pushed the build_updates branch 3 times, most recently from 476ac41 to 81ccbab Compare September 4, 2023 05:34
jgriffiths and others added 6 commits September 4, 2023 19:28
Altering ac_configure_args to configure libsecp256k1 with options
differing from those of libwally-core causes issues if the wally
Makefile is out of date and needs to call autoreconf etc to
regenerate itself. The secp arguments are passed back to the top
level configure which changes and possibly breaks the original
configuration.

Instead, import and use AX_SUBDIRS_CONFIGURE from the autoconf-archive
which supports sub-confiration with bespoke arguments.

Because autoreconf does not recurse into subdirectories named by
AX_SUBDIRS_CONFIGURE, we lose the automagic Autotools (re)generation in
src/secp256k1, so add an explicit call to src/secp256k1/autogen.sh in
tools/autogen.sh.
Extracted from a patch by Matt Whitlock <[email protected]>.

Also reduce the include directories given for python wheel building.
(Modified from the original submission).

Libwally-core, in its stock configuration, compiles and statically links
against a private copy of libsecp256k1_zkp. However, Gentoo unbundles
libsecp256k1_zkp and instead links libwally-core against a system-wide
shared libsecp256k1_zkp with headers in /usr/include/secp256k1_zkp and a
libsecp256k1_zkp.pc that specifies the relevant CFLAGS and LIBS.

Implement a --with-system-secp256k1 configure option to allow compiling
and linking against a system-installed libsecp256k1. Pass the user-
specified package name (or 'libsecp256k1' or 'libsecp256k1_zkp' by
default, depending on --enable-standard-secp) to PKG_CHECK_MODULES to
find the CFLAGS and LIBS of a system-installed libsecp256k1. Call
AC_CHECK_FUNCS to assert that the required modules are present in the
system-installed libsecp256k1.

If the user does not specify --with-system-secp256k1 (or specifies
--without-system-secp256k1), the build will use the bundled copy of
libsecp256k1_zkp as before. Remove the existing libtool hack: When
building as a static library, the user will need to link libsecp256k1.a
in addition to libwallycore.a.
@whitslack
Copy link
Contributor

m.contents.num_items is not being updated under python 3.12, I think this needs debugging, I'll look into testing with that version.

@jgriffiths: All tests are passing under Python 3.12.0rc2. I guess there must have been a bug in 3.12.0rc1 that affected libwally-core.

gentoo-repo-qa-bot pushed a commit to gentoo-mirror/bitcoin that referenced this pull request Sep 11, 2023
But require >=dev-lang/python-3.12.0_rc2:3.12, as tests fail under rc1.

Also:
 * set DISTUTILS_EXT=1 to suppress a QA warning;
 * python_check_deps(): only require Sphinx in compile phase.

See: ElementsProject/libwally-core#409 (comment)
@jgriffiths
Copy link
Contributor Author

@whitslack great news. I'm slightly sidetracked working on wallet policies for Jade but will be back to this after.

@jgriffiths
Copy link
Contributor Author

Merged elsewhere, closing.

@jgriffiths jgriffiths closed this Oct 30, 2023
@jgriffiths jgriffiths deleted the build_updates branch October 30, 2023 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please expose compile time and runtime libwally versions Expose transaction accessors
2 participants