Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Payjoin integration #257

Closed
wants to merge 11 commits into from
Closed

Payjoin integration #257

wants to merge 11 commits into from

Conversation

jbesraa
Copy link
Contributor

@jbesraa jbesraa commented Feb 21, 2024

April 18 Update:

This pull request adds two new features:

  1. Allowing the node to accept payjoin transactions to its bitcoin wallet.
  2. Allowing the node to open a channel from funds received in a payjoin transaction, saving the user an additional on-chain transaction.
  • The first 3 commits are about adding the new lightning_payjoin crate and will be moved to a different repo once the code is approved.

  • Added new crate lightning_payjoin responsible for gluing together ldk-node and rust-payjoin . The library exports a Receiver trait and a LightningPayjoin struct which gives the ability to handle payjoin transactions.
    The main code is basically living in this crate including the code for accepting normal payjoin transactions, handling lightning channel opening from payjoin transaction, validating transactions and other http stuff.

  • The current flow for opening a channel with payjoin transaction:

    1. Node runner calls open_payjoin_channel function
      open_payjoin_channel will:
      1.1 Add new channel to scheduler
      1.2 The channel manager will open a new channel and stop after ChannelAccepted msg because its part of the ChannelScheduler
      1.3 Function will return BIP21 for the user to pay
    2. Node runner scan BIP21 with a wallet supporting payjoin
    3. Node will accept the payment on the payjoin_server_port its listening on
    4. Look for scheduled channel from ChannelScheduler and change the output_script to the funding tx multisig
    5. Call channel_manager.funding_generation_ready
    6. Return response to payjoin sender after FundingSigned received from counterparty node

    We could change the order so when the user calls open_payjoin_channel its only adding a new channel to the ChannelScheduler and when the payments comes in we could start the channel opening and finalize it before we return a response. It would require exposing a bit more functionality to the PayjoinHandler like PeerManager and probably the Logger. there will be some duplicate code if we change it as the channel opening defined in lib would also be in payjoin_handler. I dont have a strong feeling about either.

  • The current implementation is following Payjoin V1 which requires the receiver to run an http server where in Payjoin V2 its possible to only have an http client. IMO its worth first supporting v1 as thats whats currently widely used by wallets and using v2 now could limit the usage of this feature. as all of the http code is setting in lightning-payjoin I think its a good choice for now to go with v1.


Original

The main goal of this pull request is to allow users to create channel with funds outside the node.

Flow that should be covered:

  1. Alice run Node A enabling "payjoin" and acting as a "payjoin receiver"
  2. Alice generates payjoin uri
  3. Alice receives a request for payjoin payment from Charlie
  4. Alice want to use the funds from the payjoin payment to open a channel with a single transaction.
  5. Alice negotiates channel opening with Bob(using the psbt it received in step 3)
  6. Alice get FundingSigned from Bob
  7. Alice send the finalized psbt to Charlie
  8. Charlie broadcast the tx

** This PR should cover Alice part **

There are few tricky points, mainly around 6 and 7. Currently transactions are immediately broadcasted after FundingSigned, that wont work with payjoin because the transaction wont be valid.

The channel opened is Outbound channel from Alice -> Bob but it will be broadcasted by Charlie, so we need to add some functionality to let Alice node know that its an Outbound channel.

some reference for payjoin integration with lnd
https://github.com/payjoin/nolooking

@jbesraa jbesraa changed the title Payjoin integration WIP Payjoin integration Feb 21, 2024
@DanGould
Copy link

This sequence diagram might help those getting familiar with the how lightning channels can be opened from payjoin transactions

    The `merchant` hosts the payjoin receiver.

     ┌──────────────┐                ┌─────────────────┐                  ┌──────┐
     │Lightning Peer│                │    `merchant`   │                  │Sender│
     └──────┬───────┘                └───────┬─────────┘                  └───┬──┘
            │                                │                                │
            │            BOLT 2              ├─────── Bip21 with ?pj= ───────►│
            │     Channel Establishment      │                                │
            │                                │◄────── Original PSBT ──────────┤
            │                                │                                │
            │                                │                                │
            │◄──────── open_channel ─────────┤                                │
            │                                │                                │
            ├──────── accept_channel ───────►│                                │
            │                                │             BIP 78             │
            │                                │                                │
            │◄─────── funding_created ───────┤                                │
            │                                │                                │
            ├──────── funding_signed ───────►│                                │
            │                                │                                │
            │                                │        PayJoin Proposal        │
            │                                ├──────       PSBT       ───────►│
            │                                │                                │
            │                                │                                │
            │                                │    ┌─ PayJoin + Funding ───────┤
            │                                │    │     Transaction           │
            │                                │    │                           │
           x│xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx│xxx ▼ xxxxxxxxxxxxxxxxxxxxxxxxxx│x
           xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
           xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx BITCOIN NETWORK xxxxxxxxxxxxxxxxxxxxx
           xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
           x│xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx│xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx│x
            │                                │                                │
            │◄────────channel_ready ─────────┤                                │
            │                                │                                │
            ├──────── channel_ready ────────►│                                │
            │                                │                                │

@jbesraa jbesraa force-pushed the payjoin branch 4 times, most recently from 67e887f to 5e0bbc6 Compare February 26, 2024 14:48
@jbesraa
Copy link
Contributor Author

jbesraa commented Feb 26, 2024

A few points to discuss:

  1. In the PR hyper is used to start and deal with the http_server. while all hyper functionality could be implemented using tokio, I choose it first as its faster to setup and some functionality like shared state is more intuitive to do.
  2. The node runner will need to set whom they are interested to open channel with, I added two end points to http server to do that. this need concept ack.
  3. payjoin_handler will probably need to access channel_manager(see the todos in the function) but it doesnt implement clone and I am not feeling very comfortable adding it). I would appreciate input about this.
    @DanGould

@DanGould
Copy link

DanGould commented Feb 26, 2024

  1. In the PR hyper is used to start and deal with the http_server. while all hyper functionality could be implemented using tokio, I choose it first as its faster to setup and some functionality like shared state is more intuitive to do.

Hyper serves https and makes use of the async tokio runtime being used in LDK node already. This makes more sense to me than rewriting an http stack.

Indeed, a payjoin receiver needs to share concurrent state. I suggest writing down exactly what state needs to be shared and what each component it's shared with needs to do with it, and whether or not it needs its own thread.

  1. The node runner will need to set whom they are interested to open channel with, I added two end points to http server to do that. this need concept ack.

Indeed, this is the first documentation of who needs what state.

payjoin_handler in my opinion should only handle payjoin POST requests. Seems like node management is done elsewhere by exposing methods on the Node which could then be called by an downstream program. HTTP should probably only be used to handle the payjoin protocol itself. Node control methods should be used to control it. I believe my suggestion is consistent with the current LDK-Node design.

  1. payjoin_handler will probably need to access channel_manager(see the todos in the function) but it doesnt implement clone and I am not feeling very comfortable adding it). I would appreciate input about this.

shared state can be passed as you figured out let clone_ccr = create_channel_request.clone(); but I'm not totally sure that a "CreateChannelRequest" should itself be that shared state. More likely, you want a shared scheduler service that listens for create channel request events. Nolooking has this scheduler architecture you can follow. payjoin_handler would then have shared access to the scheduler. Does LDK-Node offer batching already? This scheduler could be combined with that service if so. By sharing a listening service instead of a single request that gets mutated in place, the node could maintain a queue of user intentions with a clear, abstract, purpose-built API.

Clone would be a bit of a hack since channel_manager is a singleton service that needs to have its shared state altered while the program is running. I would guess that the channel_manager would be a dependency of whatever batched scheduler service you come up with. To share the state, choose an appropriate memory container and access pattern. This flow chart has helped me a ton. Rather than wrap each field of shared state in its own Arc<Mutex>, perhaps it would be appropriate to wrap the whole shared state in a single Mutex or other appropriate container.

image

@tnull
Copy link
Collaborator

tnull commented Feb 28, 2024

A few points to discuss:

  1. In the PR hyper is used to start and deal with the http_server. while all hyper functionality could be implemented using tokio, I choose it first as its faster to setup and some functionality like shared state is more intuitive to do.

  2. The node runner will need to set whom they are interested to open channel with, I added two end points to http server to do that. this need concept ack.

  3. payjoin_handler will probably need to access channel_manager(see the todos in the function) but it doesnt implement clone and I am not feeling very comfortable adding it). I would appreciate input about this.

I have yet to properly review and digest the whole proposed scheme. However, I'm generally very skeptical to go down the outlined route in respect to adding an HTTP server stack to LDK Node itself. If a reachable HTTP endpoint is a hard requirement for the proposed PJ scheme the handler probably should live in a different crate and we need to find a good interface in order to configure LDK node with it. This would be very similar to what we do with VssClient, which also lives as a reusable Rust crate, but we allow to configure it via corresponding Builder methods.

@DanGould
Copy link

If a reachable HTTP endpoint is a hard requirement for the proposed PJ scheme the handler probably should live in a different crate and we need to find a good interface in order to configure LDK node with it.

The big issue with Version 1 payjoin is exactly this dependency on an HTTP server stack. Fortunately the Payjoin Version 2 BIP forgoes this dependency for a mere HTTP client. This draft uses V1 but I imagine the merged PR will use V2. While the payjoin crate itself does no I/O, a crate to do basic payjoin networking for most implementations is a solid idea.

src/lib.rs Outdated
use tokio::net::TcpListener;

// Start the HTTP server for payjoin
let (payjoin_queue_sender, mut payjoin_queue_receiver) = mpsc::channel::<PayjoinRequest>(1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DanGould starting a channel here that is accessed from the payjoin endpoint. the payjoin endpoint will add PayjoinRequest to the queue, and wait for response(funding psbt) via different channel from create_channel_from_pj.

src/lib.rs Outdated
) {
let psbt = pj_req.clone().original_psbt();
let pj_response = PayjoinResponse::new(psbt);
// Send OpenChannel message
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the channel negotiations should happen and the queue the response back to the http handler

src/lib.rs Outdated
}

/// Request a new channel to be opened with a remote peer.
pub fn payjoin_channel(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is currently doing two things which is not great.
it should be broke into two functions and in the "payjoin_uri" docs we should mention that the pending_channels array should be filled first

@jbesraa
Copy link
Contributor Author

jbesraa commented Feb 28, 2024

@DanGould In the last commit I addressed the scheduler we discussed in the last discussion.

Added a new struct PayjoinScheduler that is utilised on an incoming payjoin request, that only pushes the request to the queue and wait for response.

You can see now other new structs like PayjoinRequest, PayjoinResponse, PendingChannels and others added as well. those are not mature enough yet and they will probably change, this is just the start. I am also not a big fan of the naming yet.

mpsc channels are used so its consistent with other structs with similar functionality(like tx broadcaster).

Copy link

@DanGould DanGould left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest confusion with the design is spawning loops to listen for mpsc channel updates. Rather than spawn a loop listening on an mpsc channel, you can use an async function like PayjoinScheduler::queue_channel and avoid that complexity. This can encapsulate create_channel_from_pj within a payjoin module.

I have the same concern with using mpsc loop to respond to payjoin requests. POST requests to /payjoin shouldn't schedule anything, they should check against the queue from the scheduler to attempt to pop from it and open a channel, otherwise accept the funds into a single-sig address. HTTP response must be synchronous anyhow, it does you no good to wait for a channel update if you can't respond to the HTTP request with it. There is no need to manually parse the PSBT from the request body, the payjoin crate has all of the methods to do that for you. Just marshal HTTP I/O data into and out of the payjoin crate methods. Define a function that takes the request and returns a response.

I think what you are trying to do is separate the scheduler logic from the http server, which does make sense. However, the server could reference the scheduler directly as a field instead of an opaque mpsc::Sender. This lets it interface with rich function signatures instead of channels in loops. Both the receiver server and the Node should share reference to the Scheduler and interface with its methods rather than mpsc channels.

src/pj.rs Outdated
Comment on lines 169 to 171
let res = match (http_request.method(), http_request.uri().path()) {
_ => payjoin_handler(http_request, state),
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think methods are being matched twice here for no reason. Unless you're going to transform a custom error type into an http response, you should probably call payjoin_handler directly without the match

@jbesraa
Copy link
Contributor Author

jbesraa commented Feb 29, 2024

If a reachable HTTP endpoint is a hard requirement for the proposed PJ scheme the handler probably should live in a different crate and we need to find a good interface in order to configure LDK node with it.

The big issue with Version 1 payjoin is exactly this dependency on an HTTP server stack. Fortunately the Payjoin Version 2 BIP forgoes this dependency for a mere HTTP client. This draft uses V1 but I imagine the merged PR will use V2. While the payjoin crate itself does no I/O, a crate to do basic payjoin networking for most implementations is a solid idea.

I would like not to rely on v2 for this integration for the following reasons:

  1. we link this PR to the approval of the bip which can delay it
  2. the single goal of this pr is allowing the node runner to open a channel with funds from a separate wallet, so we will end up with the user communicating with their self over a relay..

The idea of having a separate crate for a payjoin-server is very reasonable. We can start with v1 and once v2 is more mature we can upgrade the payjoin-server and do minimal changes in ldk-node.

@jbesraa jbesraa force-pushed the payjoin branch 9 times, most recently from fefdaf6 to b605919 Compare March 4, 2024 10:08
@jbesraa
Copy link
Contributor Author

jbesraa commented Mar 4, 2024

@DanGould @tnull
I think this is ready for review.

There are two main files to look into:
first pj_new_crate.rs, the code in this file will move to a separate repo+crate and will be added as a dependency to ldk-node. once you guys approve the content of the file, iam happy to do the above and break the commits accordingly.
in pjoin.rs I added the implementation ldk-node needs to write to integrate with pj_new_crate. notice that this is aiming to complete the initial full setup. and the bulk of the work is yet to be implemented. I added request_to_psbt to the PayjoinExecuter trait and thats where the channel opening functionality should go. that function might be broken to separate smaller ones, its just added as a template for now, but the trait must have async function and that breaks the msrv, I wonder if there is a way around it?

Integration tests added in integration_tests_payjoin.rs and they cover the currently functionality, but one problem I encountered was the inability to call async functions implemented in Node like .schedule_payjoin_channel from the test context. I tried to run under tokio::test but that failed and I think because it conflicts with the node runtime. but iam not sure tbh. would appreciate some guidance on that.

You can run the integration tests locally cargo test --test integration_tests_payjoin -- --nocapture
and there are unit tests in pj_new_crate.rs as well.

The LDKPayjoinExecuter in pjoin.rs is constructed with a few properties like Wallet, PeerManager and others. this is under the assumptions the those modules will be needed for the channel creation process. otherwise they can be dropped.

Needed to make a few changes in binding to reflect the changes in Config after adding payjoin_server_port.
In Kotlin & Python tests I added some randomness to the payjoin_server_port property as both nodes in tests used the default port and conflicted.

@DanGould
Copy link

DanGould commented Mar 4, 2024

I'm attempting a comprehensive design review of the abstraction intended to enable payjoin receipt for lightning channel opens funded directly by outside funds.

Async Design

The past review raised concerns with the async design, suggesting the use of async functions and scoped memory containers instead of mpsc channels. I am satisfied with the direction of this abstraction 🎊🚀 it's much easier to read, reason about, and should be easier to change.

New crate

@tnull voiced skepticism with regard to the inclusion of an HTTP server to receive payjoin in LDK-Node.

I'm generally very skeptical to go down the outlined route in respect to adding an HTTP server stack to LDK Node itself. If a reachable HTTP endpoint is a hard requirement for the proposed PJ scheme the handler probably should live in a different crate and we need to find a good interface in order to configure LDK node with it.

Assuming the stack is a hard requirement, I agree, an HTTP stack would benefit from its own crate. This could be shared between payjoin-cli and ldk-node for the purpose of accepting v1 payjoins. It would make sense to live in the rust-payjoin repository (payjoin-io ?), and could abstract both send and receive io functionality. For this reason, I'm skeptical of the drafted new crate including scheduling logic. It seems more natural for such a crate to handle only the http stack and leave lightning specifics to LDK-Node. Should we come across multiple implementations of lightning payjoin functionality, another payjoin-lightning crate could be born.

However, a payjoin v2 receiver is viable for this PR exclude the requirement for an http stack. @jbesraa raised 2 points I would like to respond to.

  1. we link this PR to the approval of the bip which can delay it

BIPs don't have an "approval" process per se, but an open workflow. Sure, the actual file can get approved to be included in the repository. But even that process includes implementing the BIP in the wild by multiple implementations before being merged.

  1. the single goal of this pr is allowing the node runner to open a channel with funds from a separate wallet, so we will end up with the user communicating with their self over a relay.

It's a good idea to narrow the scope of this PR to do one thing. For that reason, it makes sense to continue to document how a new crate might help, but in my opinion also suggests that the development of a new crate is outside of scope.

Regarding "communicating to their self over a relay," doing so forgoes the http server requirement and the TLS/onion hidden service setup that would otherwise be required. What exactly are the downsides you predict in using a relay? I have listened to concerns about IP and privacy leaks, but BIP 77 explicitly addresses these with OHTTP and authenticated encryption. Is latency a concern? I believe that's addressed by the asynchronous nature of BIP 77 as well.

Even with payjoin v2, a payjoin-io crate could still be useful for default payjoin http i/o implementations and ought to make its way into our payjoin roadmap.

Payjoin Executer

I'm not certain the PayjoinExecuter needs to exist as a named abstraction. While I admire the effort to design contained simple abstractions, it seems like the executer is an extra layer of complexity inside PayjoinScheduler since each of ts methods call into scheduler but it's not clear to me what exactly it's abstracting. My understanding of the suggested new crate was to implement the HTTP I/O to be maintained outside of LDK-Node, not lightning-specific payjoin operations.

The PayjoinExecuter is also defined as a trait in the new pseudo-crate module, but I'm not sure what it is generic over? If it's not generic over a particular service or data, What's keeping it from being a mere struct impl, or even just fields and methods inside PayjoinScheduler. I'm not sure of the purpose of request_to_psbt, since the payjoin state machine transformas a request entirely into a response. The necessary interface will probably not appear until the rest of the implementation is drafted, so I wouldn't let this design block attempting the rest of the integration.

serve in PayjoinScheduler

Assuming the ldk Node only hangs on to one piece of payjoin data, and it's OK to couple the payjoin server to the ldk-node implementation, I think including serve in the scheduler is a great idea. It simplifies the design a ton. I am a bit concerned that it constructs a new scheduler rather than cloning self, however. Doesn't that break the singleton pattern?

make_http_response

make_http_response("404".into()) will return a Response struct having Status code 200 with body "404". Make sure you're setting the status code and body appropriately. I prefer to build new mutable Response structs and setting their fields rather than the builder outside of tests so you don't have to handle errors with unwrap() too.

Testing

grug brained developer take is to avoid mocking, and I agree. An integration with a real payjoiner has examples and would prevent incorrect mocking from causing bugs that don't get caught down the road. It looks like despite the "mock_payjoin_sender" name you do in fact implement a regtest payjoin sender, and don't mock one. Funny, a payjoin-io crate could get rid of some of the complext http handling in these tests too. I'm seeing a pattern.

[I'm unable] to call async functions implemented in Node like .schedule_payjoin_channel from the test context. I tried to run under tokio::test but that failed and I think because it conflicts with the node runtime.

True, setup_node is blocking and tokio::test is a non-blocking context. You should be able to use tokio spawn_blocking to call such methods in a scope of that context. You may need to manage similar blocking vs non-blocking tasks with your reqwest clients. reqwest fortunatly makes both types available with/without the blocking feature.

Random port

Why use randomness when you can have a deterministic answer? Binding to port 0 makes the OS pick an available port for you. Even in python. Random port selection could lead to random test failure.

e.g. in rust

fn find_free_port() -> u16 {
    let listener = std::net::TcpListener::bind("0.0.0.0:0").unwrap();
    listener.local_addr().unwrap().port()
}

On Review submission

It's much easier to have some certainty about whether or not a design is possible if you can show passing CI, even if that means changing (and documenting) the CI itself. This concern relates to the new crate idea, too. If you want to prove out a new crate (IMO better as a separate PR/discussion at this stage), make a new crate in the same workspace rather than mock it with a module. That would help design its dependencies and interface concretely vs a mocked crate in a new module.

Final thoughts

This PR has come a long way to addressing the issue of getting channels open from a payjoin, especially with regard to managing concurrent tasks and I'm stoked to see that. Answering the questions about generic business PayjoinExecutor is doing (if any) will help address the async trait issue. From that answer you should be able to begin implementing the channel logic and get a better idea of what the implementation looks like inside the scheduler. From my view it appears like it can exist entirely inside the PayjoinScheduler.

Until some end-to-end implementation is committed bits and bytes. we're reviewing a design with one eye blind. It'll almost certainly change, and that's OK. Solid progress.

@jbesraa jbesraa force-pushed the payjoin branch 2 times, most recently from 2256634 to aa49b19 Compare May 19, 2024 08:52
@jbesraa jbesraa force-pushed the payjoin branch 13 times, most recently from 1bf6c10 to 289f096 Compare May 20, 2024 18:27
@jbesraa jbesraa force-pushed the payjoin branch 2 times, most recently from 281bb18 to 3d5b58d Compare May 21, 2024 10:09
@jbesraa
Copy link
Contributor Author

jbesraa commented May 22, 2024

I am breaking this PR into smaller PRs. First PR regarding the payjoin sender is opened here #295, and ill create another two prs to integrate the receiver, one for regular payjoin receiving and another for receiving with opening a lightning channel.

Ill leave it open for a bit longer as i am using it to check the e2e flow and how the CI is responding to changes, until I open the first payjoin receiver PR.

@tnull
Copy link
Collaborator

tnull commented May 22, 2024

I am breaking this PR into smaller PRs. First PR regarding the payjoin sender is opened here #295, and ill create another two prs to integrate the receiver, one for regular payjoin receiving and another for receiving with opening a lightning channel.

Ill leave it open for a bit longer as i am using it to check the e2e flow and how the CI is responding to changes, until I open the first payjoin receiver PR.

Sounds like a good approach!

@jbesraa
Copy link
Contributor Author

jbesraa commented Jun 11, 2024

closing in favor of #301

@jbesraa jbesraa closed this Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants