The world needs Lightning Network

Let’s make a promise: I will try to explain what Lightning Network is without going into (too) much detail, but you will have to give me an hour of your time. It will be a long and complex journey (but there is also joy).

If you want to support my work, you can visit the donations page. Every contribution, however large or small, helps me to spend more time writing, revising and updating these articles. Thank you for your support!

The priory is benevolent

Achtung 1: I haven’t written in a long time. The last time I edited an article was on 4 March 2023. In the meantime, I have started a new career as a software engineer while piecing together the puzzle of my life. When I feel balanced I come back here, to write. I like it.

Achtung 2: this article is a heavy reworking of Mastering the Lightning Network by the good Antonopoulos.

Achtung 3: to read this article without major difficulties I suggest reading or re-reading Bitcoin 101. You may encounter terms that I will not explain, sometimes because I consider them preparatory to this reading, other times because I cannot go into the extreme detail of each concept.

We always start with the basics #

Lightning Network (like Bitcoin) is a trustless system: it allows the exchange of value without the need to trust the other participants in the network.

But this system, what does it look like? Where is it located? How does it work?

Suppose I am the manager of a pizzeria. Every other day, my friend Kris comes to buy a pizza. Kris likes to pay in bitcoin, but using the timechain for quick spending is tiring. It doesn’t scale enough, so the payment is not quick and there are fees.

Kris and I agree and establish an imaginary communication channel in which we both deposit some money (satoshi) inside a safe. In the safe there is a slip of paper on which is written exactly how many satoshi Kris and I have deposited. This allows him to order as many pizzas as he wants until his share of satoshi is completely eroded.

What happens to the sheet is that every time Kris buys a pizza, it is updated, removing some sats from him and adding them to my name. Now that Kris and I have established a communication channel, any friend of Kris’s who has a direct channel to him can take advantage of our new connection.

Lightning’s goal is therefore to enable efficient payment ‘routing’ by exploiting existing channels and finding the shortest path to complete the payment.

Routing is the process of finding the best route in any transport system (whether physical or digital).

Brought down to reality: when you use the maps on your smartphone to search for the best route, you are actually routing.

The pizza example is very trivial and by no means complete or realistic with respect to how Lightning really works, but it is a good starting example to get into the topic.

Fairness protocol #

A fairness protocol is by definition a system where there is fairness between participants.

But how is it possible to aspire to fairness if not by requiring trust, by imposing e.g. a state of Law or by assuming the presence of trusted parties?

Silly question: I wrote a whole article on it: thanks to game theory and cryptography!

In cryptographic systems we place trust in the protocol. A protocol is nothing more than a system with a set of rules, and thanks to the incentives or disincentives in game theory we could aspire to a trustworthy protocol without the need for intermediaries. But fairness is only achieved if the rules are well written.

Let us take an example for clarity and transport this definition into the real world.

During lunchtime, two little brothers argue because they don’t want to share a plate of chips. Classic. The mother, taking matters into her own hands, authoritatively imposes her law by dividing the chips between the two brothers’ plates as she sees fit. She may, voluntarily or involuntarily, give more chips to one of them. Result? The system is not fair: we have a form of State of Law that makes a decision totally autonomously.

A more intelligent method might be to establish rules, such as: one of the two little brothers divides the chips between the two plates, but the other gets to choose his plate first.

What effects does this new dynamic between the two bring? We have introduced incentives and disincentives, because if the little brother who is in charge of dividing the chips tries to cheat, his brother can punish him by choosing the bigger plate.
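The incentive can be made concrete with a few lines of Python. This is only a sketch under my own simplifying assumptions: both brothers care only about the number of chips, and the chooser always takes the bigger plate. The function names are invented for the illustration.

```python
def divider_share(total_chips: int, first_plate: int) -> int:
    """Chips left to the brother who divides, under 'I cut, you choose'."""
    second_plate = total_chips - first_plate
    # Assumption: the chooser always grabs the bigger plate,
    # so the divider is left with the smaller one.
    return min(first_plate, second_plate)


def best_split(total_chips: int) -> int:
    """Size of the first plate that maximises the divider's own share."""
    return max(range(total_chips + 1), key=lambda n: divider_share(total_chips, n))


total = 20
for first in (5, 10, 15):
    print(f"plates {first}/{total - first}: divider keeps {divider_share(total, first)}")
print("best first plate:", best_split(total))
```

Any unequal split only hurts the brother who made it, so the rule pushes him towards fairness without the mother having to decide anything.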

Safety primitives #

In order for the example of these two little brothers to work, it must rely on certain indispensable primitives such as the sequential ordering of actions and intentional non-repudiation.

The choice of dish by one of the two cannot occur before portioning.
Both must undertake not to repudiate the choice of a dish.

Why Lightning? #

As mentioned in the pizzeria example, Bitcoin does not scale well. But this is not a problem, it is a feature. Bitcoin on layer 1 doesn’t have to scale and it won’t have to.

Remember that transactions are recorded on the timechain globally and as the demand for transactions increases one block gets saturated very quickly. Meanwhile, all others are put on hold. To bypass the queue and get out of the wait you can pay a higher fee, so fees go up for everyone indiscriminately.

If the demand for transactions continues to increase, more and more transactions are put on hold making microtransactions uneconomic because more satoshi of fees would be spent than the actual transaction amount.
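To get a feeling for the numbers, here is a tiny calculation. The transaction size (140 vbytes) and the fee rates are illustrative assumptions of mine, not measurements: real values depend on the transaction type and on how congested the network is.

```python
TX_VBYTES = 140  # assumption: rough size of a simple payment transaction


def on_chain_fee_sat(fee_rate_sat_per_vbyte: int, tx_vbytes: int = TX_VBYTES) -> int:
    """Fee paid to get a transaction of the given size into a block."""
    return fee_rate_sat_per_vbyte * tx_vbytes


def is_uneconomic(amount_sat: int, fee_rate_sat_per_vbyte: int) -> bool:
    """True when the fee is larger than the amount being transferred."""
    return on_chain_fee_sat(fee_rate_sat_per_vbyte) > amount_sat


amount_sat = 1_000  # a microtransaction
for fee_rate in (2, 10, 50):  # hypothetical fee rates in sat/vbyte
    fee = on_chain_fee_sat(fee_rate)
    print(f"{fee_rate} sat/vB -> fee {fee} sat, uneconomic: {is_uneconomic(amount_sat, fee_rate)}")
```

With these assumed numbers, a 1,000 satoshi payment already costs more in fees than it moves as soon as the fee rate reaches 10 sat/vbyte.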

To solve this ‘problem’ –which is not such– some ‘genius’ thought it best to increase the block size, but… we have already talked about this in Bitcoin 101. Run and re-read the block-size war.

How could we enable scalable, off-chain transactions without losing Bitcoin’s layer 1 security?

In February 2015, Joseph Poon and Thaddeus Dryja presented a possible solution to this question, publishing the whitepaper –now obsolete– The Bitcoin Lightning Network: Scalable Off-Chain Instant Payments.

The Lightning Network concept envisages the emergence of a new network, technically called layer 2, within which users can pay in peer-to-peer mode without having to register a transaction on the timechain. The only time the timechain is used is to open a communication channel and to do the settlement of this channel, i.e. the agreement to get the bitcoins out of Lightning and back onto layer 1.

In addition to a natural load reduction on bitcoin nodes, with Lightning payments the transaction fees would be very low and with greater privacy (in part, we will see) because these payments are not visible to the entire layer 1 like standard transactions.

Features highlighted by the whitepaper #

I anticipate that we will look at each point in the next few chapters, for now it may seem like a vague list but it is necessary for me to introduce the concepts and then describe them.

Routing of payments at very low cost and in real time.
Value exchange without waiting for the canonical block confirmation as on L1.
Payments are final, non-reversible and refundable only by the recipient.
Payments are not visible to the entire network.
Payments are not stored permanently as on L1.
Lightning uses the concept of onion routing, so that intermediate nodes involved in the transport of a payment only know the predecessor node and the next one. They do not know the sender and receiver.
The bitcoins used on Lightning are real bitcoins, this feature provides custody of the value and full balance control as in on-chain bitcoins.

How Lightning works #

We’ll stop playing from here on: if you’re tired, go make some coffee because you’re about to discover the wonders of the Lightning Network.

Basic technical prerequisites #

Digital signature: is a method intended to prove the authenticity of a digital document.

Let’s explain with an example: you need to secretly write to a close friend of yours and you want to be sure that he knows that you are indeed the author of this letter. You have a magic key that allows you to sign the letter in a mathematical way, proving that only you can have written it. By using this magic key you are creating a digital signature.

Your friend has another magic key derived from yours, but he can only use it to verify your signature. He cannot create a signature in your name.

This is a digital signature.

The “magic” key is nothing more than a private key that allows you to spend funds, proving to the entire network that you are the owner at that exact moment.

Digital key: consists of a set of numbers that can be used to both encrypt and decrypt information.

Let’s explain with an example: you have a special and unique key that can only open a particular safe. This digital key is very powerful because it also allows you to sign documents or access protected resources. Only you should possess it and, when you use it, everyone else knows that you really are who you say you are.

It consists of two parts, a public and a private. The public one allows you to verify that you are you, but it doesn’t say anything about how to use it and it doesn’t work to open the safe. The private one, on the other hand, is the secret you have to keep.

Hash: is a mathematical function that transforms data of variable size into a string of fixed length.

Let’s explain with an example: imagine you have to transform a cooking recipe that has very precise doses into a unique number. There is a mathematical function that transforms ‘something’ into a number by assigning a fingerprint.

By changing even one gram of an ingredient in the recipe and reapplying the hash function the end result changes drastically, the number you get will be totally different. If someone were to change your recipe you would immediately realise that it is no longer the same as before. To summarise, it is a way of ensuring integrity and security in the digital world.

For the more pragmatic, a hash can be calculated from the terminal in this way:

$ echo -n "musclesatz" | shasum -a 256 

This would give the result
fb4584b61ffaee257347cbff270e3c0bc9c504317685b68a72c1a47c400984f3
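The same experiment can be done in Python with the standard `hashlib` module. The sketch below takes up the recipe analogy: the two recipe strings are invented for the example, and the helper `sha256_hex` is my own name.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 fingerprint of a string, as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


recipe = "flour 500 g, water 300 g, salt 10 g"
tweaked = "flour 500 g, water 300 g, salt 11 g"  # one gram more salt

original_hash = sha256_hex(recipe)
tweaked_hash = sha256_hex(tweaked)
print(original_hash)
print(tweaked_hash)

# Count how many hex digits changed because of a single gram.
differing = sum(a != b for a, b in zip(original_hash, tweaked_hash))
print(f"{differing} of {len(original_hash)} hex digits differ")
```

Calling `sha256_hex("musclesatz")` computes the same thing as the shell command above: `echo -n` passes the text without a trailing newline, just like `encode` does here.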

Bitcoin Transaction: is a data structure that encodes the transfer of value between various network participants.

A transaction that spends input, creates output. Transaction inputs are like references to the outputs of previous transactions and each transaction in turn generates new outputs.

Bitcoin nodes keep track of all these available and expendable outputs. This is why they are called unspent transaction outputs or UTXO for short. The collection of all UTXOs makes up the UTXO set and this set grows or shrinks as new UTXOs are created or consumed.

The outputs produced are discrete and indivisible units of value, this implies that an unspent output must be consumed in its entirety.

So you are telling me that if I have to pay 10k satoshi but I have a 3 bitcoin UTXO I spend the whole amount?

The good news is that you’ll pay the entire UTXO as input, but you’ll get two outputs:

one that pays the 10k satoshi
the other that returns the difference to you as change.

It’s sad to say, but it works exactly like when you spend fiat currency. If you have to pay 10 cents for a Goleador but you only have a 10 euro note, you have to spend the note in its entirety and get the change back.

The only transaction that has no input is the coinbase transaction, a special transaction used by miners to keep the whole mining process going.

Each transaction then has an identifier called a transaction ID, or more briefly TxID. This ID is produced by a hash function on the transaction data. To identify a specific output of a transaction, instead, we use outpoints: the TxID followed by a colon (:) and a simple number that fixes which output we are referring to.
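Inputs, outputs, change and outpoints can be put together in a toy model. It is a sketch, not real Bitcoin code: the `Output` class and the `spend` function are invented here, the fee is an assumed number, and the toy TxID hashes a text description, whereas real nodes hash the serialized transaction.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    owner: str
    amount_sat: int


def spend(utxo_set, outpoint, pay_to, amount_sat, fee_sat):
    """Consume one UTXO in full and create a payment output plus a change output."""
    spent = utxo_set[outpoint]
    change_sat = spent.amount_sat - amount_sat - fee_sat
    if change_sat < 0:
        raise ValueError("the UTXO is too small for this payment")
    del utxo_set[outpoint]  # the old output is consumed in its entirety
    outputs = [Output(pay_to, amount_sat), Output(spent.owner, change_sat)]
    # Toy TxID (assumption): hash of a text description of the transaction.
    txid = hashlib.sha256(f"{outpoint}|{outputs}".encode()).hexdigest()
    for index, output in enumerate(outputs):
        utxo_set[f"{txid}:{index}"] = output  # outpoint = TxID:index
    return txid


BTC = 100_000_000  # satoshi per bitcoin
old_outpoint = "00" * 32 + ":0"  # hypothetical previous outpoint
utxo_set = {old_outpoint: Output("kris", 3 * BTC)}
txid = spend(utxo_set, old_outpoint, "musclesatz", 10_000, fee_sat=500)
for outpoint, output in utxo_set.items():
    print(outpoint[:8] + "..." + outpoint[-2:], output)
```

The 3 bitcoin UTXO disappears from the set and two new entries appear: `txid:0` pays the 10k satoshi and `txid:1` returns the change, minus the assumed fee.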

Bitcoin Script: concludes this roundup of definitions and is the scripting language used in Bitcoin to define the conditions under which funds are released in a transaction. Put another way, it determines the rules that must be met in order for the funds in a transaction to be spent.

Bitcoin Scripts consist of two parts:

Locking scripts: these are embedded in transaction outputs and set the conditions required to spend an output.
Unlocking scripts: these are embedded in the inputs, fulfilling the conditions set by the locking script.

To simplify with an example, if you had an output locked by a locking script that said:

3 + x = 5

It is easy to see that we can spend it using the unlocking script 2 in a transaction input.

Anyone validating this transaction would concatenate our unlocking script (2) with the locking script (3 + x = 5) and obtain an affirmative answer, allowing the output to be spent. Of course, real scripts do not rely on basic arithmetic: in practice spending requires demonstrating knowledge of a secret, and here we return to the concept of a digital key.
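The concatenate-and-evaluate idea fits in a very small stack machine. What follows is a toy of my own with only two operations, `ADD` and `EQUAL`, loosely modelled on how Bitcoin Script works. It is not the real interpreter.

```python
def run_script(unlocking, locking):
    """Evaluate unlocking + locking on a stack; True means the output can be spent."""
    stack = []
    for token in unlocking + locking:
        if isinstance(token, int):
            stack.append(token)  # numbers are simply pushed
        elif token == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        else:
            raise ValueError(f"unknown operation: {token}")
    return bool(stack) and stack[-1] is True


# Locking script for "3 + x = 5": the spender has to supply x.
locking_script = [3, "ADD", 5, "EQUAL"]

print(run_script([2], locking_script))  # x = 2 satisfies the condition
print(run_script([4], locking_script))  # x = 4 does not
```

The validator never needs to know who is spending: it only checks that the two scripts, executed one after the other, leave a true value on the stack.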

Now that we have analysed the lock and unlock scripts, let us do the pizza example again, trying to get closer to how it actually works:

Kris pays musclesatz 10k satoshi to buy a pizza.
The simplest locking script requires a musclesatz signature to unlock the funds.
The script is something like <signature> <pubkey> CHECKSIG.

CHECKSIG takes two elements, a signature and a public key, and verifies that the signature was produced by the private key matching that public key, i.e. mine. The public key is already in the locking script; what is missing is the musclesatz signature corresponding to that public key. Only I possess (or should possess) the private key. Only I can generate a valid signature that allows me to spend those satoshi.

I must provide an unlock script containing my digital signature. The result of this operation will be TRUE!

<musclesatz signature> <musclesatz pubkey> CHECKSIG

There are of course other, much more complex types of scripts. Here are a few examples.

What is a payment channel? #

Lightning Network is a peer-to-peer network of payment channels implemented as smart contracts (don’t think ill of the term) on the timechain of Bitcoin. But this definition would be reductive because it is also a communication protocol that defines how the participants in this network execute these smart contracts.

A payment channel is a relationship between two nodes on Lightning. This relationship allows a balance (in millisatoshi) to be defined between these two Lightning nodes.

A Lightning node is software capable of speaking the LN protocol. Lightning nodes have three basic characteristics:

They are wallets that send and receive payments over the Lightning network.
They must communicate peer-to-peer with other nodes.
They must have access to the timechain to protect the funds used for payments.

The communication channel is protected by a cryptographic protocol that guarantees fairness, making it a de facto fairness system. This protocol is established when the participants commit funds to a shared 2-of-2 multisig address. Contributing to a 2-of-2 address implies that both parties must agree on how those funds are spent.

In a 2-of-2 multisig address, each participant holds a private key. This means that neither participant can authorise a transaction independently; both signatures must be present and verified to ensure that the transaction is carried out correctly.

Kris and I negotiate a series of private transactions that spend this balance, but without ever posting them on the timechain. The latest transaction in this sequence represents the current state of the channel and defines how the balance is split between Kris and me.

Doing a Lightning transaction is equivalent to moving a part of the balance to me if I am paid, to Kris if I have to pay him.

Any shifting of funds to one side or the other is handled by a smart contract set up to penalise a channel member who tries to publish an old state, which belongs to the past and is therefore no longer valid.
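Stripped of all cryptography, a channel is just a shared balance that moves back and forth while the total stays fixed. The `Channel` class below is a minimal sketch of that idea, with names and amounts invented for the example. Signatures, commitment transactions and penalties are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    balance_msat: dict  # peer name -> balance in millisatoshi
    state_number: int = 0

    @property
    def capacity_msat(self) -> int:
        return sum(self.balance_msat.values())

    def pay(self, sender: str, receiver: str, amount_msat: int) -> None:
        """Move part of the balance from one side to the other: a new channel state."""
        if amount_msat <= 0 or self.balance_msat[sender] < amount_msat:
            raise ValueError("not enough balance on the sender's side")
        self.balance_msat[sender] -= amount_msat
        self.balance_msat[receiver] += amount_msat
        self.state_number += 1  # every payment replaces the previous state


channel = Channel({"kris": 100_000_000, "musclesatz": 0})
channel.pay("kris", "musclesatz", 30_000_000)  # Kris buys pizza
channel.pay("musclesatz", "kris", 10_000_000)  # I pay Kris back for something
print(channel.balance_msat, "state", channel.state_number)
print("capacity is unchanged:", channel.capacity_msat)
```

Nothing in this toy touches the timechain: only the split of the balance and the state number change.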

Payment Forwarding #

When multiple participants in the network have multiple payment channels, payments can be forwarded (remember routing?) from one channel to another, effectively defining a path in the network that connects multiple channels.

We have said that channels are constructed from multisig addresses; what we have not said is that channel balance update transactions are pre-signed Bitcoin transactions. This implies that the trust needed to make LN work comes from the trust of the decentralised network par excellence: layer 1, Bitcoin.

What’s the point?

Lightning is an application on top of Bitcoin that makes use of Bitcoin transactions and its scripting language. It is a creative and clever way to enable an arbitrary amount of instant payments at very low fees, without the need to trust anyone but Bitcoin itself.

Returning to the topic of payment channels, let’s talk about the possible limitations:

The time taken by the internet to transfer a few hundred bytes (negligible).
The capacity of the channel, i.e. the amount of bitcoin committed when opening the channel.
The upper limit on the size of a Bitcoin transaction: since each Lightning payment still in progress is represented inside a Bitcoin transaction, the transaction size limit caps the number of payments that can be in flight simultaneously on a single payment channel.

The funding transaction #

The founding element of a channel, as we have said, is a 2-of-2 multisig address. One of the peers involved in opening the payment channel can fund it by sending satoshi to the multisig address. This transaction is called a funding transaction and cannot be distinguished on the timechain from any other transaction. It can only be identified as a Lightning channel at settlement time, i.e. when the channel is closed.

The amount deposited with the funding transaction is called the channel capacity and defines the maximum amount that can be sent on that funded channel.

Achtung. The channel capacity does not define the upper limit of how much value can flow through the channel over time, because, I remind you, funds can be sent in one direction but also in the opposite direction.

The refund transaction #

Suppose I now set up a payment channel with Kris. Kris, however, is mischievous. He made me deposit funds in a multisig 2-of-2 address but now refuses to co-operate with me and sign his transaction.

How do I unlock the funds?

To avoid these hassles I need to create a refund transaction in advance that spends from the multisig address, refunding my satoshi. I need to have this refund transaction signed by Kris before I broadcast the funding transaction, i.e. before funding the channel.

Now I am protected!

The refund transaction is one of the transactions belonging to the class of commitment transactions.

The commitment transaction #

This transaction is an agreement between the peers in the channel that pays each peer its balance, ensuring that the two do not have to trust each other.

By signing the commitment transaction, you commit to the balance present at this exact moment on the channel. If I want to retrieve my funds from the channel, I can do so at any time, precisely because of the signing of this contract.

By the way, every time the balance of the channel ‘changes’, these commitment transactions are created, updating the new status of the channel and dividing the balance between what is due to me and what is due to my peer.

If my channel partner disappears? No problem. If my channel partner refuses to cooperate? No problem. If my partner tries to cheat me? No problem.

For now, the answer ‘no problem’ may appear vague and meaningless; later we will analyse in detail what I have stated.

Can you cheat? #

Back to my friend Kris, I open a channel with him by depositing 100k satoshi in a multisig 2-of-2 address. We exchange signatures and I transmit the transaction on the timechain.

We said that commitment transactions are also created each time the channel balance changes, so if we assume I send 30k to Kris, the new commitment transaction will say that the address pays 70k sats to me and 30k sats to Kris.

But now I have two commitment transactions: the first defines the initial state at time t0, with 100k sats for me, and the second at time t1 –representing the current state– with 70k sats for me.

An unhealthy idea occurs to me: but if I published my previous 100k sats commitment transaction, would that mean that the address would now pay me 100k satoshi?

Bitcoin is resistant to censorship and nothing prevents me from publishing an old state that is no longer valid. Nothing except cryptography, of course.

To prevent this kind of theft, commitment transactions are made so that if an old one is transmitted, voluntarily or not, I can be punished.

This is the disincentive of game theory, just like in the example of the little brothers with the plate of chips.

My disincentive to cheat is very high, because the moment I post an old channel status, Kris can club me and has the opportunity to claim the entire deposited balance of the address.

If I had very few satoshi I would have more incentive to cheat because my disincentive is so low, I can lose a few satoshi at most. This would not be painful. To solve this problem LN requires each participant to have a minimum balance in the channel called skin in the game.

We will analyse later how the penalty works because I will have to introduce the concepts of timelock delay and revocation secret.

Let’s announce the channel! #

How do I notify the entire network of the presence of my new payment channel? How do I make it public?

There is a protocol, called gossip, for communicating the existence, capacity and fees of my channel to other nodes.

The good thing about announcing a channel is that it becomes usable by other nodes for routing payments, even earning me some routing fees. On the other hand, however, an unannounced channel has a certain degree of privacy, at least until the channel is closed on the timechain.

Speaking of channel closure, when do you think it is good to close an LN channel?

In general, it tends to be better never to close channels.

Closing them involves running an on-chain transaction, fees, revealing the presence of a channel, but most importantly…it doesn’t make that much sense except for specific reasons.

Keeping channels open is also good because the moment I run out of sending capacity, I can still receive thanks to ‘rebalancing’.

A channel can be closed in three different ways:

With an agreed closure, which is the correct way to do it:
- My LN node informs my peer’s LN node of my intention to close.
- Both nodes work towards closure.
- No more new routing attempts are accepted, while ongoing ones are resolved.
- The nodes prepare for the close transaction: the last state is encoded to define how much balance is to be assigned to each peer.
- Agreement is reached on how to divide the closing fees on the timechain.
- Each channel partner receives its share of the remaining balance.
With a forced closure, which is the wrong way:
- I attempt to close the channel without the consent of the other participant.
- I publish the last commitment transaction of my node.
- After confirmation, there will be two expendable outputs: one mine, the other of the peer.

This is where the issue starts to get trickier, because if I have forcibly closed the channel my output will be locked by a timelock delay and I will not be able to spend my bitcoins until some future time (the delay is measured in blocks on the timechain and can be as long as about two weeks).

With a protocol violation, a very very bad way:

A protocol violation occurs when I attempt to cheat by publishing a commitment transaction that represents an old channel state. In order for my peer to notice this cheating attempt, they have to be online and keep an eye on the new blocks on the timechain and the transactions on them.

Although I have posted an old state, I have a timelock preventing me from spending the balance, so there is time for my peer to act and punish me. By sanctioning me, my peer is able to withdraw the full amount deposited (can you see the disincentive in trying to cheat?) and the closing becomes very very quick because there is no closing negotiation.

To end on a high note, my peer wants the transaction that punishes me to be accepted as soon as possible in a block, so he is also willing to pay the maximum fees on L1: he can pay them out of my part of the balance 😁.

But what if the timelock has expired and he has not noticed that I have posted an old status?

Unfortunately he will lose the funds, either totally or according to the last commitment status I managed to publish and complete, exceeding the timelock delay.
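The three possible endings of a unilateral close can be summarised in one small function. It is a simplified sketch: fees are ignored, the 144-block delay is a hypothetical value, and the names `State` and `settle` are mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    number: int
    to_me_sat: int
    to_peer_sat: int


def settle(published, latest, peer_reacts_after_blocks, delay_blocks):
    """Return (my payout, peer payout) after I publish a commitment unilaterally.

    peer_reacts_after_blocks is None when the peer (or a watchtower) never notices.
    """
    total = published.to_me_sat + published.to_peer_sat
    is_old_state = published.number < latest.number
    caught_in_time = (
        is_old_state
        and peer_reacts_after_blocks is not None
        and peer_reacts_after_blocks < delay_blocks
    )
    if caught_in_time:
        return (0, total)  # penalty: the peer takes the whole channel
    # Honest force close, or a cheat nobody punished before the timelock expired.
    return (published.to_me_sat, published.to_peer_sat)


t0 = State(0, 100_000, 0)       # right after funding
t1 = State(1, 70_000, 30_000)   # after I paid Kris 30k sats
DELAY = 144                     # hypothetical timelock delay, in blocks

print(settle(t1, t1, None, DELAY))  # forced closure with the latest state
print(settle(t0, t1, 10, DELAY))    # old state, Kris reacts after 10 blocks
print(settle(t0, t1, None, DELAY))  # old state, nobody is watching
```

The last line shows why being online, or delegating to a watchtower, matters: without anyone watching, the stale state simply stands.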

How do we detect a protocol violation against us?

With a properly managed Lightning node running 24/7 or with a personal or third-party watchtower.

A watchtower is a security service that allows the channel to be monitored when the user is offline. If suspicious activity is detected, it can intervene by protecting funds.

Invoice #

Most Lightning payments begin with an invoice generated by the person who is to collect, the recipient of the payment. This invoice contains within it the essential information for making a payment:

The payment hash.
The recipient.
The actual amount and possibly an optional description.

The payment hash is created by the recipient of the payment, who chooses a random value in a secure and unpredictable way and feeds it to a hash function. We call this random value the Pre-Image.

H = SHA-256(Pre-Image)

We have a payment hash. From the properties of hashes we know it cannot be inverted or brute-forced, so no one can figure out what the Pre-Image is from the hash. The Pre-Image is a secret and, once revealed, anyone with the hash can verify that the Pre-Image was indeed the secret.
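In Python the formula above takes two lines. The sketch uses the standard `secrets` module for the unpredictable value and a 32-byte Pre-Image. The helper name `reveals_secret` is my own.

```python
import hashlib
import secrets

# The recipient picks the secret in an unpredictable way...
preimage = secrets.token_bytes(32)
# ...and only the hash goes into the invoice.
payment_hash = hashlib.sha256(preimage).digest()


def reveals_secret(candidate: bytes, payment_hash: bytes) -> bool:
    """Anyone holding the hash can check a revealed Pre-Image."""
    return hashlib.sha256(candidate).digest() == payment_hash


print("payment hash:", payment_hash.hex())
print("real Pre-Image accepted:", reveals_secret(preimage, payment_hash))
print("random guess accepted:", reveals_secret(secrets.token_bytes(32), payment_hash))
```

Checking a candidate is a single hash computation, while guessing the secret from the hash is hopeless in practice.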

The point is that the payment hash allows the payment to travel over multiple channels atomically: it either travels all the way to its destination or fails. There is no middle ground.

Invoices are generally sent outside of Lightning, using any communication mechanism. A very popular one is QR code for its convenience and compactness. The QR contains all the information discussed above.

Some more information on invoices:

They have an expiry date so that the recipient does not have to keep all the Pre-Images. When an invoice is paid or expires, it can be deleted.
They can contain routing hints that allow the sender to use unannounced channels to construct a path to the receiver (shadow channels, we will discuss this later in the article).
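To see why expiry lets the recipient forget Pre-Images, here is a toy invoice store. It is only a sketch: real invoices are encoded strings with a signature, while this `Invoice` class and the functions around it are invented for the illustration, and timestamps are plain numbers.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    payment_hash: bytes
    recipient: str
    amount_msat: int
    description: str
    created_at: int       # seconds on whatever clock the recipient uses
    expiry_seconds: int

    def is_expired(self, now: int) -> bool:
        return now >= self.created_at + self.expiry_seconds


def create_invoice(preimages, recipient, amount_msat, description, now, expiry_seconds=3600):
    """Create an invoice and remember its Pre-Image until it is paid or expires."""
    preimage = secrets.token_bytes(32)
    invoice = Invoice(hashlib.sha256(preimage).digest(), recipient,
                      amount_msat, description, now, expiry_seconds)
    preimages[invoice.payment_hash] = (preimage, invoice)
    return invoice


def forget_expired(preimages, now):
    """Drop the Pre-Images of invoices that can no longer be paid."""
    expired = [h for h, (_, inv) in preimages.items() if inv.is_expired(now)]
    for payment_hash in expired:
        del preimages[payment_hash]


preimages = {}
invoice = create_invoice(preimages, "musclesatz", 10_000_000, "one pizza", now=0)
print("expired after 10 minutes?", invoice.is_expired(now=600))
forget_expired(preimages, now=4_000)
print("Pre-Images still stored:", len(preimages))
```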

Pathfinding, routing #

These two terms are often confused:

Pathfinding consists of finding the best path from the source to the destination.
The use of this path is called routing.

Lightning uses a source-based protocol for pathfinding and an onion-routed protocol for routing.

The standard way to search for a route is to test various routes iteratively until one is found with sufficient liquidity to allow forwarding payments. It may not be the method that minimises routing fees, but all in all it works decently.

It would be great (or maybe not, it depends) if we had the exact balances of each channel available, because then the pathfinding would be solvable by any student of a university course in operations research. This is not the case: the balances are not and cannot be known to the participants of the network.
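The trial-and-error approach can be sketched as follows. The search only sees what gossip announces (the capacities), while the real balances stay hidden and are discovered by failing. Everything here is an assumption made for the illustration: the node names, the numbers, the fewest-hops search and the fact that fees are ignored.

```python
from collections import deque


def find_path(capacities, source, target, amount, excluded):
    """Breadth-first search: fewest hops, using only public capacities."""
    neighbours = {}
    for (a, b), capacity in capacities.items():
        if capacity >= amount:
            neighbours.setdefault(a, []).append(b)
            neighbours.setdefault(b, []).append(a)
    queue, visited = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in visited and (path[-1], nxt) not in excluded:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def pay(capacities, balances, source, target, amount):
    """Try a path; if a hop lacks liquidity, exclude it and search again."""
    excluded, attempts = set(), []
    while True:
        path = find_path(capacities, source, target, amount, excluded)
        if path is None:
            return None, attempts
        attempts.append(path)
        failing = next(((a, b) for a, b in zip(path, path[1:])
                        if balances[(a, b)] < amount), None)
        if failing is None:
            return path, attempts
        excluded.add(failing)


# Public information: channel capacities.
capacities = {("me", "bob"): 100, ("bob", "kris"): 100, ("me", "carol"): 100,
              ("carol", "dave"): 100, ("dave", "kris"): 100}
# Hidden information: how much each side can actually send.
balances = {("me", "bob"): 80, ("bob", "me"): 20, ("bob", "kris"): 5, ("kris", "bob"): 95,
            ("me", "carol"): 60, ("carol", "me"): 40, ("carol", "dave"): 70,
            ("dave", "carol"): 30, ("dave", "kris"): 50, ("kris", "dave"): 50}

path, attempts = pay(capacities, balances, "me", "kris", 30)
print("attempts:", attempts)
print("successful path:", path)
```

The shortest route through bob fails because bob can only send 5 towards kris, so the payment ends up on the longer route through carol and dave.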

Turning for a moment to routing, Lightning was very much inspired by the famous Tor network. The protocol is not exactly Tor’s because it only reuses the concept; on Lightning, the routing protocol is called Sphinx and works with exactly the same onion analogy as Tor: the sender builds the entire onion from the core to the outermost layer.

The payment information for the recipient is encrypted with a key that only the recipient can decipher; this information is the core of the whole routing operation. In the chapter on routing, we will analyse the construction of the onion step by step, also from a cryptographic point of view.

To give a general idea, the payment is constructed in the shape of an onion, starting with the receiver and adding a new layer to the onion by proceeding backwards along the path found, thus from the receiver to the sender. The first layer of the onion, the outermost one, is meant for the sender’s channel peer, which receives the onion packet to be forwarded into the network.

When the onion is sent, each node will only be aware of the node from which it received the onion and the node to which it will pass it, with no idea who the sender and receiver are.

A small clarification: you may be thinking that the packet is actually peeled like an onion (does an onion peel?).

In reality, when each node reads the part it is responsible for, it also adds a cryptographic ‘filler’ to bring the size of the message back to the original size intended by the sender (1300 bytes). This little game is done to protect privacy by preventing intermediate nodes from deducing information about the length of the path or the number of nodes involved, increasing overall security.

These small onions (small because they fit into a single TCP/IP packet) are constructed to have the same length throughout the routing path.
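The constant-size trick is easier to believe after seeing it work. The following is a toy onion, not Sphinx: the keystream is SHA-256 in counter mode instead of a real stream cipher, the sender simply knows a key for each hop (no key exchange), there are no integrity checks, and the 65-byte slot per hop is a size I picked for the example.

```python
import hashlib
import secrets

PACKET_SIZE = 1300  # constant size mentioned in the text
HOP_SIZE = 65       # hypothetical fixed slot for one hop's instructions


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (assumption): SHA-256 in counter mode."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def build_onion(hops):
    """hops: (key, instruction) pairs in path order; wrapped from the last hop backwards."""
    packet = secrets.token_bytes(PACKET_SIZE)  # random core
    for key, instruction in reversed(hops):
        slot = instruction.encode().ljust(HOP_SIZE, b"\0")
        packet = slot + packet[:PACKET_SIZE - HOP_SIZE]  # shift right, drop the tail
        packet = xor(packet, keystream(key, PACKET_SIZE))
    return packet


def peel(key, packet):
    """Read my own instruction and return a packet of the same size for the next hop."""
    padded = packet + bytes(HOP_SIZE)  # the filler
    clear = xor(padded, keystream(key, PACKET_SIZE + HOP_SIZE))
    instruction = clear[:HOP_SIZE].rstrip(b"\0").decode()
    return instruction, clear[HOP_SIZE:]


keys = {name: secrets.token_bytes(32) for name in ("bob", "carol", "kris")}
instructions = ["forward to carol", "forward to kris", "final hop: 10000 sat"]
packet = build_onion(list(zip(keys.values(), instructions)))

received, sizes = [], [len(packet)]
for name in keys:
    instruction, packet = peel(keys[name], packet)
    received.append(instruction)
    sizes.append(len(packet))
    print(f"{name} reads: {instruction!r}, forwards {len(packet)} bytes")
```

Each node removes its own slot from the front and pads the back, so the packet it passes on is exactly as long as the one it received and reveals nothing about its position in the path.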

Onion forwarding #

I have to forward the onion. I finally find a viable route.

I forward the message to my peer, and from there each node processes one layer of the onion. Basically, each node receives a Lightning message called update_add_htlc with a hash of the payment and the onion. A payment forwarding algorithm then intervenes and performs these operations:

Decrypts the outer layer and checks the integrity of the message.
Confirms that it can satisfy routing suggestions based on fees and outgoing capacity.
Updates the status on the incoming channel.
Adds the famous filler data to keep the onion length constant.
It in turn routes the onion on its outgoing payment channel by sending an update_add_htlc which includes the same payment hash and onion.
It works with its channel peer to update the channel’s status.

But if there is a generic error in this process what happens?

What happens is that the communication propagates in the opposite direction back to the sender with an update_fail_htlc error message. Every node involved in routing also sees this message.
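The forward trip and the error coming back can be simulated with a short script. It is a sketch with invented numbers: each intermediate node charges a flat fee (an assumption of mine), balances are only read and never updated, and the onion is left out entirely. Only the message names `update_add_htlc` and `update_fail_htlc` come from the protocol.

```python
def hop_amounts(path, fee_msat, amount_msat):
    """Amount to lock on each hop, computed backwards from the receiver."""
    amounts = [amount_msat]
    for node in reversed(path[1:-1]):  # intermediate nodes only
        amounts.append(amounts[-1] + fee_msat[node])
    return amounts[::-1]


def send(path, balance_msat, fee_msat, amount_msat):
    """Walk the path hop by hop and log the messages exchanged."""
    hops = list(zip(path, path[1:]))
    log = []
    amounts = hop_amounts(path, fee_msat, amount_msat)
    for index, ((a, b), amount) in enumerate(zip(hops, amounts)):
        if balance_msat[(a, b)] < amount:
            # The failure travels back towards the sender, one hop at a time.
            for c, d in reversed(hops[:index]):
                log.append(f"{d} -> {c}: update_fail_htlc")
            return False, log
        log.append(f"{a} -> {b}: update_add_htlc ({amount} msat)")
    return True, log


path = ["me", "bob", "carol", "kris"]
fee_msat = {"bob": 10, "carol": 20}  # hypothetical flat fees
balance_msat = {("me", "bob"): 50_000, ("bob", "carol"): 50_000, ("carol", "kris"): 5_000}

ok, log = send(path, balance_msat, fee_msat, 10_000)
print("payment succeeded:", ok)
for line in log:
    print(line)
```

Here carol cannot forward 10,000 msat to kris, so the two hops already set up are unwound in reverse order until the error reaches me.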

If you’re asking questions like: what is an HTLC? Why should every node know about the failure and receive this error message?

You will soon have all the answers you seek.

(optional) if you want to learn more about encrypting peer-to-peer communications in Lightning, I recommend visiting the Noise Protocol Framework website.

Channel backup #

More or less everyone is familiar with Bitcoin’s BIP-39, which allows us to retrieve the state of a wallet via a mnemonic. For those who don’t remember, Bitcoin Improvement Proposal 39 allows us to generate a sequence of English-language words from a public list that serves as the ‘seed’ for generating a deterministic wallet, with an almost infinite list of public and private keys.

But how does one make a backup in Lightning?

Lightning wallets also use the BIP-39 mnemonic backup, but only for the on-chain part. This is essential to understand.

A further level of backup is needed for the channels, and in principle it would have to be refreshed whenever there is a change of state in the channel. This should make us ask questions, because if by bad luck an old (already revoked) commitment transaction is restored and published, our channel counterpart will think we are trying to cheat, punishing us with a penalty transaction and emptying the address. The usual compromise is called Static Channel Backup (SCB): it stores only the static information needed to ask the counterpart to close the channel, not the channel state itself.

Another aspect to consider is that SCB backups must be encrypted in order to maintain a high level of privacy and security: if I lost control of an unencrypted backup, anyone could use it not only to see my payment channels, but also to close them in order to hand over the balance to my counterpart.

Sweep #

What do I do if the balance in my Lightning wallet becomes too large and I want to reduce the risk?

I can do a sweep, of various types:

on-chain: I move funds from the LN wallet to a Bitcoin wallet by cooperatively closing the channel; we have seen this before.
off-chain: this mode involves running a second, unannounced LN node on the network. I use it as a piggy bank, regularly moving funds to this ‘hidden’ node, which, I remind you, is still a hot wallet.
submarine swap: an exchange between off-chain and on-chain funds. It is atomic, meaning that if I initiate a submarine swap and send the balance of an LN channel, the other party sends me on-chain bitcoins in return, or nothing moves at all.

Submarine swap #

I think it’s worth looking into submarine swaps because they often cause a lot of confusion:

Background:

musclesatz has bitcoins on-chain and wants to receive funds on LN (off-chain).
Kris has funds on LN (off-chain) and wants to receive bitcoins on-chain.
Mark is someone between musclesatz and Kris who coordinates the trade.

Kris generates a Pre-Image (secret) and hashes this secret, embedding it in an invoice for me.
I execute an on-chain transaction to a contract with a clause saying that these bitcoins are redeemable by providing the Pre-Image of the hash of Kris’s invoice and making a valid signature from Mark, to make sure no one but Mark can take them.
Mark sees this contract and knows that to redeem these bitcoins, he must necessarily pay Kris, because only then will Kris reveal the Pre-Image to him.
Kris receives payment from Mark and then reveals the Pre-Image to him.
Mark uses this Pre-Image and a valid signature to move the bitcoins to an address he controls.
I am satisfied that Kris was paid, because Mark was able to move the funds in the contract: he therefore knew the Pre-Image, and he can only have learned it from Kris, who must necessarily have received payment.

If Kris had not been paid by Mark, Mark would not have been able to claim the bitcoins in the contract. Thanks to a particular clause in this contract (this kind of contract is called a Hashed Timelock Contract, HTLC; we will see them in detail later), after a certain interval of time I would have been able to take back my bitcoins, because there was also a timelock in my favour.

This is the summary of everything I have explained in terms of Bitcoin Script.

OP_SIZE 32 OP_EQUAL
OP_IF
    OP_HASH160 <ripemd160(swapHash)> OP_EQUALVERIFY
    <receiverHtlcKey>
OP_ELSE
    OP_DROP
    <cltv timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP
    <senderHtlcKey>
OP_ENDIF
OP_CHECKSIG

A couple of comparisons with Bitcoin #

Lightning is built on Bitcoin, so far so good. It inherits some features and properties, but there are also some important differences:

Address and invoice: a Bitcoin address can be reused any number of times (although this is not recommended for privacy). LN invoices are single-use, for a specific amount (there is an exception to this, the keysend mechanism, which we will see later).

UTXO selection and pathfinding: to make a payment on Bitcoin I have to spend at least one UTXO, whereas an LN payment does not consume an output because, as we have seen, it only shifts the balance held at the multisig address. What I need instead is a path of channels leading to the recipient.
Mining and routing fees: on Bitcoin we pay fees to miners to include our transaction in a block, while on LN users pay fees to the other users who route their payments through their channels. This fee is composed of a base fee, a fixed component paid for each routed payment (each channel can have its own), and a fee rate, the variable component, proportional to the value of the payment (another difference: on timechain the fee is not proportional to the value).
Bitcoin public transactions and Lightning private payments: the heart of the article. Timechain is public, Lightning payments are not.
Satoshi and millisatoshi: on timechain the smallest unit is the satoshi, while on LN we also have thousandths of a satoshi. When the Lightning channel is settled on-chain, millisatoshi are rounded down to whole satoshi.
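To make the routing fee concrete, here is a minimal sketch of the calculation, assuming the fee rate is expressed in parts per million of the forwarded amount. The function name and the example policy (1 sat base fee, 100 ppm) are my own choices for illustration, not values taken from a real node.

```python
def routing_fee_msat(amount_msat: int, base_fee_msat: int, fee_rate_ppm: int) -> int:
    """Fee charged by one routing node to forward amount_msat (all in millisatoshi)."""
    # Variable component: proportional to the amount being forwarded.
    proportional_msat = amount_msat * fee_rate_ppm // 1_000_000
    # Fixed component: paid once per forwarded payment, whatever the amount.
    return base_fee_msat + proportional_msat


# Hypothetical policy: 1 sat base fee and 100 ppm, applied to a 50k sats payment.
amount_msat = 50_000 * 1000
fee_msat = routing_fee_msat(amount_msat, base_fee_msat=1_000, fee_rate_ppm=100)
print(f"fee: {fee_msat} msat")
```

Note how the fee is counted in millisatoshi: routing a small payment can cost less than one whole satoshi, something that is simply not expressible on timechain.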

Lightning network software #

I started with the idea of leaving this chapter blank, so as not to make the article too heavy. If you are interested in knowing in detail and in a technical way how to set up a development environment to run Lightning let me know, you can contact me on X/Twitter or on telegram.

Lightning is not a product or a company but a set of open standards that define a common line of interoperability. There is no reference implementation to keep track of, as there is with Bitcoin Core; the standard is defined through a set of directives called Basis of Lightning Technology (BOLT), which you can find on GitHub.

Since there is no consensus to preserve as on timechain, anyone can build on top of the core directives, and if the new features prove successful they can become part of BOLT.

The heart of Lightning #

Lightning consists of a complex set of protocols running on the Internet. I will classify them into 5 different layers, where each layer uses (and abstracts) the layer below it.

Achtung. All sections of this chapter have been deliberately simplified by removing mathematical demonstrations or overly technical details.

Layer
Network connection: defines the protocols that allow interaction in the network
Messaging: defines the protocols useful for formatting or encoding messages
Peer-to-Peer: defines the communication protocols between various LN nodes
Routing: defines path discovery and message routing protocols
Payment: defines the invoice payment interface

Payment Channels #

Reference table: peer-to-peer level, channel open & close and channel state machine.

To understand the concept behind how payment channels work, it is essential to ask yourself a question:

What does it mean to own Bitcoins?

Owning Bitcoins means owning a private key of a Bitcoin address that has at least one UTXO. This key allows me to sign a transaction and legitimises the fact that I am the owner of that balance, because nobody else knows about it.

But ownership may not always be in the hands of a single person. Bitcoin also allows multisig addresses where you need multiple private keys to sign a transaction. An easy to understand multisig scheme is the 2-of-3: it means that 2 out of 3 people are needed to sign a transaction and spend the balance of that address. The number 2 in this case is called a quorum.

But what if in a 2-of-2 scheme one of the parties does not cooperate?

There is no quorum, so the funds cannot be spent. Such a scheme would not be considered fair, and in fact there is a way to prevent this scenario: the refund transaction, already analysed at the beginning of the article. One has to look at the refund transaction as a prenuptial agreement. Before funding a 2-of-2 address, I make sure I have an exit plan and cleanly separate my funds from the counterparty’s funds.

Let’s go back to the channel construction for a moment. What happens is that to create a channel with Kris, our two nodes must establish an Internet connection in order to start trading. Each node is identified by a public key in hexadecimal format, generated from a private root key held within the node. But that is not enough. We also need a network address to be reached and here we have two choices: TCP/IP or Tor.

We are then defining a node identifier which comes in the form ID@Address:Port but is still difficult to read. It would be better to embed it all in a QR code, wouldn’t it?

Just scan it and that’s it, we have the two nodes connected 😉
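For the curious, this is roughly what a wallet does with the string hidden inside that QR code. It is only a sketch: `NodeUri` and `parse_node_uri` are names I made up, the node ID in the example is fake, and I assume the conventional Lightning port 9735 when no port is given. It handles hostnames, IPv4 and .onion addresses, not bracketed IPv6.

```python
from typing import NamedTuple

DEFAULT_PORT = 9735  # conventional Lightning port, assumed when the port is omitted


class NodeUri(NamedTuple):
    node_id: str  # compressed public key, hex encoded
    host: str
    port: int


def parse_node_uri(uri: str) -> NodeUri:
    """Split an ID@Address:Port string into its three parts."""
    node_id, sep, address = uri.partition("@")
    if not sep:
        raise ValueError("expected ID@Address:Port")
    # A compressed public key is 33 bytes (66 hex characters) and starts with 02 or 03.
    if len(node_id) != 66 or node_id[:2] not in ("02", "03"):
        raise ValueError("node ID must be a 33-byte compressed public key in hex")
    bytes.fromhex(node_id)  # raises ValueError if it is not valid hex
    host, sep, port = address.rpartition(":")
    if not sep:  # no port given
        host, port = address, str(DEFAULT_PORT)
    return NodeUri(node_id.lower(), host, int(port))


# Fake node ID, used only to show the format.
example = "02" + "ab" * 32 + "@203.0.113.7:9735"
print(parse_node_uri(example))
```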

Now that the nodes are connected we can start thinking about building the payment channel and this is done by exchanging six messages (three for each peer) between our respective nodes:

open_channel / accept_channel: I send Kris my capabilities and expectations with an open_channel message, if Kris accepts my request he will respond with an accept_channel.
funding_created / funding_signed: I want to avoid being cheated so I create both the funding transaction with funding_created and the refund transaction to protect myself from possible cheating. If it’s OK with Kris, he’ll reply with a funding_signed. I am now comfortable transmitting my funding transaction (on-chain) to create and anchor the payment channel. This is not an instantaneous operation because we are operating on timechain so we will wait for confirmation of the lock.
funding_locked / funding_locked: as soon as the transaction has sufficient confirmations (defined in the initial accept_channel message) Kris and I will exchange a funding_locked message to start sending Lightning transactions.

During the channel construction I did something quite unusual: I built two chained transactions. How did I do this if the funding transaction had not even been transmitted on the timechain?

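Thanks to a feature called Segregated Witness (SegWit), introduced in Bitcoin in 2017, signatures are no longer part of the data that is hashed to obtain the transaction ID. The ID of the funding transaction is therefore known, and cannot change, before the transaction is signed and transmitted. This allows me to chain transactions that have not been transmitted yet.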

Yes, what I wrote sounds like bullshit but it’s not, let me explain.

What I mean is that a refund transaction is valid if it is transmitted on the timechain and if the input of this transaction has both my signature and Kris’s signature. Even if my node has not yet transmitted the funding transaction I can in the meantime construct the refund transaction by computing the hash of the funding transaction and refer to it as input in the refund transaction. I know in advance that the reference will be valid because I have calculated the hash of it.

Now the channel is set, but all the liquidity is on my side. That means I can send satoshi to Kris, but Kris has no funds to send me. If I send some of my sats, the state of the channel changes and a new commitment transaction is created, with the channel balances updated.

The history of all commitment transactions should not be misleading and make one think of possible double spending. Only one of these transactions can be confirmed on the timechain. We rely on Bitcoin’s ability to prevent double spending.

Let’s assume that Kris and I start transacting satoshi and then we have generated a lot of commitment transactions. We come to a point where we have this balance in the channel:

If I want to close the channel by transmitting and confirming the commitment transaction I have, I cannot spend the balance for 400 blocks, whereas Kris can do so immediately. Of course the reverse is also true.

Why is there this timelock delay and what is it for?

We mentioned it earlier. It gives Kris the time to exercise a penalty if I broadcast an old commitment transaction to cheat him and steal sats. This timelock delay is negotiated in the channel building messages.

But what happens if I publish an old channel status and how can I be punished?

Every time the channel state is updated with a new commitment, each of us hands the other a cryptographic secret, called a revocation secret, tied to his own previous state. If I try to fool Kris by publishing an old state, he uses the revocation secret of that state: it is cryptographic proof that the state has been revoked and that I am breaking the rules, because I am closing the channel with an invalid state.

What gives Kris the power to punish me is the revocation secret, which gives him mathematical proof of my cheating.
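The game of revoked states can be reduced to a toy model. This is a sketch under heavy assumptions: the `Channel` class and its methods are invented for illustration, there is no real cryptography, and I assume Kris is online and reacts before my timelock delay expires.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    capacity_sat: int
    my_balances: list = field(default_factory=list)  # my balance in each commitment, oldest first

    def update(self, my_balance_sat: int) -> None:
        """A new commitment is created; all the previous ones are now revoked."""
        self.my_balances.append(my_balance_sat)

    def is_revoked(self, index: int) -> bool:
        # Kris holds a revocation secret for every commitment except the latest one.
        return index < len(self.my_balances) - 1

    def i_broadcast(self, index: int) -> dict:
        """Outcome if I publish commitment number `index` on timechain."""
        if self.is_revoked(index):
            # Kris uses the revocation secret during my timelock delay and takes everything.
            return {"me": 0, "kris": self.capacity_sat}
        mine = self.my_balances[index]
        return {"me": mine, "kris": self.capacity_sat - mine}


channel = Channel(capacity_sat=100_000)
channel.update(100_000)  # channel just opened: all the liquidity is mine
channel.update(60_000)   # later I paid Kris 40k sats

print(channel.i_broadcast(1))  # honest close with the latest state
print(channel.i_broadcast(0))  # attempt to cheat with the old state
```

Publishing the old state looks tempting (I would get back 100k instead of 60k), but thanks to the revocation secret it ends with me holding nothing.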

But I don’t want to cheat him; instead, I want to close the channel cooperatively. I start with a shutdown message in which I specify the Bitcoin script that corresponds to the address of my wallet, asking Kris to build a closing transaction that pays my balance there. Kris does exactly the same thing and thereby accepts the cooperative close. The closing transaction is a final transaction that pays each party its balance according to the current state. We can finally settle. I send a closing_signed message in which I propose a transaction fee for the on-chain closing, together with my signature; if Kris agrees, he sends back a closing_signed with the same fee and his own signature. If he does not agree, he proposes a different fee. This cycle goes on until we find a fee that we both agree on.

Channel routing #

Reference table: layer routing, atomic and trustless multihop contracts.

To understand the vast world of routing, we can start with an example set in physical reality.

I have to send 10 coins of a very rare material to Kris, but I have no direct connection with him. Both of us, however, know and have a direct connection with Mark. How do I convince Mark to give 10 coins to Kris without getting cheated and making sure he doesn’t run away? And above all, how do I know that the coins have been handed over to Kris?

One possible solution might be to promise Mark 10 coins if he can prove to me that he has delivered 10 coins to Kris.

But why would Mark sign such a contract for no real benefit?

It’s actually not very convenient, so I could modify it again by promising Mark 11 gold coins if he can prove to me that he delivered 10 to Kris, in this way he would receive a commission of one gold coin, not bad.

There remains the problem of trust, however, so we decide to use a guarantee service called escrow.

Kris needs to be paid, so he generates a secret value R (for simplicity’s sake, let’s assume this value is: hello world!) and hashes it with the SHA-256 algorithm. We’ll call the hash of R the payment hash, and the secret that unlocks the payment the payment Pre-Image.

At this point Kris sends me the payment hash via telegram. I don’t know the secret, but in the meantime I can rewrite my contract with Mark using this payment hash and saying:

Mark, I will reimburse you with 11 coins if you can show me a valid message (a Pre-Image) that matches this payment hash. You can acquire this message by establishing a contract with Kris. To ensure that you will be reimbursed I will lock these coins in an escrow before you establish the contract with Kris.

This contract protects my coins because in the meantime I am locking them in an escrow, but the key thing is that I will pay Mark if he shows me a valid Pre-Image for the payment hash. The Pre-Image is proof that Kris has been paid by Mark.

At this point Mark makes an identical contract with Kris, saying that he will pay him 10 coins if Kris can show a valid message that matches the payment hash. He also guarantees that Kris will be paid after revealing the secret, by locking the 10 coins in escrow.

(But remember that Kris is the recipient and it is he who generated the Pre-Image, so he can show it to Mark and actually get paid).

All parties have a contract.

Kris sends the Pre-Image to Mark, who checks that the secret matches the payment hash and, once this is confirmed, orders the escrow to release 10 coins to Kris. Now Mark provides the Pre-Image to me; I check it and order the escrow service to release 11 coins to Mark.

That’s it, all contracts are resolved. I paid a total of 11 coins, 1 of which was received by Mark as fees and the other 10 went to my recipient Kris.

Why does it work?

It works because, with such a chain of contracts, nobody can run away with the coins: they are locked in escrow and are only released in exchange for the Pre-Image.

What’s the weakness?

If Kris decided not to release his Pre-Image, Mark and I would have had the coins stuck in escrow. But this problem is easily solved by applying a deadline to the contract, a timelock delay.

Lightning transactions are atomic, they either go through or they go KO!

Hash Time-Locked Contracts (HTLC) #

The routing fairness protocol used on Lightning is called a hash time-locked contract (HTLC) and is the one we just described in the example.

HTLCs use the Pre-Image of the payment hash as the secret which unlocks a payment. There is also another routing mechanism called the point time-locked contract (PTLC), which is more efficient and has better privacy; it relies on Schnorr signatures, a signature scheme added to Bitcoin in 2021. Here is some more information.

After reading and understanding the HTLC example, I would like to point out a small problem: each of those contracts can be unlocked by anyone who knows the Pre-Image. What happens if Kris spends both Mark’s HTLC and mine? It wouldn’t be a trustless system at all if that were the case.

The HTLC script must have an additional condition that associates each HTLC with a specific recipient: we do this by requiring a digital signature that matches each recipient’s public key, preventing anyone else from spending that HTLC.

Another small problem to be addressed:

What happens if some node goes offline or is uncooperative? How do I make the payment fail gracefully?

Cooperatively: the HTLC is ‘rewound’ backwards by removing it from the commitment transaction without changing the balance (more on this later)
Time-locked repayment: we have already discussed this, every HTLC includes a repayment clause linked to a timelock delay.
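The three ingredients seen so far (hash, recipient signature, timelock) fit in a few lines. What follows is a simplified model, not real Bitcoin Script: the `Htlc` class is invented, plain names stand in for public keys and signatures, and the block heights are made up.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Htlc:
    payment_hash: bytes
    amount: int
    sender: str         # stands in for the sender's public key
    recipient: str      # stands in for the recipient's public key
    expiry_height: int  # block height after which the sender can take the funds back

    def can_claim(self, preimage: bytes, signer: str) -> bool:
        """Success path: the right Pre-Image AND the signature of the intended recipient."""
        hash_matches = hashlib.sha256(preimage).digest() == self.payment_hash
        return hash_matches and signer == self.recipient

    def can_refund(self, signer: str, current_height: int) -> bool:
        """Timeout path: only the sender, and only once the timelock has expired."""
        return signer == self.sender and current_height >= self.expiry_height


preimage = b"hello world!"  # the secret R generated by Kris
payment_hash = hashlib.sha256(preimage).digest()

# Hypothetical heights: the HTLC closer to the recipient expires first.
to_mark = Htlc(payment_hash, 11, sender="musclesatz", recipient="Mark", expiry_height=1_000)
to_kris = Htlc(payment_hash, 10, sender="Mark", recipient="Kris", expiry_height=900)

print(to_kris.can_claim(preimage, "Kris"))        # Kris redeems his own HTLC
print(to_mark.can_claim(preimage, "Kris"))        # but he cannot touch mine
print(to_mark.can_refund("musclesatz", 1_000))    # I get my coins back after the deadline
```

Knowing the Pre-Image is not enough on its own: without the recipient condition Kris could spend both contracts, which is exactly the problem described above.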

Forwarding of payments #

Reference table: peer-to-peer layer, adding, settling, failing HTLCs.

HTLCs seem to be a prerogative of multi-hop payments, but in reality the Lightning protocol also uses them for ’local’ payments within a channel. The reason behind this is to maintain consistency and the same protocol design at every point in the network. For a payment recipient, there is no difference between a payment made by their channel peer and a payment forwarded by their peer but on behalf of someone else.

Let us take the example of myself, Mark and Kris.

Mark and I have a channel with a balance of 70k sats on each side. I remind you that the commitment transaction that got us this far is delayed and revocable.
I want Mark to accept a 50k sats HTLC to forward to Kris. To do this however I need to send the details of this HTLC such as the payment hash and the amount of this payment. I do this with the update_add_htlc message.

Technically the message exchange goes like this:

The information Mark receives is enough to create a new commitment transaction that leaves Mark’s balance unchanged and adds a new output representing the HTLC offered by me. This new commitment will have 50k sats in the HTLC output, an amount that comes directly from me, so my new balance will be 20k sats.

If there is a channel state change, there must be a new commitment, and after Mark creates it, I sign it with the commitment_signed message.

So now Mark has a signed commitment, he has to acknowledge it and revoke the old commitment; he does this with the revoke_and_ack message, and this allows me to build a revocation key to create a penalty transaction. What’s happening is that Mark will no longer be able to publish his old, newly revoked commitment, otherwise I have the means to punish him (the revocation key). The old commitment has to all intents and purposes been revoked.

What is missing?

Well, I have yet to revoke my old commitment: in my current state there is still no HTLC! I build a new mirror commitment containing the HTLC, which still needs Mark’s signature; he sends it with a commitment_signed and, exactly as Mark did, I answer with a revoke_and_ack. Now I too commit to no longer publishing the old state: I have given Mark the key to punish me if I publish the old commitment.

At any given time, on the Lightning Network, one can also have hundreds of HTLCs per individual channel.

Now Mark and I have a new channel commitment that also contains the additional HTLC output, but the new balance still does not reflect the fact that I sent 50k to Mark. Payment will only be possible in exchange for proof of payment to Kris!

In fact let’s now assume that Mark and Kris do exactly the same thing. Kris is the recipient, the only one who knows the Pre-Image of the hash of the payment, so Kris can satisfy the HTLC with Mark immediately.

In fact that’s what he does, by sending an update_fulfill_htlc message to Mark. As soon as Mark receives this message, he immediately checks to see if this Pre-Image produces the payment hash; if it evaluates to the TRUE condition, this secret can be used to redeem the HTLC.

Mark also sends an update_fulfill_htlc to me, the Pre-Image is propagated from Kris to me. I will perform the same check as Mark by evaluating the hash of the Pre-Image.

This is exactly the proof that Kris has been paid!

Mark and I can remove the HTLC from the commitment transactions and update our channel balances to reflect the new funding situation (obviously doing a commitment and revocation cycle again 🙂 )

But what if there is an error or a deadline?

In this scenario the process would develop in a similar way, except that a failure message, update_fail_htlc, travels back instead of the Pre-Image. I would have to remove the HTLC and go through a cycle of commitment and revocation to move the channel state to the new commitment, with the balances set to how they were before the HTLC.
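The life of the HTLC between me and Mark can be followed by looking only at the balances. This is a sketch: the `ChannelState` class is invented, and it ignores fees and the whole commitment and revocation cycle that accompanies every change.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelState:
    me_sat: int
    mark_sat: int
    htlcs: dict = field(default_factory=dict)  # HTLC id -> amount offered by me

    def add_htlc(self, htlc_id: int, amount_sat: int) -> None:
        """update_add_htlc: the amount leaves my balance and sits in its own output."""
        if amount_sat > self.me_sat:
            raise ValueError("not enough balance on my side")
        self.me_sat -= amount_sat
        self.htlcs[htlc_id] = amount_sat

    def fulfill_htlc(self, htlc_id: int) -> None:
        """update_fulfill_htlc: Mark showed the Pre-Image, so the amount becomes his."""
        self.mark_sat += self.htlcs.pop(htlc_id)

    def fail_htlc(self, htlc_id: int) -> None:
        """update_fail_htlc: the amount comes back to me, as if nothing had happened."""
        self.me_sat += self.htlcs.pop(htlc_id)


# Happy path: 70k each, I offer 50k, Mark fulfils.
paid = ChannelState(me_sat=70_000, mark_sat=70_000)
paid.add_htlc(0, 50_000)
pending = (paid.me_sat, paid.mark_sat, paid.htlcs[0])
paid.fulfill_htlc(0)

# Unhappy path: same HTLC, but it fails.
failed = ChannelState(me_sat=70_000, mark_sat=70_000)
failed.add_htlc(0, 50_000)
failed.fail_htlc(0)

print(pending, (paid.me_sat, paid.mark_sat), (failed.me_sat, failed.mark_sat))
```

While the HTLC is pending the 50k sats belong to neither of us: they wait in their own output until either the Pre-Image or the failure arrives.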

What about payments locally, on the same channel?

Replace Mark with Kris in the example. Profit.

Onion routing #

Reference table: layer routing, source-based onion routing (SPHINX).

The first node, the sender of the payment, is called the source node. The last node, which is the recipient, is called the final node. Each intermediate node between source and destination is called a hop, and each hop must set up an outgoing HTLC for the next hop. The information that I, the sender node, send to each hop can be called a hop payload or hop data, and the whole message is called an onion.

So if I rephrase the example in the previous paragraph using this new terminology, we can say that I built an onion with hop payloads that tell each hop how to build the outgoing HTLC to send the payment to Kris.

There are two possible formats I can use for the information to communicate to each hop, a legacy one called hop data and a more flexible one called hop payload.

Where should I start to construct this message?

Obviously from the hop payload that will be delivered to Kris, the final node. The message Kris will receive is different from the others because one of its fields has all values set to 0. This field is called short_channel_id and is a channel reference which I, as the hop payload builder, fill in to tell each hop which channel to forward the payment on. Kris is my recipient, so I populate it with all 0s, because Kris will not construct an outgoing HTLC.

Of course no one but Kris will know this information because it is encrypted.

I can then begin to serialise the onion message in a format called Type-Length-Value valid for hop payloads. I will then also prepare the hop payload for Mark.

Within the hop payload there are three fields: short_channel_id (already addressed), amt_to_forward, which is the amount to forward in millisatoshi, and outgoing_cltv_value, which is the timelock of the outgoing HTLC, expressed as a future block height on timechain.

If the HTLC Mark receives is worth 50,100 satoshi and his amt_to_forward corresponds to 50,000 satoshi, Mark earns a fee of 100 sats for routing the payment to Kris. Expectations on timelock and routing fees are set by the difference between the two HTLCs, the incoming and the outgoing.
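To give an idea of what Type-Length-Value means for the three fields above, here is a small encoder and decoder. It is a sketch under assumptions: I use record types 2, 4 and 6 for amt_to_forward, outgoing_cltv_value and short_channel_id as BOLT #4 does, I only support types and lengths that fit in one byte, and I leave out the length prefix that precedes the whole payload on the wire. The function names and the block height are made up.

```python
def truncated_uint(value: int) -> bytes:
    """Big-endian integer with leading zero bytes removed (0 becomes empty)."""
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def tlv_record(record_type: int, value: bytes) -> bytes:
    """One record: type, length, value. This sketch supports one-byte type and length only."""
    if record_type >= 0xFD or len(value) >= 0xFD:
        raise ValueError("this sketch only handles types and lengths below 253")
    return bytes([record_type, len(value)]) + value


def encode_hop_payload(amt_to_forward_msat: int, outgoing_cltv_value: int,
                       short_channel_id: bytes) -> bytes:
    if len(short_channel_id) != 8:
        raise ValueError("short_channel_id is 8 bytes long")
    return (tlv_record(2, truncated_uint(amt_to_forward_msat))
            + tlv_record(4, truncated_uint(outgoing_cltv_value))
            + tlv_record(6, short_channel_id))


def decode_tlv_stream(data: bytes) -> dict:
    """Read records one after the other: {type: raw value}."""
    records, pos = {}, 0
    while pos < len(data):
        record_type, length = data[pos], data[pos + 1]
        records[record_type] = data[pos + 2:pos + 2 + length]
        pos += 2 + length
    return records


# Hypothetical payload: forward 50k sats (in msat) with a timelock at block 800000.
payload = encode_hop_payload(50_000_000, 800_000, bytes(8))
records = decode_tlv_stream(payload)
print(payload.hex())
print(int.from_bytes(records[2], "big"), int.from_bytes(records[4], "big"))
```

The advantage over the legacy fixed format is visible even here: a node that does not know a record type can still skip it thanks to the length, so new fields can be added without breaking old software.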

Suppose there is another node between me and Mark, to complicate the routing of the onion message a bit.

I have three payloads: I have constructed them for Kris, for Mark and for Lucas. These three hop payloads will be onion-wrapped!

Well, now I have to generate several keys, which I will need to encrypt the various layers of the onion:

I have to make sure that only the final node can read it.
Each intermediary must only know the previous node and the next node.
No one must be able to learn the total path length.
Each intermediary will be able to verify that the message has not been tampered with.

The construction process is called packet construction. An interesting trick used in onion routing is to make the path the same length for each node, as if each node saw the onion still at the beginning of its path with another 19 possible hops ahead (because the maximum onion routing path is 20 hops).

As each layer is ’eliminated’, junk data is also added to serve as a filler to make the payload always the same length.

All beautiful, but how is this onion created in practice?

First, the onion is built starting with the final node, i.e. the recipient of the message, which is Kris.

I start by creating an ’empty’ field of 1300 bytes generated pseudo-randomly with a certain key.

I start to insert the hop payload I constructed earlier by inserting it (abstractly) from the left, thus having to move the filler further to the right. This operation causes an excess in size because we exceed 1300 bytes, so a small piece of the filler will “fall out” and be deleted.

At this point Kris’s data is in the clear in the onion message, so I must somehow obfuscate it with a key called rho that, apart from me, only Kris can derive. To perform the obfuscation, a pseudo-random data stream is generated from this key and an XOR operation is performed between the hop payload and this stream generated by rho.

The XOR is used because when Kris performs another XOR operation with the same stream he will be able to see the contents of the hop payload. This is because of the characteristics of the XOR operator: applied once to a piece of data it gives a result different from the original, but applied a second time it returns the source message, as if “going back”.

Finally, to guarantee the authenticity and integrity of Kris’s piece, another key called mu is used to compute an HMAC, which is nothing more than a keyed checksum based on the hash of the entire hop payload.
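Both operations are easy to try at home. The sketch below is not the real Lightning construction: I generate the stream with SHA-256 in counter mode as a stand-in (Lightning uses the ChaCha20 cipher), the rho and mu values are invented strings rather than keys derived from the session key, and I compute the HMAC over the obfuscated bytes only.

```python
import hashlib
import hmac


def pseudo_random_stream(key: bytes, length: int) -> bytes:
    """Stand-in stream generator: SHA-256 in counter mode."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(data: bytes, stream: bytes) -> bytes:
    """Byte-by-byte XOR of two equally long byte strings."""
    return bytes(a ^ b for a, b in zip(data, stream))


rho = b"rho key for Kris"  # hypothetical key material
mu = b"mu key for Kris"    # hypothetical key material

hop_payload = b"hop payload for Kris"
stream = pseudo_random_stream(rho, len(hop_payload))

obfuscated = xor(hop_payload, stream)  # first XOR: unreadable
recovered = xor(obfuscated, stream)    # second XOR: back to the original

# Keyed checksum: whoever knows mu can recompute it and detect any tampering.
tag = hmac.new(mu, obfuscated, hashlib.sha256).digest()
tag_is_valid = hmac.compare_digest(tag, hmac.new(mu, obfuscated, hashlib.sha256).digest())

print(recovered, tag_is_valid)
```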

Now it’s Mark’s hop payload’s turn. I perform the same steps, the difference being that now I no longer start from 1300 bytes of filler, but from 1300 bytes containing Kris’s hop payload, suitably obfuscated 🙂 .

Inserting Mark’s hop payload from the left, I have to be very careful not to overwrite part of Kris’s data! I also calculate the HMAC here with the mu key and obfuscate everything with the rho key. This time, however, the keys mu and rho are known only to Mark and me, because this hop is just for him.

Two notes:

We have two HMACs, a newly added external one from Mark, and an internal one from Kris.
When Mark receives the onion message he won’t be able to quantify the number of payloads inside, for him it will always look like the first payload out of the possible 20, precisely because we’re keeping the length fixed with the trick we’ll see shortly.

I also perform the same steps for Lucas. The onion payload is ready and I can send it to Lucas, who is my new channel partner. In the end, it will be composed as follows:

1 byte for the onion version, in order to make a distinction for future format updates of the protocol.
33 bytes of a public session key which will be used by each node to calculate the famous obfuscation keys and to calculate the checksum (and also a third key which I haven’t told you about called pad).
Our onion payload of 1300 bytes.
32 bytes of the outermost HMAC to be verified by my channel partner, Lucas.
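Putting the four pieces listed above side by side gives a packet of 1 + 33 + 1300 + 32 = 1366 bytes. A minimal sketch of the layout, with invented function names and dummy content (the zero bytes below are placeholders, not a valid key or HMAC):

```python
VERSION_LEN, SESSION_KEY_LEN, PAYLOADS_LEN, HMAC_LEN = 1, 33, 1300, 32
PACKET_LEN = VERSION_LEN + SESSION_KEY_LEN + PAYLOADS_LEN + HMAC_LEN


def build_onion_packet(version: int, session_key: bytes, payloads: bytes, outer_hmac: bytes) -> bytes:
    """Concatenate the four fields after checking their fixed sizes."""
    if (len(session_key), len(payloads), len(outer_hmac)) != (SESSION_KEY_LEN, PAYLOADS_LEN, HMAC_LEN):
        raise ValueError("wrong field length")
    return bytes([version]) + session_key + payloads + outer_hmac


def split_onion_packet(packet: bytes) -> tuple:
    """Inverse operation: cut the packet back into its four fields."""
    if len(packet) != PACKET_LEN:
        raise ValueError("an onion packet is always the same length")
    key_end = VERSION_LEN + SESSION_KEY_LEN
    payloads_end = key_end + PAYLOADS_LEN
    return packet[0], packet[VERSION_LEN:key_end], packet[key_end:payloads_end], packet[payloads_end:]


# Dummy content, only to show the sizes.
packet = build_onion_packet(0, bytes(33), bytes(1300), bytes(32))
print(len(packet))
```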

I am ready to send the update_add_htlc message to Lucas.

Lucas will first have to update the channel state with a commitment transaction, but we’ve already seen this part in detail, so I’ll skip it. The “new” operations that Lucas will need to do will be to take the session key, generate the mu key, and verify the HMAC checksum. Once the integrity and authenticity have been verified, Lucas will need to retrieve his hop payload and then forward the onion to the next hop.

Problem: if Lucas removes his hop payload we’d lose the fixed 1300 byte length (very bad)

Here comes the trick I mentioned: each node must generate a filler before sending the onion to the next node.

What Lucas does is append 1300 zero bytes to the 1300-byte onion payload, so he now has 2600 bytes: the first half is the real payload, the second half is zero-filled. He generates the rho key from the session key and performs an XOR between these 2600 bytes and a 2600-byte stream generated from the rho key.

What happens is that the first 1300 bytes, which contained the real hop payloads, are de-obfuscated, because the XOR has been applied for the second time, so Lucas can read his hop payload. The XOR between the 1300 zero-filled bytes and the stream generated from the rho key instead produces pseudo-random data.

In this way Lucas reads what is meant for him and removes it from the onion payload; all the other bytes are shifted to the left to cover the space left by Lucas’s payload, and the filler produced by the XOR operation is used to maintain the 1300-byte length. In order to send the onion packet to Mark, however, an external HMAC must be added so that Mark can verify integrity and correctness.
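The whole wrap-and-peel dance can be reproduced with a toy onion. Take it as a sketch with many simplifications: every hop payload has the same fixed length (65 bytes, my choice), the keys are invented strings instead of being derived from the session key, the stream comes from SHA-256 in counter mode instead of ChaCha20, and there are no HMACs at all (in the real protocol the sender has to pre-compute the filler precisely so that the HMACs still verify).

```python
import hashlib

ONION_LEN = 1300
HOP_LEN = 65  # toy assumption: every hop payload has the same, known length


def stream(key: bytes, length: int) -> bytes:
    """Stand-in pseudo-random stream: SHA-256 in counter mode."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap(onion: bytes, hop_payload: bytes, rho: bytes) -> bytes:
    """Sender side: insert the payload from the left, drop the overflow, obfuscate."""
    shifted = (hop_payload + onion)[:ONION_LEN]
    return xor(shifted, stream(rho, ONION_LEN))


def peel(onion: bytes, rho: bytes) -> tuple:
    """Hop side: append 1300 zero bytes, XOR 2600 bytes, read own payload, shift left."""
    clear = xor(onion + bytes(ONION_LEN), stream(rho, 2 * ONION_LEN))
    return clear[:HOP_LEN], clear[HOP_LEN:HOP_LEN + ONION_LEN]


def payload_for(name: str) -> bytes:
    return f"instructions for {name}".encode().ljust(HOP_LEN, b"\x00")


route = ["Lucas", "Mark", "Kris"]
rho = {name: f"rho-{name}".encode() for name in route}  # hypothetical keys

# I build the onion starting from the final node, Kris.
onion = stream(b"pad", ONION_LEN)
for name in reversed(route):
    onion = wrap(onion, payload_for(name), rho[name])

# Each hop peels one layer and forwards an onion of exactly the same length.
for name in route:
    hop_payload, onion = peel(onion, rho[name])
    print(name, hop_payload.rstrip(b"\x00").decode(), len(onion))
```

Every hop receives 1300 bytes and forwards 1300 bytes, so from the size alone nobody can tell whether they are the first hop or the last but one.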

Lucas’s hop payload has all the information needed to construct an HTLC with Mark!

Mark performs exactly the same steps as Lucas, and finally Kris receives the final payload. From the payment hash in the update_add_htlc he knows it’s a payment for him, so he is the final hop. He doesn’t do anything different (or almost): he removes a layer from the onion, but doesn’t forward anything, because it’s time to reply to Mark with an update_fulfill_htlc to redeem the HTLC. The same happens in the opposite direction (Kris->Mark->Lucas->musclesatz) and all the HTLCs are redeemed, updating the channel balances accordingly.

In keysend payments, it is no longer the receiver who reveals the Pre-Image to the sender, but the sender who embeds the Pre-Image in the hop payload: when the receiver receives the onion, he uses that Pre-Image (which corresponds to the HTLC payment hash) to settle the payment! So in this case there is no need to exchange an invoice or scan a QR 😉

Gossip #

Reference table: layer routing & layer, routing fees channel metadata & gossip replaying.

So far we have understood how to build the onion message and how to deliver it between nodes, but the question now is:

How do I build a valid route to the recipient? How do I know the information about his channel?

I need a channel graph, which is nothing more than the (interconnected) set of publicly announced channels and what nodes they connect.

The protocol used is called gossip. It allows each node to share information with the Lightning network, and it is used when channels are created to spread ‘news’ like: I have created a new channel. I am informing all my other channel peers of this new connection. To my other peers: please spread this message, have it spread throughout the network.

The Lightning network is (almost) equivalent to the Lightning channel graph.

We are talking about a peer-to-peer network so it is essential to have an initial bootstrap phase so that nodes can see each other. The first step is to discover at least one peer that is part of this network and has a complete channel graph.

Using one of the countless bootstrap protocols I can actually connect to a peer, so now I need to download and validate the channel graph. That done, I can start opening payment channels too.

The important thing is to keep the graph up-to-date: discovering and validating new channels, eliminating channels that have been closed on-chain and also eliminating channels from which no ‘heartbeat’ update arrives for about 2 weeks.

How does the initial bootstrap take place?

A simple option, but impractical due to its fragility, would be to exploit a number of ‘hardcoded’ peers within the Lightning node software. This method is used as a fallback in case we cannot find peers with the other method used, called bootstrap DNS.

To do peer discovery using DNS (BOLT #10) we use three types of record:

SRV records to locate a set of the node’s public keys.
A record to map the pubkey to the IPv4 address.
AAAA record to map the pubkey to the IPv6 address.

When you type in “musclesatz.com” you must think of the DNS records “A” and “AAAA” as maps that help you find my site online.

The “A” record is like a map for finding houses in an old neighbourhood by having the street (123 Main Street). In practice it is an IPv4 address. The “AAAA” record is like a map for finding houses in a brand new, state-of-the-art neighbourhood, so far ahead that it no longer has a standard address like IPv4 but a longer series of numbers and letters: an IPv6 address.

We need both a public key and an IP address to connect to a node!

So the steps are:

I identify a DNS server that supports the BOLT #10 protocol. To do this I can use a common seed server maintained by the major Lightning network implementations.
I issue an SRV query to obtain a set of peers that I could use to bootstrap. Each of these peers is identified by a public key encoded in a format called bech32.
I decode the public keys obtained from the response to my SRV query to obtain the node identifiers.
I run a DNS query to get the IP address of the target node, in combination with its identifier (the public key).
I connect to the node using a Lightning client.

I find my peer and establish the first connection.
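The decoding step can be sketched in plain Python. What follows is a minimal bech32 codec (the checksum routine comes from BIP 173) plus a hypothetical helper, `node_id_from_label`, that turns the first label of an SRV answer into the 33-byte node public key. I am assuming the human-readable prefix is `ln`, as in the BOLT #10 examples; the DNS queries themselves are left out.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]


def polymod(values):
    """BIP 173 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def convertbits(data, frombits, tobits, pad):
    """Regroup a sequence of frombits-wide integers into tobits-wide integers."""
    acc, bits, out = 0, 0, []
    maxv = (1 << tobits) - 1
    for value in data:
        acc = (acc << frombits) | value
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            out.append((acc >> bits) & maxv)
    if pad and bits:
        out.append((acc << (tobits - bits)) & maxv)
    return out


def bech32_encode(hrp, data):
    mod = polymod(hrp_expand(hrp) + list(data) + [0] * 6) ^ 1
    checksum = [(mod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in list(data) + checksum)


def bech32_decode(text):
    text = text.lower()
    pos = text.rfind("1")
    if pos < 1 or pos + 7 > len(text):
        raise ValueError("malformed bech32 string")
    hrp, payload = text[:pos], text[pos + 1:]
    if any(c not in CHARSET for c in payload):
        raise ValueError("invalid bech32 character")
    data = [CHARSET.index(c) for c in payload]
    if polymod(hrp_expand(hrp) + data) != 1:
        raise ValueError("bad bech32 checksum")
    return hrp, data[:-6]


def node_id_from_label(label):
    """Hypothetical helper: bech32 label from an SRV answer -> 33-byte pubkey."""
    hrp, data = bech32_decode(label)
    if hrp != "ln":  # assumption: prefix used by BOLT #10 seeds
        raise ValueError("unexpected prefix")
    node_id = bytes(convertbits(data, 5, 8, pad=False))
    if len(node_id) != 33:
        raise ValueError("a compressed public key is 33 bytes long")
    return node_id


# Round trip with a made-up (not real) public key.
fake_pubkey = bytes([2]) + bytes(range(32))
label = bech32_encode("ln", convertbits(fake_pubkey, 8, 5, pad=True))
print(label, node_id_from_label(label) == fake_pubkey)
```

The decoded bytes are the node identifier; together with the IP address from the A or AAAA query, they are what I need to open the connection.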

Now I have to synchronise and validate the directed channel graph. It is called directed because the arrows have an associated direction, which means that a relationship from node 1 to node 2 may not be symmetrical, i.e. it does not necessarily imply an inverse relationship from 2 to 1.

Channels are UTXOs (2-of-2 multisig addresses), so we can give a new definition of the channel graph: it is a subset of Bitcoin’s UTXOs. You cannot forge a graph, as we will see.

Information from the graph is propagated in the network via three messages, described in BOLT #7:

node_announcement: announces the node and contains information such as its identifier, the features it supports, its alias and its network addresses. These messages are not written to the timechain, and are only valid if there is a corresponding channel_announcement message. Validating them is very easy: if an announcement for that node already exists, the new node_announcement must have a later timestamp; if there is no previous announcement, then a channel_announcement referencing the node must already exist in the local channel graph (we’ll look at this in more detail later). It must also have:
- a valid signature.
- all included addresses ordered in ascending order according to the address identifier.
- alias bytes must have a valid UTF-8 encoding.
channel_announcement: we can now worry about announcing the presence of a new channel. If I want my node to route other people’s payments (onion messages), I must announce the channel publicly. In theory an unannounced channel could not receive payments; in practice it can, thanks to routing hints, which give the sender the information needed to route the onion message correctly. The actual channel graph is of unknown size precisely because of the presence of unannounced channels!

One detail: to avoid spam and ensure robust authentication, we use Bitcoin’s timechain. For a node to accept a channel_announcement such an announcement must prove the existence of the channel opening on the timechain.

How can a channel be referenced on the timechain? Using the full channel outpoint would be foolish, because the verifier would need a full copy of the UTXO set for verification, and usually only full nodes have all this information.

Much better to use the short channel ID. To create this reference we rely on the position of this output in the timechain. We only need to refer to a certain block, point to a transaction within it and then to a specific output created by the transaction. This is the short channel ID (or scid).
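As a small illustration, the three coordinates can be packed into a single 8-byte integer: 3 bytes for the block height, 3 bytes for the transaction index and 2 bytes for the output index, as laid out in BOLT #7. The function names below are mine, and the example coordinates are made up.

```python
def encode_scid(block_height: int, tx_index: int, output_index: int) -> int:
    """Pack block height (3 bytes), tx index (3 bytes), output index (2 bytes)."""
    if not (0 <= block_height < 2**24 and 0 <= tx_index < 2**24
            and 0 <= output_index < 2**16):
        raise ValueError("component out of range for a short channel ID")
    return (block_height << 40) | (tx_index << 16) | output_index


def decode_scid(scid: int) -> tuple:
    """Unpack a short channel ID back into (block, tx, output)."""
    return (scid >> 40) & 0xFFFFFF, (scid >> 16) & 0xFFFFFF, scid & 0xFFFF


def format_scid(scid: int) -> str:
    """Human-readable 'block x tx x output' form shown by many tools."""
    return "{}x{}x{}".format(*decode_scid(scid))


scid = encode_scid(block_height=700000, tx_index=1234, output_index=1)
print(scid.to_bytes(8, "big").hex(), format_scid(scid))
```

Eight bytes are enough for anyone with the timechain at hand to find the funding output and check that it really is a 2-of-2 between the two announcing nodes.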

channel_update: the third and final message in the gossip protocol. It is used to update information about a payment channel within the Lightning network, such as its routing policy (fees and timelock requirements), or to inform other nodes of changes to payment channels. This type of message allows nodes to route payments through the most efficient and up-to-date channels.
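Going back to node_announcement for a moment, its validation rules fit in a few lines. This is only a sketch under my own assumptions: signature verification and the ordering of addresses are taken to be checked elsewhere, and the function name and data structures are hypothetical.

```python
def accept_node_announcement(last_seen, nodes_with_channels, node_id, timestamp, alias):
    """Apply the timestamp, channel and alias rules to one node_announcement.

    last_seen           -- dict: node id -> timestamp of the last accepted announcement
    nodes_with_channels -- set of node ids referenced by a known channel_announcement
    alias               -- raw alias bytes (zero-padded on the wire)
    """
    try:
        alias.rstrip(b"\x00").decode("utf-8")  # alias must be valid UTF-8
    except UnicodeDecodeError:
        return False
    if node_id in last_seen:
        # An announcement already exists: the new one must be more recent.
        if timestamp <= last_seen[node_id]:
            return False
    elif node_id not in nodes_with_channels:
        # First announcement: a channel for this node must already be known.
        return False
    last_seen[node_id] = timestamp
    return True


last_seen = {}
known = {"02aa"}
print(accept_node_announcement(last_seen, known, "02aa", 100, b"musclesatz"))  # accepted
print(accept_node_announcement(last_seen, known, "02aa", 100, b"musclesatz"))  # stale timestamp
print(accept_node_announcement(last_seen, known, "03bb", 100, b"stranger"))    # no known channel
```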

We have realised that gossip is a recurring activity and that when a node bootstraps on the Lightning network it starts to receive gossip with the messages we have just analysed; with this information it will start to build its channel graph. The more messages it receives, the more accurate and efficient its map becomes in finding the best route for sending payments.

There is also other information that can be stored: various Lightning implementations can link other metadata such as scores that evaluate the quality of a node as a peer to route payments.

Pathfinding #

Reference table: payment level, payment attempts, pathfind, path selection.

What is it and what problem do we have to solve with pathfinding?

Payment on Lightning essentially depends on finding a path connecting a sender with a receiver. This process is called pathfinding but… it is not part of the BOLT standard! It is left out because it does not require co-ordination or interoperability: while routing is specified in the BOLTs, the search for and selection of the route are left to the sender of the payment, according to the algorithm of the implementation it is using.

The problem is to find the best route between sender and receiver, best in terms of delivery success, low fees and short timelocks.

To understand the satoshi transport problem between me and Kris even better, 3 attributes need to be defined:

Capacity: is the total (maximum) amount of sats funded in the 2-of-2 multisig address.

capacity(of my channel with Kris) = balance(musclesatz) + balance(Kris)

Balance: is the amount held by each peer in the channel.
Liquidity: is the available balance that can actually be sent. The liquidity of my side of the channel is equal to my balance minus the channel reserve and any HTLCs I have outstanding.

liquidity(musclesatz) = balance(musclesatz) - channel_reserve(musclesatz) - HTLCs(musclesatz)

The only value public to the entire network is the channel capacity. Balances are not announced, to maintain high scalability and privacy. If they were public, we could solve pathfinding with any standard minimum-cost path-finding algorithm.

We have to get clever, but in the meantime we can at least be sure that the liquidity of a channel lies between the channel_reserve (the lower bound) and the channel capacity minus the channel_reserve (the upper bound):

reserve <= liquidity <= (capacity - reserve)

This is our uncertainty interval: we do not know the balances, but we have a rough estimate of the liquidity, which must lie within the interval. The network does not know where the liquidity sits inside this range, because only Kris and I can know it, but we could exploit the failed HTLCs of our payment attempts to update our liquidity estimate and lower the uncertainty.

If an HTLC for N satoshi fails because the channel cannot deliver it, we learn that its liquidity is lower than N, so we can already lower the upper end of our range to N - 1.
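Here is a minimal sketch of this bookkeeping, using the formulas above. The class and function names are my own; the only rule encoded is that a failed attempt for a given amount means the channel could deliver at most one satoshi less than that amount.

```python
from dataclasses import dataclass


def liquidity(balance, channel_reserve, pending_htlcs):
    """liquidity = balance - channel_reserve - outstanding HTLCs."""
    return balance - channel_reserve - sum(pending_htlcs)


@dataclass
class LiquidityEstimate:
    lower: int
    upper: int

    @classmethod
    def from_channel(cls, capacity, reserve):
        # Initial uncertainty interval: reserve <= liquidity <= capacity - reserve
        return cls(lower=reserve, upper=capacity - reserve)

    def record_failure(self, amount):
        """An HTLC of `amount` failed: the channel can deliver at most amount - 1."""
        if amount <= self.upper:
            self.upper = amount - 1
        # Keep the interval consistent if the old lower bound turned out to be stale.
        self.lower = min(self.lower, self.upper)

    @property
    def uncertainty(self):
        return self.upper - self.lower


estimate = LiquidityEstimate.from_channel(capacity=1_000_000, reserve=10_000)
print(estimate, estimate.uncertainty)
estimate.record_failure(500_000)   # a 500k sat HTLC bounced
print(estimate, estimate.uncertainty)
```

Each failure halves or otherwise shrinks the interval, which is exactly the ‘learning from mistakes’ the sender relies on.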

Simplifying, we use a shortest-path algorithm such as Dijkstra’s or its variant A* (A-star): we take the nodes and assign a weight to the connections between them. This weight is the routing fee of a payment.

However, since the liquidity of the channel is unknown to the sender and is only a theoretical range, the problem is complex. We can only suspect which connections have sufficient liquidity to allow the routing of the payment. I can sort the candidate routes by weight and try each route in order until the payment is successful. For payments that fail I exploit the HTLC error to update my graph, tightening the limits and reducing the uncertainty because I have explored a path.

So, to summarise:

I construct the channel graph.
I do pathfinding based on some heuristics.
I perform a payment attempt.
For unsuccessful payments I adjust my interval to reduce uncertainty at the next submission.
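The four steps above can be sketched as a toy simulation. Everything here is a simplification of mine: each channel has a single flat fee as its weight, the ‘real’ liquidity is a hidden dictionary that only the simulation can read, and the search is a plain Dijkstra. Real implementations use richer heuristics.

```python
import heapq


def cheapest_path(graph, source, target, amount, upper_bound):
    """Dijkstra on fees, skipping channels we believe cannot carry `amount`."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, fee in graph.get(node, {}).items():
            if nxt not in visited and upper_bound[(node, nxt)] >= amount:
                heapq.heappush(queue, (cost + fee, nxt, path + [nxt]))
    return None


def pay(graph, upper_bound, hidden_liquidity, source, target, amount, max_attempts=10):
    """Try routes in order of cost, learning from each failed HTLC."""
    attempts = []
    for _ in range(max_attempts):
        path = cheapest_path(graph, source, target, amount, upper_bound)
        if path is None:
            return None, attempts
        attempts.append(path)
        # First channel on the path that cannot actually forward the amount.
        failed = next((hop for hop in zip(path, path[1:])
                       if hidden_liquidity[hop] < amount), None)
        if failed is None:
            return path, attempts
        upper_bound[failed] = amount - 1  # tighten the uncertainty interval
    return None, attempts


graph = {"me": {"a": 1, "b": 2}, "a": {"kris": 1}, "b": {"kris": 2}}
upper_bound = {hop: 1000 for hop in [("me", "a"), ("a", "kris"), ("me", "b"), ("b", "kris")]}
hidden = {("me", "a"): 900, ("a", "kris"): 100, ("me", "b"): 900, ("b", "kris"): 800}
print(pay(graph, upper_bound, hidden, "me", "kris", 500))
```

In this run the cheapest route through `a` fails, the estimate for that channel drops to 499, and the second attempt through `b` succeeds.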

Paradoxically I learn from mistakes because the more unsuccessful payments I have, the better I can reduce the uncertainty of my interval and have a more accurate map of possible paths. The knowledge from errors will be very useful to me for future payments.

Beware though: it is not ’eternal’ knowledge because it becomes stale as other nodes send or route payments. I will never see any of these payments unless I am the sender.

Onion routing, as we have seen, only allows me to see one hop. We must also consider how long to store this knowledge about the uncertainty of the various routes before it becomes useless to keep it in memory.

Multipart Payments #

Multipart Payments are a feature introduced in 2020. The strategy is still to search for the best path, but the payment is broken into many small parts that are sent as HTLCs on different paths, while preserving the atomicity of the payment: they must all reach the recipient, otherwise the whole transaction fails.

They are an improvement for a very simple reason: small payments adapt more easily to the liquidity of the channels, because it is far more likely that there is enough liquidity for that ‘small piece’. It is as if our uncertainty interval were a haystack and our small payment were a needle.

There is of course no certainty that all the parts of a multipart payment go through.
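As a rough sketch of the idea, here is a greedy split of one payment across several candidate paths. The `path_limits` values stand for my own estimate of what each path can carry, and the greedy strategy is just an illustration, not how any specific implementation chooses its parts.

```python
def split_payment(amount, path_limits):
    """Split `amount` over paths, largest estimated limit first.

    path_limits -- hypothetical dict: path name -> amount we believe it can carry
    Returns a dict path -> part, or None when the parts cannot cover the
    whole amount (atomicity: either everything is sent or nothing is).
    """
    parts = {}
    remaining = amount
    for path, limit in sorted(path_limits.items(), key=lambda kv: kv[1], reverse=True):
        if remaining == 0:
            break
        part = min(limit, remaining)
        if part > 0:
            parts[path] = part
            remaining -= part
    return parts if remaining == 0 else None


limits = {"via_a": 60_000, "via_b": 30_000, "via_c": 50_000}
print(split_payment(100_000, limits))   # fits in two parts
print(split_payment(200_000, limits))   # not enough estimated liquidity
```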

Security and attacks on LN #

Is Lightning private?

It depends on what we mean by privacy.

A node has visibility of its successor and predecessor. But if it is highly developed and hyper-connected, it can generate heuristics and associations as in the case of very large non-custodial wallets. The nodes of these providers are responsible for routing a very high percentage of payments to and from that node.

What would happen then if two or more nodes are under the control of a malicious adversary? If two colluding nodes are on the same payment path, they immediately realise they are routing HTLCs for the same payment because the hash of the preimage (the payment hash) is the same. What are the risks?
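A tiny sketch shows why the link is immediate: every HTLC along the route is locked to the same hash, so two nodes only need to compare notes. The node names and the helper functions are invented for the example.

```python
import hashlib


def htlcs_along_route(route, preimage):
    """Each forwarding hop receives an HTLC locked to the same payment hash."""
    payment_hash = hashlib.sha256(preimage).hexdigest()
    return {hop: payment_hash for hop in route[1:]}  # the sender is not a hop


def same_payment(observed, node_x, node_y):
    """Two colluding nodes compare the hashes they have seen."""
    return observed[node_x] == observed[node_y]


preimage = bytes(32)  # all-zero secret, for the example only
route = ["musclesatz", "node_a", "node_b", "node_c", "kris"]
observed = htlcs_along_route(route, preimage)
print(same_payment(observed, "node_a", "node_c"))  # the colluders link the payment
```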

Information security is most often summarised in three aspects:

Confidentiality: does the secret information reach only the intended recipient?
Integrity: has the information been altered during transport?
Availability: is the system working or is there some kind of denial of service attack going on?

The reality, not only for Lightning but for any system, is that there is no certainty that an attacker will not succeed. Zero risk does not exist. Sure, you can mitigate the risk, but never reduce it to zero.

Is there a difference between the privacy of Bitcoin and Lightning?

At first glance Lightning offers better privacy. In Bitcoin, transactions do not associate a digital identity with a specific address, but it is also true that transactions are transmitted in plain text and can be analysed. Many companies have sprung up in recent years that do this for a living.

On Lightning, payments are not transmitted to the entire network and, above all, are not saved for eternity as on the timechain, but there are other properties of the protocol that could cause some headaches.

One of them? Large payments may have fewer routing ‘options’.

More? Lightning nodes maintain a permanent identity, while Bitcoin nodes do not: I can only receive and send payments from the node I used to open the payment channel on the timechain. I cannot use any others. I also have to announce my IP address and ID, and this creates a permanent link between node IDs and IP addresses. IPs are intermediate steps in attacks on anonymity and are often linked to a user’s physical location in the real world.

Finally, Lightning users can be denied service by blocking channels or exhausting them. Forwarding payments requires the balance (a scarce resource) to be locked in HTLCs along the way. An attacker could launch many payments without finalising them, tying up the capital for a long time (but this also depends on the timelock delay set).

In short, we can conclude that on the privacy side there is no winner between Bitcoin and Lightning: each has its own weaknesses and strengths. What we can say is that many research teams are currently working on improving the security and privacy aspects of both.

As the last part of this article, I will give a roundup of possible attacks that Lightning may suffer or has suffered. We are almost at the end.

Connection of sender and recipient #

In Lightning, the anonymity of participants is essential to guarantee the freedom of payment, one of the pillars on which Bitcoin was born. But a malicious user could attempt to violate this privacy by discovering the sender and recipient of a payment. The reasons for this could be many and varied. Not only would the security of payments be compromised, but censorship of payments to or from certain recipients or senders would also be made easier.

We can assume that there are two types of adversaries that can try to violate anonymity in Lightning: ‘off-path’ adversaries and ‘on-path’ adversaries. The ‘off-path’ ones attempt to deduce the sender and recipient of a payment without directly participating in the payment routing process while the others can exploit the information they obtain during the payment process to discover the sender and recipient.

The former, the off-path ones, use a technique called ‘probing’ to deduce individual balances in the various payment channels. With this information, the attacker can compare snapshots of the network at times t1 and t2 to identify payments that have occurred and discover the sender, recipient and payment amount. For example, the attacker takes a snapshot of the network at time t1 = 21:00 and another at t2 = 21:01. If there was only one payment in between, it is trivial to find out who the parties involved are and the amount, also because the channel state has to be updated every time the balance changes! If there have been multiple payments it becomes more complex, which is why attackers have developed heuristics to separate the various payments.
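To see how little is needed in the single-payment case, here is a sketch that compares two probed snapshots. The snapshot format (a dictionary from a directed channel to the balance on the first node’s side) and the node names are assumptions of mine, and the logic only handles exactly one payment between the two snapshots.

```python
def infer_payment(before, after):
    """Guess (sender, recipient, amount) from two balance snapshots.

    before/after -- dict: (node_a, node_b) -> balance of node_a in that channel
    Works only when a single payment happened between the snapshots.
    """
    # Channels whose sending side lost balance are the hops of the payment.
    hops = {ch: before[ch] - after[ch]
            for ch in before if after.get(ch, before[ch]) < before[ch]}
    senders = {a for a, _ in hops}
    receivers = {b for _, b in hops}
    source, destination = senders - receivers, receivers - senders
    if len(source) != 1 or len(destination) != 1:
        return None  # nothing happened, or too ambiguous for this naive heuristic
    # Fees accumulate upstream, so the smallest drop is the amount delivered.
    return source.pop(), destination.pop(), min(hops.values())


t1 = {("alice", "bob"): 1000, ("bob", "carol"): 800, ("dave", "erin"): 300}
t2 = {("alice", "bob"): 499, ("bob", "carol"): 300, ("dave", "erin"): 300}
print(infer_payment(t1, t2))
```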

Those on the route, on the other hand, actively participate in the routing of payments and can observe payment information. Their skill lies in excluding certain nodes from the sender’s or receiver’s anonymity set using information obtained during the payment process. For example, an intermediary node can observe the amount of any payment being routed and the various timelock deltas, and exclude from the anonymity set any node whose channels have less capacity than the amount of the payment being routed.

Denial of Service #

Exposing online resources means increasing the risk of making them unavailable due to a possible denial of service. How would an attacker deny me service? Trivial: with a bombardment of useless requests, indistinguishable from legitimate ones. There are no losses in terms of funds; the requests only serve the purpose of sending the target of the DoS offline. A typical defence is to charge fees for the use of public resources: in Lightning, the public resources are the public channels and the fees are routing fees.

When a node forwards a payment for me, it uses its data and bandwidth to update the state of the channel and, among other things, the amount is locked (HTLC) until it is settled or until it expires (timelock). Payments that fail do not cost any fees, which is great for us, the poor legitimate users of the network. It is less great for the network, because it allows attackers to bombard it with routing requests at no cost.

A conceptual variant of DoS is the topology-based attack. Lightning is not perfectly decentralised so to ’take out’ nodes from payment routing, it would make no sense for an attacker to target those on the edge of the channel graph. It makes much more sense to attack medium/large nodes connected to lots of other nodes.

Channel disruption #

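Each channel can hold a maximum of 483 pending HTLCs in each direction. In a channel disruption attack, the attacker routes exactly 483 payments through a target channel and leaves them pending. What could possibly happen? Every slot is occupied, so the channel cannot forward any other payment until those HTLCs are resolved or expire.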

A similar version concerns liquidity blocking in which the attacker uses large HTLCs consuming all the available bandwidth of the victim channel. This is certainly much more expensive than sending 483 HTLCs.
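Both variants can be shown with a toy model of one direction of a channel. The class is a simplification of mine: it only tracks the number of pending HTLCs and the liquidity they lock, and ignores fees, reserves and timelocks.

```python
MAX_PENDING_HTLCS = 483  # slot limit discussed above


class ChannelDirection:
    """One direction of a channel: limited slots and limited liquidity."""

    def __init__(self, liquidity, max_htlcs=MAX_PENDING_HTLCS):
        self.liquidity = liquidity
        self.max_htlcs = max_htlcs
        self.pending = []

    def add_htlc(self, amount):
        if len(self.pending) >= self.max_htlcs:
            return False  # no free slot
        if amount > self.liquidity:
            return False  # not enough liquidity
        self.pending.append(amount)
        self.liquidity -= amount
        return True


# Slot jamming: 483 dust-sized HTLCs, left pending, block an honest payment.
victim = ChannelDirection(liquidity=1_000_000)
for _ in range(MAX_PENDING_HTLCS):
    victim.add_htlc(1)
print(victim.add_htlc(50_000))  # False: all slots taken, only 483 sat locked

# Liquidity jamming: a single large HTLC locks the whole balance.
victim = ChannelDirection(liquidity=1_000_000)
victim.add_htlc(1_000_000)
print(victim.add_htlc(1))       # False: nothing left to forward
```

The first attack locks almost no capital, the second one needs as much capital as the victim channel holds, which is why it is the more expensive of the two.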

Cross-layer deanonymisation #

The creation and closing of Lightning channels occurs on the timechain and this makes it easier for an attacker to deanonymise a user using both off-chain and on-chain data. What are the goals for these attackers?

Create a cluster of Bitcoin addresses that are owned by the same user on layer 1.
- Countermeasure 1: do not reuse funding transaction outputs to open new channels; this defeats many heuristics.
- Countermeasure 2: never rely on a single external source for funding, better to vary.
Create a cluster of Lightning nodes owned by the same user on layer 2.
- Here I want to ask you a question: using aliases is great, they simplify life and improve usability. Undoubtedly. But if I use my own domain name as an alias, do you think this is a good or a bad idea? If you think it’s a bad idea, what is the countermeasure to take?
Create a unique link between Lightning nodes and addresses on the timechain.
- Doing the opposite of the two countermeasures above easily allows the cross-layer link between Lightning node and entity managing addresses on layer 1.

Can it be protected? #

It is not easy but it can be done.

Use unannounced channels if you do not want to do routing and do not want to expose yourself.
If you want to do routing:
- Set the minimum HTLC size you are willing to accept to a higher value than the default. This way, each slot cannot be occupied by a payment that is too small (avoid spam).
- Limit the maximum number of slots a peer can consume.
- Monitor error rates and, if a peer is above a certain rate, rate-limit it.
- Use shadow channels: parallel payment channels alongside existing ones. They can be used for routing but are not announced.
Try to find the right balance between accepting and rejecting channels. When a peer opens a channel on your node you cannot know if it will be used to attack your node or not.
Avoid using the same alias for your node and your domain (this is the answer to the question above 😜).
Use Tor.
Keep reading and informing yourself, don’t rely solely on this article.

Bye! #

I hope you enjoyed this excursus. If you find mistakes, inaccuracies or simply want to improve this article, you can contact me on X/Twitter or on Telegram.
If you think you have learnt something new, please share this article or consider a donation. Besides rewarding me personally, it helps me keep this site up and gives me the drive to produce new content!