A while ago I came across Huff, a domain-specific language for writing highly optimized smart contracts directly with EVM opcodes.
I’ve always taken an interest in low-level stuff. It can be a bit hard to follow (or very hard at some points), but you get a sort of superpower once you get a grip on it.
So, in this tutorial, I will show you how to use Huff to write a simple storage contract.
What we are going to build
It is a very simple contract: we need one function to take a value and write it to storage, and another function to retrieve that value.
Now let’s go over the components of a Huff contract.
First, we have the contract interface. This is very similar to Solidity interfaces: you just declare the functions without providing any implementation.
In our case, we want two functions, one to write and one to read.
// interface
#define function setValue(uint256) nonpayable returns ()
#define function getValue() nonpayable returns (uint256)
We use #define to define almost everything in Huff. We mark both functions nonpayable because neither of them accepts native funds. (Since getValue only reads state, it could also be declared view.)
Huff is a stack-based language, which means we work by pushing values onto the stack and popping them off.
We want to write to a specific storage slot. Huff has a built-in, FREE_STORAGE_POINTER(), that hands out storage slots in a linear fashion.
This means that the first time we use it, we get storage slot 0x00..000. If we use it again, we get storage slot 0x00..001, and so on.
In our case, we just want one slot to store to and read from. So, we are going to assign that slot to a constant.
#define constant slot_0x00 = FREE_STORAGE_POINTER()
Now our constant slot_0x00 gives us access to the first storage slot. We reference it by putting it inside square brackets: [slot_0x00].
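To make the "linear fashion" idea concrete, here is a tiny Python sketch of a counter that hands out slots in order. This is only a toy model: the class and method names are made up for illustration, and the real assignment happens inside the Huff compiler at compile time.

```python
import itertools


class StoragePointerAllocator:
    """Toy model (hypothetical name): each request returns the next unused slot."""

    def __init__(self):
        self._next_slot = itertools.count(0)

    def free_storage_pointer(self) -> int:
        # Every call hands out the next slot number: 0, 1, 2, ...
        return next(self._next_slot)


allocator = StoragePointerAllocator()
slot_0x00 = allocator.free_storage_pointer()
slot_0x01 = allocator.free_storage_pointer()
print(hex(slot_0x00), hex(slot_0x01))
```

Since we only ask for one pointer in this contract, our constant always refers to slot 0.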
Now let’s start with the implementation of our setValue() function:
#define macro SET_VALUE() = takes(0) returns(0) {
0x04 // 1
calldataload // 2
[slot_0x00] // 3
sstore // 4
stop // 5: end execution so we don't fall through into whatever code follows
}
You can think of a macro as something like an internal function in Solidity: those often share a name with the external function, prefixed with an underscore. (Unlike an internal function, a macro is inlined wherever it is invoked.)
In Huff, the convention is to name macros in all caps and use underscores to separate words.
takes(0) means our implementation does not expect anything on the stack when it starts.
returns(0) means it does not leave anything on the stack when it finishes.
1. We push 0x04 to the stack. We need it as the argument for the opcode calldataload. It lets us read the calldata sent with the call starting after the first 4 bytes. We don't need the first 4 bytes here, since they contain the function selector.
Stack layout: [0x04]
2. calldataload pops the top element of the stack and uses it as the offset to start reading from. It reads 32 bytes starting at that offset and pushes them to the stack. This means we now have value, i.e. what we want to store, on the stack.
Stack layout: [value]
3. We push [slot_0x00] to the stack, which is our reference to storage.
Stack layout: [storage_ptr, value]
4. We use the opcode sstore, which pops the top two elements from the stack. It uses the first element (storage_ptr) as the storage slot and the second element (value) as the value to write. That is exactly what we need.
Stack layout: []
Storage layout: [0x00 ====> value]
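If you want to see those four steps move data around, here is a rough Python model of them. It is a sketch, not an EVM: the selector bytes are a made-up placeholder (the real selector comes from hashing the function signature), and the Python list keeps the top of the stack at its end, whereas the stack layouts above show the top on the left.

```python
WORD = 32  # the EVM works with 32-byte words


def calldataload(calldata: bytes, offset: int) -> int:
    """Read 32 bytes starting at offset, zero-padding past the end of calldata."""
    chunk = calldata[offset:offset + WORD].ljust(WORD, b"\x00")
    return int.from_bytes(chunk, "big")


# Hypothetical 4-byte selector, used only as a placeholder.
PLACEHOLDER_SELECTOR = bytes.fromhex("aabbccdd")
calldata = PLACEHOLDER_SELECTOR + (42).to_bytes(WORD, "big")

stack = []    # top of the stack is the END of this list
storage = {}  # slot number -> stored value

stack.append(0x04)                                  # 1: push the offset
stack.append(calldataload(calldata, stack.pop()))   # 2: calldataload -> [value]
stack.append(0x00)                                  # 3: push the slot -> [slot, value]
slot = stack.pop()                                  # 4: sstore pops the slot first...
value = stack.pop()                                 #    ...then the value
storage[slot] = value

print(stack, storage)  # [] {0: 42}
```

The stack ends up empty and slot 0 holds the value we passed in, which matches the layouts above.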
Now, let's write the getValue() implementation:
We start with the macro definition. Again, we are not going to read anything from the stack or leave anything on it at the end of the function.
#define macro GET_VALUE() = takes(0) returns(0) {
[slot_0x00] // 1
sload // 2
0x00 // 3
mstore // 4
0x20 // 5
0x00 // 6
return // 7
}
1. [slot_0x00]: we first push our storage pointer to the stack.
Stack layout: [storage_ptr]
2. sload pops the top element from the stack (our storage pointer) and pushes the value stored in that slot.
Stack layout: [value]
3. We push 0x00 to the stack. This is the offset in memory at which we will store our value. The value is a uint256, so it occupies one 32-byte word.
Stack layout: [0x00, value]
4. The opcode mstore pops the top two elements. It stores the second element (value) in memory at the offset given by the first element (0x00).
Stack layout: []
Memory layout: [0x00 ===> value]
To return the value, we need to tell the EVM where our return value starts in memory, and how long it is.
Our return value occupies one 32-byte word (because it is a uint256), and it starts at offset 0x00 in memory.
5. We push 0x20 to the stack. This is 32 in hex, the length of our return value in bytes.
6. We push 0x00, the memory offset to start returning from.
Stack layout: [0x00, 0x20]
Memory layout: [0x00 ===> value]
7. Finally, we use the opcode return. It pops the top two elements from the stack and returns 0x20 bytes (second popped value) of memory starting at offset 0x00 (first popped value).
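Here is the same walkthrough as a small Python sketch, in case it helps to watch the value travel from storage to memory to the return data. I am assuming slot 0 already holds 42, and again the top of the stack is the end of the Python list. This only models the seven steps above, not the real EVM.

```python
WORD = 32

storage = {0x00: 42}      # assumption: a previous setValue call stored 42
memory = bytearray(WORD)  # one zeroed 32-byte word of memory
stack = []                # top of the stack is the END of this list

stack.append(0x00)                          # 1: push the storage slot
stack.append(storage.get(stack.pop(), 0))   # 2: sload -> [value]
stack.append(0x00)                          # 3: push the memory offset
offset = stack.pop()                        # 4: mstore pops the offset first...
value = stack.pop()                         #    ...then the value
memory[offset:offset + WORD] = value.to_bytes(WORD, "big")
stack.append(0x20)                          # 5: push the length (32 bytes)
stack.append(0x00)                          # 6: push the memory offset
ret_offset = stack.pop()                    # 7: return pops the offset first...
ret_size = stack.pop()                      #    ...then the size
return_data = bytes(memory[ret_offset:ret_offset + ret_size])

print(return_data.hex())
```

The caller gets back one 32-byte big-endian word, which is how a uint256 return value is encoded.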
That's it for our core logic!
Now we want a way to expose our functions to the world.
This is called dispatching, and we use the MAIN macro for it. Dispatchers typically start like this:
#define macro MAIN() = takes(0) returns(0) {
0x00 // 1
calldataload // 2
0xe0 // 3
shr // 4
}
1. Push 0x00 to the stack.
Stack layout: [0x00]
2. calldataload reads 32 bytes of calldata starting from offset 0x00. Now we have the first 32 bytes of the calldata on the stack: the 4-byte function selector followed by the beginning of the first parameter.
Stack layout: [functionSelectorParam1..]
3. We need to manipulate this word so that only the function selector is left on the stack. We will use shifting for that. We push 0xe0, which is 224 in hex. Since the word is 256 bits (32 bytes) and we want the leftmost 32 bits (4 bytes), we will shift it to the right by 224 bits, so the selector ends up in the rightmost 4 bytes.
Stack layout: [0xe0, functionSelectorParam1..]
4. shr pops the top two elements and shifts the second element (the calldata word) to the right by the value of the first element (0xe0, or 224 bits). Then it pushes the result back to the stack.
Stack layout: [function_selector]
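The shift is easy to check with plain integer arithmetic. The Python sketch below loads the first 32 bytes of some calldata as one big-endian number and shifts it right by 224 bits. The selector bytes and the function name extract_selector are placeholders I made up for the example.

```python
def extract_selector(calldata: bytes) -> int:
    # calldataload at offset 0x00: first 32 bytes, zero-padded if calldata is shorter
    word = int.from_bytes(calldata[:32].ljust(32, b"\x00"), "big")
    # shr by 0xe0 (224) bits: only the leftmost 4 bytes survive
    return word >> 0xE0


# Hypothetical selector followed by one uint256 argument (42)
calldata = bytes.fromhex("aabbccdd") + (42).to_bytes(32, "big")
selector = extract_selector(calldata)
print(hex(selector))  # 0xaabbccdd
```

Everything to the right of the first 4 bytes is shifted out, so the argument bytes never reach the comparison step.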
Now, all we need to do is find a match for this function selector among the functions in our interface.
To do that, we use a technique called linear matching. We compare what we have against the selectors in our contract one by one, and jump to the first match we find.
Because the checks run in order, we can put the "hot" functions at the very top, so the most frequent calls go through the fewest comparisons.
So let's do it!
#define macro MAIN() = takes(0) returns(0) {
...
dup1 // 1
__FUNC_SIG(setValue) // 2
eq // 3
setValue // 4
jumpi // 5
}
1. dup1 duplicates the top element on our stack, which is our function selector, and pushes the copy. I will explain why in a bit.
Stack layout: [function_selector, function_selector]
2. __FUNC_SIG(setValue) pushes the selector of the function setValue(uint256) to the stack for us to compare.
Stack layout: [setValue_selector, function_selector, function_selector]
3. The eq opcode pops the top two stack elements and compares them. It then pushes 1 if they are equal or 0 if they are not. Let's assume they are equal.
Stack layout: [1, function_selector]
Remember how we duplicated the selector in step 1 using dup1? We did that because eq consumes both of the values it compares. The copy stays on the stack, so if this comparison fails we can still compare the selector against the next function.
4. setValue pushes the jump destination of the setValue label.
Stack layout: [setValue_label, 1, function_selector]
5. jumpi pops the top two elements: the first is the jump destination and the second is the condition. If the condition is non-zero (here it is 1), we jump to the label setValue! We then pass execution to our function and let it handle the rest!
Stack layout: [function_selector]
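If the jumping feels abstract, the same idea in Python is just a loop over (selector, handler) pairs that stops at the first match. This is a sketch of the matching logic only: the selector values and handler results are placeholders, and raising an exception stands in for what a contract would do when nothing matches.

```python
def dispatch(selector, table):
    """Compare the selector against each entry in order and run the first match."""
    for known_selector, handler in table:
        if selector == known_selector:  # dup1, push the known selector, eq
            return handler()            # push the label, jumpi
    # Nothing matched: a contract would revert at this point.
    raise RuntimeError("no matching selector")


# Hypothetical selector values, for illustration only
SET_VALUE_SELECTOR = 0xAABBCCDD
GET_VALUE_SELECTOR = 0x11223344

# "Hot" functions go first so they are matched after fewer comparisons.
table = [
    (SET_VALUE_SELECTOR, lambda: "SET_VALUE"),
    (GET_VALUE_SELECTOR, lambda: "GET_VALUE"),
]

print(dispatch(GET_VALUE_SELECTOR, table))  # GET_VALUE
```

Note that getValue here costs two comparisons while setValue costs one, which is the ordering trade-off mentioned above.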
Now we need to add our labels to the main macro:
#define macro MAIN() = takes(0) returns(0) {
...
setValue: // 1
SET_VALUE()
getValue: // 2
GET_VALUE()
}
1. Under the setValue label, we invoke the macro for that function: SET_VALUE()
2. Under the getValue label, we invoke the macro for that function: GET_VALUE()
Finally, here is what our main macro looks like:
#define macro MAIN() = takes(0) returns(0) {
// fetching the selector
0x00 calldataload 0xe0 shr
// linear matching
dup1 __FUNC_SIG(setValue) eq setValue jumpi
dup1 __FUNC_SIG(getValue) eq getValue jumpi
// no match: revert before execution can reach the labels
0x00 0x00 revert
// labels
setValue:
SET_VALUE()
getValue:
GET_VALUE()
}
If we don't find any match, we revert using: 0x00 0x00 revert. It sits right after the comparisons and before the labels, so an unmatched call never falls through into one of our functions.
Yaaay! We did it! Here is the full code:
// interface
#define function setValue(uint256) nonpayable returns ()
#define function getValue() nonpayable returns (uint256)
// storage constant
#define constant slot_0x00 = FREE_STORAGE_POINTER()
// the function setValue
#define macro SET_VALUE() = takes(0) returns(0) {
0x04 // 1
calldataload // 2
[slot_0x00] // 3
sstore // 4
stop // 5: end execution so we don't fall through into whatever code follows
}
// the function getValue
#define macro GET_VALUE() = takes(0) returns(0) {
[slot_0x00] // 1
sload // 2
0x00 // 3
mstore // 4
0x20 // 5
0x00 // 6
return // 7
}
#define macro MAIN() = takes(0) returns(0) {
// fetching the selector
0x00 calldataload 0xe0 shr
// linear matching
dup1 __FUNC_SIG(setValue) eq setValue jumpi
dup1 __FUNC_SIG(getValue) eq getValue jumpi
// no match: revert before execution can reach the labels
0x00 0x00 revert
// labels
setValue:
SET_VALUE()
getValue:
GET_VALUE()
}
Let me know in the comments if you have any questions!