Go and Python SHA-256 Challenges and Learnings
Intro
One of the mundane tasks we developers do is integrating one service into another, preferably seamlessly.
If it is a third-party service with lacking or obscure documentation, then you are in for a treat (or is it a threat?). Fortunately, both services here are written and maintained by me, so I can easily integrate, test, and debug them.
I won’t go into much detail about the services, so let’s just call them A and B for the sake of simplicity. As usual, when we write services we also expose an interface such as a REST API, and then we add layers on top of it, especially one for security.
For these services, I went with the hash signature route since it is easy to implement and provides good security.
Investigation of the case
Enter sha256, which is undoubtedly more secure than typical MD5 or plain base64 encoding.
I won’t go into too much detail of course, since that would defeat the purpose of security quite a bit, so let’s jump straight into the intricacies of making service A (written in Python) produce the signature expected by service B (written in Go).
In Go, I have this simplified function (comments added for short explanations):
// Imports needed: bytes, crypto/sha256, encoding/base64, plus the service's logger and zap packages.
func GetSignature(secret string, vals ...string) string {
	// Write the values into the buffer
	buf := bytes.NewBuffer(make([]byte, 0, 128))
	for _, val := range vals {
		buf.WriteString(val)
	}
	// Write the buffer into the sha256 hash
	h := sha256.New()
	if _, err := h.Write(buf.Bytes()); err != nil {
		logger.Log().Error("GetSignature", zap.Error(err))
		return ""
	}
	// Use base64 to produce url-safe encoding of the resulting hash + secret
	hashed := base64.URLEncoding.EncodeToString(h.Sum([]byte(secret)))
	return hashed
}
Here’s the Python one which I thought would be a breeze:
from base64 import urlsafe_b64encode
from hashlib import sha256

def get_signature(secret: str, vals: str) -> str:
    # write the value (in bytes, using encode) into the sha256 object
    h = sha256()
    h.update(vals.encode())
    # use base64 to produce url-safe encoding of the resulting hash + secret
    enc: bytes = urlsafe_b64encode(h.digest())
    return enc.decode()
Note that at this point, I decided not to change the implementation on the Go side. This minimizes the debugging and workarounds needed.
The Python code obviously does not work because the secret key is not even used. Looking at the sha256 module, there is no equivalent of Go’s Sum function. I thought update would suffice.
Most sha256 resources online show h.Sum(nil) usage, but I decided to pass []byte(secret) instead.
I added h.update(secret.encode()) after h.update(vals.encode()), but to no avail.
I won’t show the details, but I inspected the bytes (in decimal) in both the Go and Python versions and found that everything is equal when secret is not in the equation. It is exactly the secret component that we need to solve. It is time to read what Sum really does.
The tricky part here was understanding the Sum function. The documentation of Sum says:
Sum appends the current hash to b and returns the resulting slice
So basically []byte(secret) + current hash state.
Here is the Python code after that incomplete understanding:
    # body of get_signature, after h = sha256()
    h.update(secret.encode())
    h.update(vals.encode())
    enc: bytes = urlsafe_b64encode(h.digest())
    return enc.decode()
But it still yielded a different result. I tried other encodings like ascii, utf-16, and so on.
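A quick check hints at why swapping encodings was unlikely to help. Go’s []byte(s) conversion exposes the string’s bytes as stored, which are UTF-8 for ordinary string literals, and Python’s encode() defaults to UTF-8 too. The sketch below uses a made-up ASCII-only value (text is a hypothetical name, not taken from either service):

```python
# Hypothetical ASCII-only value, used only for illustration.
text = "abc123"

utf8_bytes = text.encode("utf-8")    # same as text.encode()
ascii_bytes = text.encode("ascii")
utf16_bytes = text.encode("utf-16")  # prepends a 2-byte BOM, 2 bytes per char

print(list(utf8_bytes))
print(utf8_bytes == ascii_bytes)     # identical for ASCII-only text
print(len(utf8_bytes), len(utf16_bytes))
```

For ASCII-only input, ascii and utf-8 give the same bytes, while utf-16 moves further away from what Go sees, so the default encode() was already the right match.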
Interesting…
One of the low-level, hacker-y things to do in cases like this is to look at the in-memory representation of the variables. A debugger would really be helpful, but of course I went with print debugging instead of setting up a debugger for Python.
I tried Python’s encode and bytearray(x) functions, but they just print the string version… Eventually I found out about memoryview(input_str.encode()).tolist() to see the byte array of the hash state.
Why Python made that part harder instead of offering a simpler module or function beats me, oh well.
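For what it’s worth, here is a small sketch of that byte inspection. Besides the memoryview route, calling list() on a bytes object also yields the decimal values, which lines up with what Go shows when a []byte is printed with fmt.Println. The input value is a hypothetical placeholder:

```python
from hashlib import sha256

data = "hello".encode()            # hypothetical input
digest = sha256(data).digest()     # 32 raw bytes

# Both calls give the bytes as a list of decimal integers.
print(memoryview(data).tolist())
print(list(data))

# The same trick works on the digest, or use hex for a compact view.
print(list(digest))
print(digest.hex())
```

Printing the two lists side by side with the Go output makes it easy to spot where the byte sequences start to differ.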
There was something off with the bytes… Time for a Matrix-in-the-brain moment:
    # body of get_signature, after h = sha256()
    h.update(vals.encode())
    h2 = sha256()
    h2.update(secret.encode())
    h2.update(h.digest())
    enc: bytes = urlsafe_b64encode(h2.digest())
    return enc.decode()
Still incorrect, but my low-level programmer brain sensed that I was so close…
The key phrase (uppercased) in the Sum documentation is:
appends the CURRENT HASH to B (BYTES) and returns the resulting slice
which means that secret is not supposed to be hashed by sha256.
    # body of get_signature, after h = sha256()
    h.update(vals.encode())
    b = bytearray()
    b.extend(secret.encode())
    b.extend(h.digest())
    # encode secret + hash, not the hash alone
    enc: bytes = urlsafe_b64encode(b)
    return enc.decode()
There you go!
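Putting the pieces together, a self-contained version might look like the sketch below. It assumes the Python side should accept several values like the Go variadic signature does (the *vals parameter is my assumption, as are the sample inputs), and it decodes the result again only to show the layout of the bytes:

```python
from base64 import urlsafe_b64decode, urlsafe_b64encode
from hashlib import sha256


def get_signature(secret: str, *vals: str) -> str:
    """Sketch of the Go logic: base64url(secret bytes + sha256(vals))."""
    h = sha256()
    for val in vals:
        h.update(val.encode())  # only the values go through sha256
    # Counterpart of h.Sum([]byte(secret)): secret bytes first, digest appended.
    raw = secret.encode() + h.digest()
    return urlsafe_b64encode(raw).decode()


# Hypothetical inputs, for illustration only.
sig = get_signature("s3cr3t", "user", "42")
raw = urlsafe_b64decode(sig)
print(sig)
print(len(raw))  # length of the secret plus the 32-byte digest
```

Because the values are written back to back, passing "user" and "42" gives the same signature as passing "user42", just as with the Go buffer. Decoding the signature also shows that its leading bytes are the secret itself, unhashed, which is exactly what Sum’s append behaviour implies.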
Soli Deo Gloria