I discovered a DoS issue in Python’s marshal module. However, the Python community decided that this is not an (security) issue as it is already documented in marshal module that untrusted data is not supposed to be fed to that.

So, here is a way to segfault the Python interpreter, if anyone needs a reliable way to do it.

Last tested in Python 3.7.3 in Arch Linux on 16th July 2019. Should be possible with Python2 too. Scroll down to the end if you are impatient and just want to see code.

Description

By passing a malformed string as input to marshal.loads() an attacker can trigger a null pointer dereference resulting in DoS.

This happens because when a Python object is unmarshalled by reference, it is assumed that the target object is fully constructed. We can construct a marshal string such that it can reference partially constructed Python objects.

Example

tuple(FrozenSet(REF(0)))

Tuple -> FrozenSet -> REF(0)

When unmarshalling of the tuple object starts, a new PyTuple_New() object is created and its address is added to p->refs array before starting to parse and load all its children elements in a loop. A FrozenSet can be added as 0th element of this tuple. And then add the 0th element of this FrozenSet as p->refs[0]. After an element is added to FrozenSet, it tries to hash it believing that it is a completely constructed Python object.

While it tries to hash the original tuple, it does not have any valid addresses in ob_item array. This results in a null pointer dereference throwing a SIGSEGV and crashing of interpreter.

Running the below script results in a segmentation fault.

#!/usr/bin/env python3

import marshal
marshal.loads(b"\xa9\x01\xbe\x01\x00\x00\x00r\x00\x00\x00\x00")