rdiscount has an Out-of-bounds Read
### Summary
A signed length truncation bug causes an out-of-bounds read in the
default Markdown parse path. The length of any input larger than
`INT_MAX` bytes is truncated to a signed `int` before entering the
native parser, allowing the parser to read past the end of the
supplied buffer and crash the process.
### Details
In both public entry points:
- `ext/rdiscount.c:97`
- `ext/rdiscount.c:136`
`RSTRING_LEN(text)` is passed directly into `mkd_string()`:
```c
MMIOT *doc = mkd_string(RSTRING_PTR(text),
RSTRING_LEN(text), flags);
```
`mkd_string()` accepts `int len`:
- `ext/mkdio.c:174`
```c
Document *
mkd_string(const char *buf, int len, mkd_flag_t flags)
{
    struct string_stream about;
    about.data = buf;
    about.size = len;
    return populate((getc_func)__mkd_io_strget, &about, flags & INPUT_MASK);
}
```
The parser stores the remaining input length in a signed `int`:
- `ext/markdown.h:205`
```c
struct string_stream {
    const char *data;
    int size;
};
```
The read loop stops only when `size == 0`:
- `ext/mkdio.c:161`
```c
int __mkd_io_strget(struct string_stream *in)
{
    if ( !in->size ) return EOF;
    --(in->size);
    return *(in->data)++;
}
```
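If the Ruby string length exceeds `INT_MAX`, the narrowing conversion
to `int` can produce a negative value (on typical platforms with a
32-bit `int`, this happens for lengths from 2 GiB up to just under
4 GiB). Because `__mkd_io_strget()` checks only for `size == 0`, a
negative `size` never signals EOF. Each call decrements it further
while the parser keeps incrementing `data` and reading past the end of
the original Ruby string, causing an out-of-bounds read and a native
crash.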
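The arithmetic behind this can be illustrated with a small standalone sketch. It is not rdiscount or Discount code: `truncate_to_int32` and `reads_before_eof` are hypothetical helpers introduced only for illustration, and the sketch assumes a platform where `int` is 32 bits and narrowing wraps in two's complement (the C standard leaves the out-of-range conversion implementation-defined). The reader model only counts decrements up to a cap and never touches memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: model what a 64-bit length looks like after it is
 * narrowed to a 32-bit signed int on a two's-complement platform.
 * Uses unsigned arithmetic so the sketch itself stays well-defined. */
static int32_t truncate_to_int32(int64_t len)
{
    uint32_t low = (uint32_t)len;          /* keep only the low 32 bits */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;               /* high bit clear: non-negative */
    /* high bit set: map 0x80000000..0xFFFFFFFF onto INT32_MIN..-1 */
    return (int32_t)(low - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Hypothetical model of the `if (!size) return EOF; --size;` pattern:
 * counts how many bytes a reader would hand out before reporting EOF,
 * giving up after `cap` reads. */
static long reads_before_eof(int32_t size, long cap)
{
    int64_t remaining = size;  /* widened so the model cannot overflow */
    long reads = 0;

    assert(cap >= 0);
    while (remaining != 0 && reads < cap) {
        --remaining;           /* a negative value moves away from zero */
        ++reads;
    }
    return reads;
}
```

In this model a 5-byte input reports EOF after exactly five reads, while a 3 GiB length (3221225472 bytes) truncates to a negative size and the EOF condition is still unmet when the cap is reached.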
Affected APIs:
- `RDiscount.new(input).to_html`
- `RDiscount.new(input).toc_content`
### Impact
This is an out-of-bounds read whose main consequence is a reliable
denial of service. Impact is limited to deployments that parse
attacker-controlled Markdown and permit multi-GB inputs.
### Fix
Add a checked length guard before the `mkd_string()`
call in both public entry points:
- `ext/rdiscount.c:97`
- `ext/rdiscount.c:136`
For example:
```c
VALUE text = rb_funcall(self, rb_intern("text"), 0);
VALUE buf = rb_str_buf_new(1024);
Check_Type(text, T_STRING);
/* Read the length only after confirming that `text` is a String. */
long text_len = RSTRING_LEN(text);
if (text_len > INT_MAX) {
    rb_raise(rb_eArgError, "markdown input too large");
}
MMIOT *doc = mkd_string(RSTRING_PTR(text), (int)text_len, flags);
```
The same guard should be applied in `rb_rdiscount_toc_content()`
before its `mkd_string()` call.
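The range check itself does not depend on the Ruby C API, so its logic can be exercised in isolation. The following is a minimal sketch under that assumption. `checked_len_to_int` is a hypothetical helper name, not a function in rdiscount or Discount, and it additionally rejects negative lengths, which goes slightly beyond the `text_len > INT_MAX` comparison in the proposed fix.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrow a long length to int only when the value
 * is representable. Returns 1 and stores the result on success;
 * returns 0 and leaves *out untouched otherwise. */
static int checked_len_to_int(long len, int *out)
{
    assert(out != NULL);
    if (len < 0 || len > INT_MAX)
        return 0;              /* caller should reject the input */
    *out = (int)len;           /* safe: value is within [0, INT_MAX] */
    return 1;
}
```

In the extension, a failed check would correspond to the `rb_raise(rb_eArgError, ...)` branch, and only a successful check would reach `mkd_string()`. On platforms where `long` is itself 32 bits, the upper-bound comparison can never be true and the check is a no-op.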