browser_id: 1fb6fe4ead6815a993c7074d9ecdc5355f42e333
video_id: MCGL8ABJUdv7
session_pk: 120961
1
00:00:10,480 --> 00:00:19,480
So welcome. ...And especially let me know if I speak too quickly. Um, so -- who I am -- oh, yes so
2
00:00:19,480 --> 00:00:28,259
I will talk about opcodes and a little bit about the PE file format and their oddities. So, I've been
3
00:00:28,259 --> 00:00:35,085
a reverse engineer for some years, for some time. I created a project called Corkami.
4
00:00:35,085 --> 00:00:42,400
Also in the past I worked on the MAME arcade emulator, and professionally I am a malware analyst, but
5
00:00:42,400 --> 00:00:48,181
this is only on behalf of my hobbies; these are my own experiments and research at home.
6
00:00:48,181 --> 00:00:57,107
So, I introduced Corkami. Corkami is just the name of the project I created for ???? project.
7
00:00:57,107 --> 00:01:03,739
I tried to keep it just to the technical stuff, no ads, no login required.
8
00:01:03,739 --> 00:01:06,572
Really direct to the good stuff.
9
00:01:06,572 --> 00:01:12,400
I try to update it and make it useful, so I ??? created cheat sheets and the kind of easy to (command?)
10
00:01:12,400 --> 00:01:15,813
that I would use for work on a daily basis,
11
00:01:15,813 --> 00:01:18,646
but it's only a hobby; I do that once the kids are asleep
12
00:01:18,646 --> 00:01:23,200
and late at night, so it probably doesn't look as professional
13
00:01:23,200 --> 00:01:25,280
and as good as I would like it to be.
14
00:01:25,280 --> 00:01:30,581
So right now, Corkami, the form of Corkami, is wiki pages and cheat sheets
15
00:01:30,581 --> 00:01:37,593
and I focus on creating as many relevant proofs of concept as possible,
16
00:01:37,593 --> 00:01:42,888
so the binaries are (110?) usually I don't use a compiler, I create the (P?) myself
17
00:01:42,888 --> 00:01:46,400
so that it's only focusing on the exact interesting point
18
00:01:46,400 --> 00:01:49,040
and you don't have a lot of noise -- you probably don't even
19
00:01:49,040 --> 00:01:51,400
need IDA to actually understand what's going on
20
00:01:51,400 --> 00:01:54,985
because I try to focus only on what's important.
21
00:01:54,985 --> 00:01:58,282
The binaries are all directly available to download so you can
22
00:01:58,282 --> 00:02:01,301
really test your debugger, your tools, your knowledge
23
00:02:01,301 --> 00:02:03,840
and just get them directly from that.
24
00:02:03,840 --> 00:02:07,680
So far, I've focused on x86 assembly and the PE
25
00:02:07,680 --> 00:02:11,240
file format. A few other things, but that's mainly the most
26
00:02:11,240 --> 00:02:15,080
covered subject of my website. And I share that with a
27
00:02:15,080 --> 00:02:19,840
very permissive license, BSD, so you can reuse them commercially,
28
00:02:19,840 --> 00:02:24,614
whatever. Even the images are done in open-source format.
29
00:02:24,614 --> 00:02:29,360
So the story behind this presentation is that some time ago
30
00:02:29,360 --> 00:02:32,400
I was young and innocent and I thought that CPUs being
31
00:02:32,400 --> 00:02:38,453
electronic, they had to be perfectly logical, with no problems,
32
00:02:38,453 --> 00:02:41,611
and then I was tricked by malware. And basically
33
00:02:41,611 --> 00:02:46,600
IDA wasn't able to work on it, so I decided to go back
34
00:02:46,600 --> 00:02:49,784
to the basics and study assembly and PE files from scratch.
35
00:02:49,784 --> 00:02:52,920
I created in the meantime documents on Corkami
36
00:02:52,920 --> 00:02:57,667
and now I'm presenting to you more or less the final results,
37
00:02:57,667 --> 00:03:01,630
or the (good progress results?). If I wasn't -- if I was just a
38
00:03:01,630 --> 00:03:05,481
guy who learned assembly I probably wouldn't be at
39
00:03:05,481 --> 00:03:10,589
Hashdays to talk about it, if I didn't get a few achievements from
40
00:03:10,589 --> 00:03:13,880
various tools. So basically I (failed?) all the disassemblers that I tried
41
00:03:13,880 --> 00:03:22,106
and I also created a few crashes (in IDA?). I insist that all
42
00:03:22,106 --> 00:03:26,100
the authors were notified and most of the bugs are already fixed, but
43
00:03:26,100 --> 00:03:30,920
basically it was like this in 6.1 -- you get a direct crash -- but
44
00:03:30,920 --> 00:03:33,345
now it's fixed in 6.2, and everything.
45
00:03:33,345 --> 00:03:37,480
And (?) that's the latest version, but the newest and released
46
00:03:37,480 --> 00:03:41,440
, well the newest beta I fixed that and so on.
47
00:03:41,440 --> 00:03:44,880
So the agenda for the presentation is that I first try with
48
00:03:44,880 --> 00:03:50,853
an easy introduction, but I assume that most of you already know or are familiar with disassembly, right?
49
00:03:50,853 --> 00:03:57,354
Yes. And another question: are you all familiar with
50
00:03:57,354 --> 00:04:02,927
or you already had an event of undocumented disassembly in your (or never?).
51
00:04:02,927 --> 00:04:06,178
Like you trust IDA, or is that all.
52
00:04:06,178 --> 00:04:10,560
Like, is it a common thing to have an undocumented disassembly in IDA?
53
00:04:10,560 --> 00:04:14,360
Raise your arms -- okay, not so much.
54
00:04:14,360 --> 00:04:19,560
Okay. So then after the introduction (that will go quickly),
55
00:04:19,560 --> 00:04:25,360
I will mention a few tricks, then introduce CoST, the program that I created.
56
00:04:25,360 --> 00:04:28,960
And I will also talk a little bit more about the PE file format.
57
00:04:28,960 --> 00:04:34,088
So as you all have assembly knowledge I will go quickly on that.
58
00:04:34,088 --> 00:04:37,920
So basically, you compile a binary, there is assembly, there is
59
00:04:37,920 --> 00:04:44,259
some relevance, some common points between the [binary] code and the assembler code.
60
00:04:44,259 --> 00:04:48,763
Then of course there is a relation between the opcode and the [machine] code, you all know that.
61
00:04:48,763 --> 00:04:53,600
What is important is the assembly is generated by the compiler, but actually what is
62
00:04:53,600 --> 00:04:59,770
then from the assembly what is (???) only kept in the binary are the opcodes themselves, which are understood
63
00:04:59,770 --> 00:05:03,345
directly by the CPU, which means the CPU just knows
64
00:05:03,345 --> 00:05:07,400
what to do with the bytes, it doesn't care if you or the
65
00:05:07,400 --> 00:05:10,880
tool you're using knows what it will do, because it just does it.
66
00:05:10,880 --> 00:05:16,163
And the problem is that what we read is not usually the opcodes for most people but actually the disassembly
67
00:05:16,163 --> 00:05:20,600
and if the disassembler doesn't give you any result, well,
68
00:05:20,600 --> 00:05:25,440
we're stuck, we're blind, we don't know what execution will do.
69
00:05:25,440 --> 00:05:28,098
And the other problem is because of the opcode length you
70
00:05:28,098 --> 00:05:30,327
don't know what the next instruction will be because you
71
00:05:30,327 --> 00:05:32,324
don't know how to disassemble it.
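The point about opcode length can be illustrated with a toy linear disassembler. This is only a sketch: the three-entry opcode table and the function name `linear_sweep` are made up for the example, and a real disassembler might guess a length and carry on where this one simply stops.

```python
# Toy linear-sweep disassembler with a hypothetical, deliberately tiny table.
# Each entry maps the first opcode byte to (mnemonic, total instruction length).
KNOWN = {
    0x90: ("nop", 1),
    0xB8: ("mov eax, imm32", 5),
    0xC3: ("ret", 1),
}

def linear_sweep(code: bytes) -> list[str]:
    """Decode instructions one after another until an unknown opcode is hit."""
    out = []
    pos = 0
    while pos < len(code):
        entry = KNOWN.get(code[pos])
        if entry is None:
            # Unknown opcode: its length is unknown too, so the start of the
            # next instruction cannot be located reliably.
            out.append(f"{pos:04x}: ?? (0x{code[pos]:02X}), cannot continue")
            break
        mnemonic, length = entry
        out.append(f"{pos:04x}: {mnemonic}")
        pos += length
    return out

# nop, then the unknown byte D6, then instructions the tool never reaches
sample = bytes([0x90, 0xD6, 0xB8, 0x01, 0x00, 0x00, 0x00, 0xC3])
for line in linear_sweep(sample):
    print(line)
```

The `mov` and `ret` that follow the D6 byte never show up in the listing, even though the CPU would execute them without any trouble.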
72
00:05:32,324 --> 00:05:40,172
So, here just create one undocumented opcode in a simple program.
73
00:05:40,172 --> 00:05:48,640
So basically we just use the _emit keyword in -- that's Visual Studio 2010 Ultimate --
74
00:05:48,640 --> 00:05:52,247
you will get a byte that is unidentified at disassembly
75
00:05:52,247 --> 00:05:58,440
so you get question marks, so basically this program
76
00:05:58,440 --> 00:06:01,767
even though it costs several thousand dollars is not able
77
00:06:01,767 --> 00:06:05,018
to -- it doesn't know what will happen.
78
00:06:05,018 --> 00:06:09,104
So usually if you do that... Oh, yeah, if you check the Intel documentation
79
00:06:09,104 --> 00:06:14,259
there is nothing to see, the D6 opcode, there is nothing to see there.
80
00:06:14,259 --> 00:06:17,919
Microsoft doesn't say anything, Intel doesn't say anything,
81
00:06:17,919 --> 00:06:21,132
so usually if you try that you could expect a bad result.
82
00:06:21,132 --> 00:06:26,560
So, not documented directly: usually it is a crash or not the expected result.
83
00:06:26,560 --> 00:06:29,724
But here, in this case, this specific case, no problem.
84
00:06:29,724 --> 00:06:35,297
We don't know what it was, if we follow Intel or Microsoft documentation, we don't know what happened.
85
00:06:35,297 --> 00:06:41,287
But if we -- the CPU just does its stuff. So what happened is that actually
86
00:06:41,287 --> 00:06:49,136
D6 is a very simple opcode, that doesn't do much, but somehow it's not documented by Intel
87
00:06:49,136 --> 00:06:53,965
[but] it's documented by AMD, and most of the opcodes are actually documented by AMD
88
00:06:53,965 --> 00:06:58,200
but not Intel. I don't know why, if anyone has any idea why...
89
00:06:58,200 --> 00:07:04,136
It's quite a trivial opcode, but it's not -- Intel says there's nothing there. Okay.
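For reference, opcode D6 is commonly known as SALC ("set AL on carry"). A minimal model of its effect, assuming the usual description (AL becomes 0xFF if the carry flag is set and 0x00 otherwise, with the flags left untouched), could look like this. The function name is only illustrative.

```python
def salc(carry_flag: bool) -> int:
    """Model of opcode D6: return the new AL value.

    The previous AL value is ignored; only the carry flag matters.
    """
    return 0xFF if carry_flag else 0x00

print(hex(salc(True)))   # new AL when the carry flag is set
print(hex(salc(False)))  # new AL when the carry flag is clear
```

So the whole instruction is a one-byte way of copying the carry flag into every bit of AL.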
90
00:07:04,136 --> 00:07:08,320
So it's commonly used, the common uses for those undocumented opcodes are malware
91
00:07:08,320 --> 00:07:13,320
and unpackers, just to prevent automated analysis or easy reverse-engineering.
92
00:07:13,320 --> 00:07:22,294
What's funny is, Intel, if you follow the documentation you will have many holes, but Intel's own disassembler,
93
00:07:22,294 --> 00:07:25,219
XED, which is free to use, it is not open source, but just understands
94
00:07:25,219 --> 00:07:35,576
all these opcodes correctly, while Microsoft, and Visual Studio, and WinDBG, they follow blindly the documentation.
95
00:07:35,576 --> 00:07:43,080
So you will get question marks even though Intel knows perfectly what it does.
96
00:07:43,080 --> 00:07:52,340
So it's like "[...] do as I disassemble and don't read my documentation."
97
00:07:52,340 --> 00:08:01,160
So of course you could argue that Microsoft and WinDBG are only made to debug what the compiler, what
98
00:08:01,160 --> 00:08:08,083
the Microsoft compiler created, but then it kind of rules out WinDBG as a malware debugging tool,
99
00:08:08,083 --> 00:08:17,600
because you just insert D6, it's trivial, and WinDBG is just not able to tell you what the instructions
100
00:08:17,600 --> 00:08:25,591
are. So it's not very useful for malware analysis.
101
00:08:25,591 --> 00:08:32,760
So, another problem that happens is that of course each of the
102
00:08:32,760 --> 00:08:37,640
undocumented things, facts, are available, maybe one
103
00:08:37,640 --> 00:08:42,403
you will have in a trojan, one in an unpacker, but it's not so easy
104
00:08:42,403 --> 00:08:46,582
to find a good, exhaustive, clean test set to actually
105
00:08:46,582 --> 00:08:50,251
gather all these undocumented facts, so for example
106
00:08:50,251 --> 00:08:53,502
if someone says, a colleague mentions an undocumented
107
00:08:53,502 --> 00:08:55,880
opcode or behaviour, and then you say "oh yeah, it's
108
00:08:55,880 --> 00:08:58,720
in my books, or you skim this part of the file or whatever",
109
00:08:58,720 --> 99:59:59,000
and then you are actually, you know first it's a malware so you can
110
99:59:59,000 --> 99:59:59,000
not really spread that, and then there is a lot of noise -- the malware payload or something before and
111
99:59:59,000 --> 99:59:59,000
after -- so it's not so easy to analyse. So that's why I focused on creating a small and clean test
112
99:59:59,000 --> 99:59:59,000
set that would actually provide a (???) on one particular instruction or fact.
113
99:59:59,000 --> 99:59:59,000
So, now let's start, at last, the real stuff, and a few of the undocumented opcodes.
114
99:59:59,000 --> 99:59:59,000
But before I actually started, I was wondering what the actual possibilities of the CPUs are; I didn't even know
115
99:59:59,000 --> 99:59:59,000
what are the possibilities, what are the opcodes that are even supported or not by the
116
99:59:59,000 --> 99:59:59,000
CPU.
117
99:59:59,000 --> 99:59:59,000
And I think it's a bit like English, everybody, or most people in the world, would be able to read and
118
99:59:59,000 --> 99:59:59,000
understand these words, and if you see someone's disassembly then well you are used to seeing these opcodes,
119
99:59:59,000 --> 99:59:59,000
they are made by all the compilers and they are so common that if they are not here then we are a bit
120
99:59:59,000 --> 99:59:59,000
ill-at-ease, and if it is something different then we probably would be surprised.
121
99:59:59,000 --> 99:59:59,000
So this is standard English, but the Intel CPUs were made in the 70s, so it'd be the same as if you take
122
99:59:59,000 --> 99:59:59,000
Shakespearean English, so you could say that it's still English, but mmm... You know, I don't know what that means
123
99:59:59,000 --> 99:59:59,000
or maybe I forgot, I quickly forgot at least, and it's the same
124
99:59:59,000 --> 99:59:59,000
for those opcodes which are still supported by all the CPUs that we have -- all the Intel CPUs -- but
125
99:59:59,000 --> 99:59:59,000
we probably don't know what they actually do, and that's a problem.
126
99:59:59,000 --> 99:59:59,000
I actually made, one of the proof of concepts that I made was only using these old opcodes, and these
127
99:59:59,000 --> 99:59:59,000
old opcodes are actually doing something, so if someone is familiar with reading that, maybe I should
128
99:59:59,000 --> 99:59:59,000
ask "how old are you?", because myself I am used to the PUSH(?), JUMP(?) calls, but when it's about this,
129
99:59:59,000 --> 99:59:59,000
mmm... what is exactly being done. And it's still working on an i7, and it's still usable by malware
130
99:59:59,000 --> 99:59:59,000
packers or anything, and yet some of them are totally unused now and they are still fully working on
131
99:59:59,000 --> 99:59:59,000
modern CPUs.
132
99:59:59,000 --> 99:59:59,000
And of course, it's a bit like English, it's an evolving language, and a bit like maybe the oldest generations
133
99:59:59,000 --> 99:59:59,000
of people, where they are not used to the new buzzwords, the latest buzzwords.
134
99:59:59,000 --> 99:59:59,000
These opcodes they are sometimes present in the most recent CPUs, so, and you have direct opcodes for
135
99:59:59,000 --> 99:59:59,000
CRC32 or AES decryption, string matching, and then some complex operation, in just one opcode.
136
99:59:59,000 --> 99:59:59,000
So this, this is possible, this exists in modern CPUs. Not all of them, of course.
137
99:59:59,000 --> 99:59:59,000
One thing that I like is the MOVBE -- move big endian -- opcode, because move big endian is the rejected
138
99:59:59,000 --> 99:59:59,000
offspring, it's only implemented in the Atom CPU, which means this netbook has support for this opcode
139
99:59:59,000 --> 99:59:59,000
and the i7 64-bit doesn't have this opcode, even though it will have CRC32 or maybe AES code, so much
140
99:59:59,000 --> 99:59:59,000
for complete backward compatibility.
141
99:59:59,000 --> 99:59:59,000
There is no physical CPU, as far as I know, that can execute both CRC32 and move big endian.
142
99:59:59,000 --> 99:59:59,000
And of course, move big endian is quite meaningless itself because you already have an opcode for the
143
99:59:59,000 --> 99:59:59,000
endian-ness swapping. So I don't know, this small computer has an opcode that most PCs don't.
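To make the redundancy concrete, here is a small model of a 32-bit MOVBE load next to a plain little-endian load followed by a byte swap (the existing swapping opcode is BSWAP). The helper names are made up for this sketch and the "memory" is just a Python bytes object.

```python
def mov_load32(memory: bytes, offset: int) -> int:
    """Plain MOV: x86 reads the four bytes as a little-endian value."""
    return int.from_bytes(memory[offset:offset + 4], "little")

def bswap32(value: int) -> int:
    """BSWAP: reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")

def movbe_load32(memory: bytes, offset: int) -> int:
    """MOVBE: read the same four bytes, but as a big-endian value."""
    return int.from_bytes(memory[offset:offset + 4], "big")

mem = bytes([0x11, 0x22, 0x33, 0x44])
print(hex(mov_load32(mem, 0)))
print(hex(bswap32(mov_load32(mem, 0))))
print(hex(movbe_load32(mem, 0)))
```

In this model the last two values are always equal, which is the sense in which MOVBE adds nothing that a load plus a swap could not already do.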
144
99:59:59,000 --> 99:59:59,000
Okay. Why? I don't know. If you know...
145
99:59:59,000 --> 99:59:59,000
[Audience member:] "Is this opcode documented in the CPU feature set?"
146
99:59:59,000 --> 99:59:59,000
Yeah.
147
99:59:59,000 --> 99:59:59,000
Yeah, it's totally -- this MOVBE -- it's totally documented, it's official.
148
99:59:59,000 --> 99:59:59,000
[Audience member:] "But, no; is it like a CPU flag just for this instruction or is it implicit by 'this
149
99:59:59,000 --> 99:59:59,000
is an Atom CPU'?"
150
99:59:59,000 --> 99:59:59,000
Uh... Yeah, I don't know. I check the value for CPUID but I don't know if it's relevant to the... but
151
99:59:59,000 --> 99:59:59,000
I think it's by itself. The opcode (???). But the CPUID result is so big that I don't remember it all.
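On the CPUID question: as far as Intel's CPUID documentation goes, these instructions are reported through individual feature bits in ECX of leaf 1 (bit 20 for SSE4.2, which includes CRC32; bit 22 for MOVBE; bit 25 for AES). The sketch below only decodes an ECX value that is passed in. The two sample values are hypothetical and were not read from real hardware.

```python
# Feature bits in CPUID leaf 1, register ECX.
FEATURE_BITS = {"SSE4.2 (CRC32)": 20, "MOVBE": 22, "AES": 25}

def decode_features(ecx: int) -> dict[str, bool]:
    """Report which of the three feature bits are set in an ECX value."""
    return {name: bool((ecx >> bit) & 1) for name, bit in FEATURE_BITS.items()}

# Hypothetical ECX values for illustration only.
netbook_like = 1 << 22                 # only the MOVBE bit set
desktop_like = (1 << 20) | (1 << 25)   # SSE4.2 and AES set, MOVBE clear

print(decode_features(netbook_like))
print(decode_features(desktop_like))
```

A program would check the MOVBE bit itself and not infer support from the CPU model name.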
152
99:59:59,000 --> 99:59:59,000
Uh, another thing, a bit specific to those in my case, because I focus on malware, is that before you do
153
99:59:59,000 --> 99:59:59,000
actually any opcode, I was focusing on what are the register values when you start a program, and I found
154
99:59:59,000 --> 99:59:59,000
out that the register values by default when you start a program and you haven't executed, theoretically, any opcode,
155
99:59:59,000 --> 99:59:59,000
theoretically, actually give you some information that is actively used in malware.
156
99:59:59,000 --> 99:59:59,000
So for example, at the start point, EAX tells you whether it's an older generation (XP or before),
157
99:59:59,000 --> 99:59:59,000
or Vista or later.
158
99:59:59,000 --> 99:59:59,000
This is not used so much by malware, I don't recall seeing it, but GS, if GS is nil(?) then it's a 32-bit
159
99:59:59,000 --> 99:59:59,000
system, and if it's not it's a 64-bit system.
160
99:59:59,000 --> 99:59:59,000
I will actually use that later in one of the tricks.
161
99:59:59,000 --> 99:59:59,000
And also, the relations between the registers -- there are many registers on the Intel CPUs -- are sometimes
162
99:59:59,000 --> 99:59:59,000
not very clear. I was surprised that when you do an FPU operation it changes the FPU status, the
163
99:59:59,000 --> 99:59:59,000
FPU registers themselves, but also the MMX registers, and somehow all the documentation I saw on the
164
99:59:59,000 --> 99:59:59,000
internet is always mapping ST0 and MM0 in front of each other, which makes sense, but actually if you
165
99:59:59,000 --> 99:59:59,000
modify, if you just do a single FPU operation, it will actually modify not MM0 but MM7.
166
99:59:59,000 --> 99:59:59,000
So if you do an FPU operation like "load (???)" and then you check the value of MM7, that could be used
167
99:59:59,000 --> 99:59:59,000
as a trick or it's just like the way it is.
168
99:59:59,000 --> 99:59:59,000
And like, all the documentation, Wikipedia and so on, that I could find about the overlapping of the
169
99:59:59,000 --> 99:59:59,000
registers.
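The MM7 result follows from how the FPU stack is addressed: MMn aliases physical register Rn, while ST(i) is the physical register (TOP + i) mod 8, and a load decrements TOP before writing. A minimal model, with a made-up class name and with values stored as plain Python objects where the real hardware has 80-bit registers:

```python
class FpuStack:
    """Toy model of the x87 register stack and its MMX aliasing."""

    def __init__(self):
        self.regs = [None] * 8  # physical registers R0..R7; MMn aliases Rn
        self.top = 0            # TOP field right after initialisation

    def push(self, value):
        # A load decrements TOP (modulo 8) and writes the new ST0.
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def st(self, i):
        # ST(i) is relative to TOP.
        return self.regs[(self.top + i) % 8]

    def mm(self, n):
        # MMn is always the physical register Rn, regardless of TOP.
        return self.regs[n]

fpu = FpuStack()
fpu.push(1.0)  # a single load on a freshly initialised FPU
print(fpu.top, fpu.st(0), fpu.mm(7), fpu.mm(0))
```

After one load TOP is 7, so ST0 and MM7 are the same physical register and MM0 is still untouched.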
170
99:59:59,000 --> 99:59:59,000
Another thing, which was used as an anti-emulation trick in XP, is that the FPU also changes CR0,
171
99:59:59,000 --> 99:59:59,000
so you have quite an unexpected anti-emulation trick by just using an FPU operation.
172
99:59:59,000 --> 99:59:59,000
So here it is; basically 'store machine status word' is an older 286 CPU opcode, or mnemonic, that was
173
99:59:59,000 --> 99:59:59,000
created in the 286 era, so before the protected mode was fully created, and so it allows you to access,
174
99:59:59,000 --> 99:59:59,000
to read the value of CR0, even from user mode, while 'move CR0' is actually a privileged opcode.
175
99:59:59,000 --> 99:59:59,000
For some reason, the higher word of the register is undefined officially by the documentation, so Intel
176
99:59:59,000 --> 99:59:59,000
just says "this is the value, the (???) value is correct but you cannot expect the real value". So for
177
99:59:59,000 --> 99:59:59,000
some reason, I don't know why they say that, because it's actually the value, the higher bits, of CR0.
178
99:59:59,000 --> 99:59:59,000
And under XP, when you do FPU operations, the value of CR0 will be modified, and eventually reverts
179
99:59:59,000 --> 99:59:59,000
by itself. So you can have just by doing -- 'store machine status word', you expect the result, then
180
99:59:59,000 --> 99:59:59,000
you do an FPU operation, then the result should be different, and then eventually the result will revert
181
99:59:59,000 --> 99:59:59,000
to the original value. So it's quite a tricky and unexpected anti-emulator.
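The check amounts to comparing two readings of the machine status word and seeing which bits differ. The sketch below does only that comparison. The bit names are the documented low bits of CR0, but the two sample readings are hypothetical, and the idea that the bit that flips is TS (task switched) is an assumption that the talk does not state.

```python
# Names of the low CR0 bits, as documented for the machine status word.
CR0_LOW_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE"}

def changed_bits(before: int, after: int) -> list[str]:
    """List the names (or positions) of the bits that differ between two readings."""
    diff = before ^ after
    return [CR0_LOW_BITS.get(bit, f"bit{bit}") for bit in range(32) if (diff >> bit) & 1]

# Hypothetical readings: one taken before an FPU operation, one right after.
reading_before = 0x8001003B
reading_after = 0x80010033

print(changed_bits(reading_before, reading_after))
print(changed_bits(reading_before, reading_before))
```

An emulator that always returns the same value for the status word would give an empty list in both cases, which is what makes the difference detectable.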
182
99:59:59,000 --> 99:59:59,000
You have a similar trick on 32-bit Windows, where GS is not stored in the context, so it means that on
183
99:59:59,000 --> 99:59:59,000
thread-switch the value of GS is lost, which means if you just wait for something, GS will eventually
184
99:59:59,000 --> 99:59:59,000
reset to 0. So if you set GS and you are stepping manually, this is slow and this creates a thread-switch,
185
99:59:59,000 --> 99:59:59,000
so incidentally GS is lost. And also, like the previous trick, if you just wait for GS not to be...
186
99:59:59,000 --> 99:59:59,000
if you just loop while GS is not 0, this, on a real system, will eventually exit from the loop.
187
99:59:59,000 --> 99:59:59,000
But the first time, it blew my mind, I was really wondering what can happen there, there's not (???). And
188
99:59:59,000 --> 99:59:59,000
of course in my proof of concept, it directly starts like this. What happens? What should happen now,
189
99:59:59,000 --> 99:59:59,000
but on a real system? Eventually, it's reset to 0.
190
99:59:59,000 --> 99:59:59,000
Another thing is that of course it's reset to 0, but not in 0 time, so if you do wait for GS's reset
191
99:59:59,000 --> 99:59:59,000
and then another loop, this can only happen between two reset... thread switch, which means it should
192
99:59:59,000 --> 99:59:59,000
take a minimum of time, so you can use that for timing -- anti-emulation timing tricks.
193
99:59:59,000 --> 99:59:59,000
Of course, I was also thinking that NOP is perfect, because NOP is NOP, it does nothing.
194
99:59:59,000 --> 99:59:59,000
But originally NOP is 'exchange EAX with EAX', or 'AX with AX', but the problem is that NOP (???)
195
99:59:59,000 --> 99:59:59,000
but on 64-bit you always have, you have another encoding to do an 'exchange EAX, EAX' which this time again