Author Topic: I can read dll  (Read 22298 times)

FatherXmas

  • Elite Boss
  • *****
  • Posts: 1,646
  • You think the holidays are bad for you ...
Re: I can read dll
« Reply #40 on: May 25, 2013, 03:58:49 PM »
It strikes me that Joshex may be someone brought up on scripting languages that aren't compiled at all, and are therefore always readable, or on ones that are partially compiled into p-code (Java), with the final JIT native compilation done at runtime, where the p-code can be used to get back to the original source.

It can be a very different PoV from that of those of us brought up on compiled languages, where, if not for a map file, the source code and a proper debugger, "unblending the frog" wouldn't be possible.
Tempus unum hominem manet

Twitter - AtomicSamuraiRobot@NukeSamuraiBot

Joshex

  • [citation needed]
  • Elite Boss
  • *****
  • Posts: 1,027
    • my talk page
Re: I can read dll
« Reply #41 on: May 25, 2013, 06:21:29 PM »
Quote
It strikes me that Joshex may be someone brought up on scripting languages that aren't compiled at all, and are therefore always readable, or on ones that are partially compiled into p-code (Java), with the final JIT native compilation done at runtime, where the p-code can be used to get back to the original source.

It can be a very different PoV from that of those of us brought up on compiled languages, where, if not for a map file, the source code and a proper debugger, "unblending the frog" wouldn't be possible.

Precisely. My father taught me binary and hex, but the first programming I ever did was in QBasic, where I did little more than draw a picture on the screen pixel by pixel using "PRINT".

Now I focus on object-oriented Python code. It is heavily simplified, much like ActionScript; yeah, from time to time I do use xor, and, or, etc., but it is indeed a simple programming language.

And I'll say this: most game developers nowadays use something similar. Even in CoH's time, from what I'm seeing in the client file, they relied heavily on defining their own programming terms, because it's just quicker to write scripts that way.

Hexadecimal versus binary: I say 'convert' because I used to make game enhancement codes for Game Genie (SNES), which required you to learn that hexadecimal is actually just another number system. And it's true; instead of 11 it's just B. Technically, in order to read hexadecimal as binary you would have to "convert" it to the proper number system, though I suppose it's more of a translation than a conversion.
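
A quick way to see the "just another number system" point is to let Python do the translation. This is only a sketch; the helper name to_binary is made up for illustration, and the sample values are arbitrary.

```python
# Hex, decimal and binary are different spellings of the same number.
# to_binary is a hypothetical helper name, not from any library.
def to_binary(hex_text):
    """Return the binary spelling of a hex string, 4 bits per hex digit."""
    value = int(hex_text, 16)                      # parse base-16 text
    return format(value, "0{}b".format(4 * len(hex_text)))

b_as_decimal = int("B", 16)        # the hex digit B is decimal 11
b_as_binary = to_binary("B")       # one hex digit -> four bits
byte_as_binary = to_binary("5F")   # two hex digits -> eight bits

print(b_as_decimal, b_as_binary, byte_as_binary)
```

Each hex digit maps to exactly four bits, which is why hex editors show bytes as two hex digits instead of eight 1s and 0s.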
« Last Edit: May 25, 2013, 06:42:14 PM by Joshex »
There is always another way. But it might not work exactly like you may desire.

A wise old rabbit once told me "Never give-up!, Trust your instincts!" granted the advice at the time led me on a tripped-out voyage out of an asteroid belt, but hey it was more impressive than a bunch of rocks and space monkies.

FatherXmas

  • Elite Boss
  • *****
  • Posts: 1,646
  • You think the holidays are bad for you ...
Re: I can read dll
« Reply #42 on: May 25, 2013, 07:34:50 PM »
Except that Cryptic, Paragon, ArenaNet, Carbine and just about every console or PC/Mac game developer are looking for C and C++ (Objective C for Mac) programmers and not Java, Ruby or Python programmers.  While it's true that mission creation and the like are scripted, the underlying code that runs those scripts is still compiled beforehand from a proper high-level language.

I'm not talking about browser-based MMOs, since they tend to be written in Java or JavaScript/WebGL.

ROBOKiTTY

  • Boss
  • ****
  • Posts: 183
  • KiTTYRiffic
    • KiTTYLand
Re: I can read dll
« Reply #43 on: May 25, 2013, 07:58:52 PM »
CoH was written in C, so you can't meaningfully decompile that. On the plus side, disassembly gets you close enough to the original thing... not that it's any help without a lot of time and experience with asm.
Have you played with a KiTTY today?

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: I can read dll
« Reply #44 on: May 25, 2013, 08:08:52 PM »
COH was definitely written in C -- Microsoft Visual C 2005 to be exact (its compiler fingerprints are all over the place). That does make the disassembly a little easier to deal with.

Part of the authentication protocol uses a small but important portion written in C++ (it turns out that it's a third-party library linked in). That was a bit of a pain to untangle due to its excessive use of templates and virtual functions. Not impossible, just really annoying.

The frog in a blender is a good example -- you can't get the source code back from a compiled program like you can with obfuscated scripts. The best you can do is generate something that compiles to the same machine code, but you'll be missing important things like what the variables and functions are named, and complex constructs will often be broken down into something simpler.

The ability for entire games to be written in something like JavaScript and run in a canned engine like Unity or Blender's game engine is a very recent phenomenon. It's only due to PCs being so incredibly fast and powerful now that the extra bloat from using an interpreted language isn't as detrimental to performance.

srmalloy

  • Elite Boss
  • *****
  • Posts: 450
Re: I can read dll
« Reply #45 on: May 25, 2013, 11:39:36 PM »
Anyway, neither what I said nor the way I said it was very civil. What I meant is that reverse engineering requires a degree of competence in a fairly niche skill set.

If this had been dropped in my lap within the first, oh, decade after I left college, I'd be much better set up to work with it; as it is, I've spent so much time doing datamining and other database work -- the last sixteen years working with an extremely niche programming language and database architecture -- that my to-the-metal programming and reverse-engineering skills have deteriorated. I can look at a specification for how something has to work, or see it in operation, and see the logic for (at least one way) the code could be constructed, and what database structures it would need to interact with, but working backwards from binary to source that can be redeveloped isn't part of my skillset any more.

Zombie Man

  • Elite Boss
  • *****
  • Posts: 296
Re: I can read dll
« Reply #46 on: May 26, 2013, 12:52:48 AM »
Any sign of the Lua that the Devs were drooling over, which was implemented in the last Beta and about to go live, and which was going to let content designers script stuff much more easily?

TonyV

  • Titan Staff
  • Elite Boss
  • ****
  • Posts: 2,175
    • Paragon Wiki
Re: I can read dll
« Reply #47 on: May 26, 2013, 06:15:16 AM »
This is just one function of many. The EXE contains nearly 23,000 of them. You sure you got those all figured out over the span of a few hours? 'Cause we have information on what they all do, and it took us more than just overnight to get the job done.

I've hinted at this several times, but didn't know exactly how public you guys wanted this knowledge to be.  But yeah, at this point, I don't think that disassembling the file formats, protocols or client source code is an issue.  At this point, it's a matter of reconstructing a server that obeys those protocols and responds with answers that the client accepts and understands.  Not a trivial challenge to be certain, but it's probably worth noting that a lot of the really tedious, time-consuming work has been completed.  (And not just in the six months since the game has shut down, either.  A lot of this work has been ongoing for literally years; as stated, it was the basis of applications like Sentinel and pulling source data for City of Data and Mids.)

This is one of the reasons I've been so optimistic and insistent that we will have City of Heroes in some form back at some point.

The Fifth Horseman

  • Elite Boss
  • *****
  • Posts: 961
  • Outside known realities.
Re: I can read dll
« Reply #48 on: May 26, 2013, 09:19:11 AM »
Quote
Any sign of the Lua that the Devs were drooling over, which was implemented in the last Beta and about to go live, and which was going to let content designers script stuff much more easily?
Correct me if I'm wrong, but wasn't that supposed to be used for controlling things server-side rather than client-side?
We were heroes. We were villains. At the end of the world we all fought as one. It's what we did that defines us.
The end occurred pretty much as we predicted: all servers redlining until midnight... and then no servers to go around.

Somewhere beyond time and space, if you look hard you might find a flash of silver trailing crimson: a lone lost Spartan on his way home.

Kyriani

  • Elite Boss
  • *****
  • Posts: 299
Re: I can read dll
« Reply #49 on: May 26, 2013, 12:11:22 PM »
Quote
This is one of the reasons I've been so optimistic and insistent that we will have City of Heroes in some form back at some point.

Can I just say your confidence here fills me with hope that it won't be too long before I am flying through paragon once more!

Joshex

  • [citation needed]
  • Elite Boss
  • *****
  • Posts: 1,027
    • my talk page
Re: I can read dll
« Reply #50 on: May 26, 2013, 12:45:02 PM »
Quote
COH was definitely written in C -- Microsoft Visual C 2005 to be exact (its compiler fingerprints are all over the place). That does make the disassembly a little easier to deal with.

Part of the authentication protocol uses a small but important portion written in C++ (it turns out that it's a third-party library linked in). That was a bit of a pain to untangle due to its excessive use of templates and virtual functions. Not impossible, just really annoying.

The frog in a blender is a good example -- you can't get the source code back from a compiled program like you can with obfuscated scripts. The best you can do is generate something that compiles to the same machine code, but you'll be missing important things like what the variables and functions are named, and complex constructs will often be broken down into something simpler.

The ability for entire games to be written in something like JavaScript and run in a canned engine like Unity or Blender's game engine is a very recent phenomenon. It's only due to PCs being so incredibly fast and powerful now that the extra bloat from using an interpreted language isn't as detrimental to performance.

I really need to learn to type better. I thought I did mention that I noticed there were tons of .c scripts in the client?

Fact is they are just scripts, no different than me adding a .py into a game. They contain a function that cannot be completed with simple preconstructed programming terms like, for example, GetDamage or such. I do know that C++ is actually very similar to Python; I've had people tell me that, and I've had other people tell me it's similar to Flash ActionScript. I suppose Python contains BOTH manners of programming, both simple and complex.

ROBOKiTTY

  • Boss
  • ****
  • Posts: 183
  • KiTTYRiffic
    • KiTTYLand
Re: I can read dll
« Reply #51 on: May 26, 2013, 02:01:24 PM »
If C++ is similar to Python/ActionScript, it's because Python/ActionScript are heavily influenced by C++. But both abstract away the needless complexities of C++.

C/C++ code is compiled to object code, which is nothing like the intermediate bytecode that Python and ActionScript are compiled to. It's simple enough to decompile bytecode to a form structurally identical with the original, but reconstructing source code from object code is something else entirely.

GuyPerfect

  • Mary Poppins
  • Titan Staff
  • Elite Boss
  • ****
  • Posts: 1,740
Re: I can read dll
« Reply #52 on: May 26, 2013, 02:49:27 PM »
Quote
I've hinted at this several times, but didn't know exactly how public you guys wanted this knowledge to be.  But yeah, at this point, I don't think that disassembling the file formats, protocols or client source code is an issue.

I don't mind people knowing we've hacked the EXE. By now, I figured it was obvious. It's just good, clean fun! The things that shouldn't be public knowledge are the ones that really don't need to be public...

You know the ones I mean.



Quote
[...] I noticed there was tons of .c scripts in the client?

fact is they are just scripts, not different than me adding a .py into a game [...]

Everywhere I look...

Quote
It's simple enough to decompile bytecode to a form structurally identical with the original, but reconstructing source code from object code is something else entirely.

That's not a given. In its most rudimentary form, bytecode is simply instructions that generally do not represent the actual machine code of a CPU architecture.

The reason Java and .NET bytecode can be pulled apart so easily is because they're designed to be useful for debugging: when a program bombs, you get a nice detailed report of the problem, including the line numbers where things went awry. That debugging information is what's useful for figuring out what the original probably looked like.
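
Python bytecode happens to carry the same kind of metadata, so here is a rough sketch of what "debugging information" looks like from the inside. The function get_damage is invented for the example; the attributes and the dis call are standard Python.

```python
import dis

# get_damage is a made-up function, used only to have something to inspect.
def get_damage(base, bonus):
    total = base + bonus
    return total

code = get_damage.__code__

# Names survive compilation: the code object still knows what the
# function and its local variables were called.
function_name = code.co_name
local_names = code.co_varnames

# Line numbers survive too: each run of bytecode maps back to a source line.
first_line = code.co_firstlineno
line_numbers = sorted({line for _, line in dis.findlinestarts(code)
                       if line is not None})

print(function_name, local_names, first_line, line_numbers)
```

Native object code built without debug symbols keeps none of this, which is why the disassembly route means inventing your own names for everything.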

The Fifth Horseman

  • Elite Boss
  • *****
  • Posts: 961
  • Outside known realities.
Re: I can read dll
« Reply #53 on: May 26, 2013, 07:29:25 PM »
Quote
Everywhere I look...
At least he's trying. I've seen people with BSc in CS who can't write a for loop to save their lives (and I'd love to say that's an exaggeration... except one of them admitted as much lately).

Whiteseeker

  • Underling
  • *
  • Posts: 9
Re: I can read dll
« Reply #54 on: May 26, 2013, 08:45:21 PM »
All you smarter than me people keep doing what you're doing! I want my coh back =\

MAN!!!! This this this.

As you can see in my profile, I've been on "this site" since, I believe, 2008, with only 6 posts so far. Amazing, huh... I like to just read, but I'm getting very antsy lately from months of CoX withdrawal. Please tell me if you guys have it so it can be played at least on just my compy! I won't tell, I swear!
I had ICON running a bit ago and my wife ran in and started to cry 'cause she thought for a moment that CoX was back up and running. Man, did I feel bad.
« Last Edit: May 26, 2013, 09:04:35 PM by Whiteseeker »
CoH player since beginning.

Joshex

  • [citation needed]
  • Elite Boss
  • *****
  • Posts: 1,027
    • my talk page
Re: I can read dll
« Reply #55 on: May 26, 2013, 09:22:26 PM »
« Last Edit: May 27, 2013, 02:03:33 AM by Joshex »

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: I can read dll
« Reply #56 on: May 27, 2013, 05:20:57 PM »
Quote
I suppose my whole point with this thread at this point in the conversation is to say that I know enough about game design to notice things when I read through these documents: styles, the way their code is set up and how it arbitrarily links to .c scripts and .bounds files.

You're seeing references to .c source files only because a programmer put them there on purpose. They're part of the string table, so you have to find where they're referenced from. In a C program, the compiler will lump all of the strings (text) used by the program together, often eliminate duplicates, and then the actual compiled code will reference those strings.

Here's an example from a relatively simple function that is called when you click a choice in a reward table popup:

Code: [Select]
CPU Disasm
Address   Hex dump          Command                                  Comments
005E2970  /$  51            PUSH ECX
005E2971  |.  A1 E05D6801   MOV EAX,DWORD PTR DS:[1685DE0]
005E2976  |.  56            PUSH ESI
005E2977  |.  8B35 DC3BB900 MOV ESI,DWORD PTR DS:[0B93BDC]
005E297D  |.  57            PUSH EDI
005E297E  |.  6A 5F         PUSH 5F                                  ; arg2 = 95
005E2980  |.  6A 01         PUSH 1                                   ; arg1 = 1
005E2982  |.  68 3F0A0000   PUSH 0A3F                                ; line = 2623
005E2987  |.  68 200AAB00   PUSH OFFSET 00AB0A20                     ; ASCII "C:\buildfarm\slave\full_release\release\2400\code\coh\Game\UI\uiNet.c"
005E298C  |.  8BF8          MOV EDI,EAX
005E298E  |.  E8 BD992700   CALL 0085C350                            ; send_pack_bits_debug
005E2993  |.  56            PUSH ESI                                 ; arg2 = [0B93BDC]
005E2994  |.  6A 03         PUSH 3                                   ; arg1 = 3
005E2996  |.  68 400A0000   PUSH 0A40                                ; line = 2624
005E299B  |.  68 200AAB00   PUSH OFFSET 00AB0A20                     ; ASCII "C:\buildfarm\slave\full_release\release\2400\code\coh\Game\UI\uiNet.c"
005E29A0  |.  8BC7          MOV EAX,EDI
005E29A2  |.  E8 A9992700   CALL 0085C350                            ; send_pack_bits_debug
005E29A7  |.  83C4 20       ADD ESP,20
005E29AA  |.  5F            POP EDI
005E29AB  |.  5E            POP ESI
005E29AC  |.  59            POP ECX
005E29AD  \.  C3            RETN

The parts after the semicolons are comments added by me to clarify what some of the lines do for reference. The function name "send_pack_bits_debug" doesn't actually exist in the EXE, it's just something I made up after figuring out what the function at that address does.

As you can see, the machine code is using that string to add a file and line number parameter to certain functions for debugging purposes. The only reason it's there is so that the program can, at run-time, report which source file a particular call came from if there's a problem. Not all calls have this (if they did it would be much easier to reconstruct what everything does), but a few do. Probably ones where the debug code never got removed before release.
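
For anyone who wants to check the hex dump column against the decoded column by hand, here is a small sketch of the arithmetic. The helper names le32 and call_target are made up; the byte values are copied from the listing above.

```python
import struct

def le32(hex_bytes):
    """Decode four little-endian bytes, given as hex text, to an unsigned int."""
    return struct.unpack("<I", bytes.fromhex(hex_bytes))[0]

def call_target(address, rel32_hex):
    """Target of an E8 CALL: address of the next instruction plus a signed offset."""
    rel = struct.unpack("<i", bytes.fromhex(rel32_hex))[0]
    return (address + 5 + rel) & 0xFFFFFFFF   # the E8 instruction is 5 bytes long

# Operands of the PUSH (opcode 68) instructions
line_first = le32("3F0A0000")     # PUSH 0A3F
line_second = le32("400A0000")    # PUSH 0A40
string_addr = le32("200AAB00")    # PUSH OFFSET 00AB0A20

# Both CALLs use different offsets but land on the same function
target_first = call_target(0x005E298E, "BD992700")
target_second = call_target(0x005E29A2, "A9992700")

# Four 4-byte pushes before each call, released together by ADD ESP,20
stack_bytes = 2 * 4 * 4

print(line_first, line_second, hex(string_addr),
      hex(target_first), hex(target_second), hex(stack_bytes))
```

The two line numbers come out as 2623 and 2624, both calls resolve to 0085C350, and the 32 bytes of pushed arguments match the 20 (hex) in ADD ESP,20.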

Quote
I know you've opened the file in a hex editor as I can clearly see, however in a hex editor the entire document will be squigglies (special characters) which are used to represent something else whether its short terms such as AND and XOR or long terms and mathematical calcuations such as get.player.info.object.location() +90.87.23 and movement.speed = 3.4 (just an example it's not actually in the document)

The "squigglies" are X86 machine code; the lowest level of program instructions that are directly executed by the CPU. Get a disassembler and it will turn it into assembly language like I posted above, which is a bit more readable.

Also, .bounds files are just bounding boxes for various bits of geometry. Not too exciting by themselves, though somewhat useful if you were building a map editor or something.

Quote
but the fact of the matter is, Wordpad (based on the codecs and programming libraries you have installed) will actually automatically convert some of this text into a legible format not comprised of special characters.

The only thing WordPad might do for you is decode UTF-8, but the COH client doesn't have any of that embedded in the exe, and the French and German message files are long gone. If you just want to look at string tables, you'd be better off grabbing a copy of the strings utility (link goes to a Windows port, pretty much all UNIX systems come with it already). It'll find and show just the ASCII, or Unicode text if you specify, filtering out all the binary portions.
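
If you'd rather see the idea than install anything, this is roughly what such a tool does for the ASCII case. It is a simplified sketch, not the real strings utility; the function name ascii_strings and the sample bytes are made up.

```python
import re

def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]

# Made-up sample: a few machine-code-looking bytes around two
# NUL-terminated strings, the way a string table sits inside an EXE.
sample = (bytes.fromhex("51a1e05d6801")
          + b"uiNet.c\x00"
          + bytes.fromhex("8bf8")
          + b"PromptTeamTeleport\x00"
          + bytes.fromhex("56"))

found = ascii_strings(sample)
print(found)
```

Short accidental runs (a stray "Q" or "]h" in the code bytes) fall under the minimum length and get dropped, which is how the binary portions are filtered out.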

Opening a pigg directly won't really get you anything except a list of filenames from the directory table. The files contained within are compressed -- you need a tool like PiggViewer to extract them.

Quote
as I said, I'm willing to bet that the server responds to the client with one of those predefined expressions. Some of them even have brackets; the brackets are either empty or labeled as Int (integer) or Bool (true or false switching operation)

The client/server protocol is binary. I'm using the colloquial form of binary to mean "not plain text", not literally using 1s and 0s to represent it. Most often binary data is viewed in hex for convenience.

Quote
Code: [Select]
@@CryptoPP@@V?$RSAPrivateKeyTemplate@V?$DecryptorTemplate@V?$OAEP@VSHA@CryptoPP@@V
Here's what Codewalker was talking about in another thread: every time the client connects to the server, the server tells the client "here's how I'm gonna encrypt the packets I send to you". Well, OK then, the server is using the CryptoPP method. The multiple listings of CryptoPP tell me that this is some type of externally developed encryption syntax. Getting ahold of a copy of CryptoPP would allow us to make our server encrypt its packets the same way CoH did, and also decrypt any packets that the client sends ;)

Except you're jumping to conclusions based on seeing some text in a string table instead of actually analyzing where it's used. CryptoPP is the C++ code I was talking about tracing through earlier. However, it's used only during initial login, to talk to the authentication server. The game protocol is encrypted using a different library -- a modified version of the original C reference implementation of Blowfish.
« Last Edit: May 27, 2013, 05:39:44 PM by Codewalker »

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: I can read dll
« Reply #57 on: May 27, 2013, 05:38:19 PM »
Quote
more stuff,
Code: [Select]
PromptTeamTeleport
it is obvious what would happen if a server file sent this to the client.

Except that the server never actually sends that to a client. The server sends a numerically coded message. The string "PromptTeamTeleport" exists in the client because it's a key that is used to look up a localized message from clientmessages-en.bin to show in the options menu:

Code: [Select]
$ pstring bin/clientmessages-en.bin PromptTeamTeleport
PromptTeamTeleport: Prompt Team Teleport
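
In other words, the string in the EXE is a lookup key, not a command. A minimal sketch of that pattern follows; the dictionary stands in for whatever is loaded from the message file, and the names MESSAGES_EN and localize are hypothetical, not taken from the client.

```python
# Stand-in for a loaded message table. Only this one key/value pair
# comes from the pstring output above; the structure is an assumption.
MESSAGES_EN = {
    "PromptTeamTeleport": "Prompt Team Teleport",
}

def localize(key, table):
    """Return the display text for a message key, or the key itself if missing."""
    return table.get(key, key)

label = localize("PromptTeamTeleport", MESSAGES_EN)
missing = localize("NoSuchKey", MESSAGES_EN)
print(label, missing)
```

Swapping in a different table is all it takes to change the language, which is why the keys stay in the code and the text lives in the message files.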

Quote
Code: [Select]
HideSearch
HideSG
HideFriends
HideGFriends
HideGChannels
HideTells
HideInvites
messages sent to the server from the client and saved to the character file (and from the server to the client when loading saved settings). obviously these deal with the hide menu in chat.

These are also all strings that are displayed in the options menu. They have nothing to do with client/server comms.

Actually, pretty much everything you've posted comes from the UI system and is a reference to a message that is displayed on-screen to the user.

Quote
Code: [Select]
----------------------------------------------------------------------
------- BEGIN PHYSICS STEPS ------------------------------------------
----------------------------------------------------------------------


------ %3d -----------------------------------------------------------------
    ** FLYING **
    ** NO ENT COLL **
ControlsInputIgnored, Jumping,
%4d. [%d/%dx]: id=%5d, cur_time=%dms, runTime=%dms (%dms)
        keys:      %s
        pos:       (%1.8f, %1.8f, %1.8f)
      + vel:       (%1.8f, %1.8f, %1.8f)
      + inpvel:    (%1.8f, %1.8f, %1.8f) @ %f
      + pyr:       (%1.8f, %1.8f, %1.8f)
      + misc:      grav=%1.3f, %s%s
      + move_time: %1.3f
      = newpos:    (%1.8f, %1.8f, %1.8f)
        newvel:    (%1.8f, %1.8f, %1.8f)
%s%s
----------------------------------------------------------------------
-------- END PHYSICS STEPS -------------------------------------------
----------------------------------------------------------------------

You've found some more strings used by debug code. The COH client has a lot of debugging stuff left over in it. The above chunk is filled with values and spit out on the console (run the client with -console on the command line to see it) every time you move if you use the /controldebug slash command, which normally requires client access level 9. There are ways around that, and nemerle has been using the output from it to figure out how some of the clientside physics works in order to implement something similar in SEGS. I don't know if his posts on the subject survived their forum crash.
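
To show what "filled with values" means: those %d and %1.8f markers are printf-style placeholders. Python's % operator accepts the same markers, so this sketch fills two of the format strings quoted above. The position and timing numbers are invented, not real client output.

```python
# Two format strings copied from the debug block above.
POS_FORMAT = "        pos:       (%1.8f, %1.8f, %1.8f)"
MOVE_TIME_FORMAT = "      + move_time: %1.3f"

# Invented sample values, purely for illustration.
position = (1.0, 2.5, -3.0)
move_time = 0.033

pos_line = POS_FORMAT % position
move_time_line = MOVE_TIME_FORMAT % move_time

print(pos_line)
print(move_time_line)
```

The strings in the EXE are just the templates; the interesting data only exists at run-time when the client substitutes the current values into them.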

Quote
how to tell the client to load steel canyon:

Code: [Select]
sceneLoad, SteelCanyon
you might also need
Code: [Select]
finishLoadMap or loadMap

Nope, the map loading code is buried in the guts of the network receive path. Fortunately there's a copy of a subset of it in the demo playback code. How Icon loads Steel Canyon:

CALL 0053AAD0            ; clears out old map
MOV EAX, OFFSET "maps/City_Zones/City_02_01/City_02_01.txt"
CALL 00534160            ; loads the map (fastcall using EAX)
« Last Edit: May 27, 2013, 11:13:30 PM by Codewalker »

Kyriani

  • Elite Boss
  • *****
  • Posts: 299
Re: I can read dll
« Reply #58 on: May 28, 2013, 03:15:49 AM »
Way to crush my hopes and dreams Codewalker =\

Here I was hoping someone found some way to automagically bring back my beloved COH ;_;

I demand that you use your coding powers to bring it back as recompense for destroying my hopes and dreams!

(this was all tongue in cheek but please do something if you can!)

Taceus Jiwede

  • Time Traveler
  • Elite Boss
  • *****
  • Posts: 978
Re: I can read dll
« Reply #59 on: May 28, 2013, 07:37:03 AM »
Codewalker that was really interesting and I feel like I learned something.  But now my head hurts.