Author Topic: Technical side discussion  (Read 164620 times)

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #400 on: November 08, 2017, 10:51:24 PM »
Code: [Select]
    Beacon =
        -------- Beacon 0 --------
        u080 = BasicBeacon
        u081 = 35.6909
I have no idea what that u081 number represents.

That means it's the 81st field in the file and that I didn't know what it was. When I was building the library that powers bindump, I numbered the fields u001, u002, u003, etc., the 'u' meaning "Unknown". Then as I figured out what each field was, the name was changed. Only the fields that didn't have a text name I could extract from the exe and didn't have an obvious purpose retained the 'unknown' designation.

Now I know that u080 and u081 are Name and Radius, respectively. Radius for a beacon depends on the context of what the beacon is used for, but generally just indicates the proximity in which the beacon is relevant.

Individual basic beacons are sometimes referenced multiple times at different absolute positions, so it seems that those can't actually be "absolute" positions for any given beacon.

There are no absolute coordinates in groupdefs, only relative to the parent group. The only place you'll find absolute coordinates in a map file is in the Refs section at the very end where groups are instanced into the scene graph.

How the paths run between beacons isn't immediately obvious.

They don't. Beacons are just location markers.

Any hints on how to make sense of the paths is appreciated.

If you're looking for an AI path network, you won't find one. NPC movement was all serverside, and whatever pathfinding info that it used does not exist in the client files.

Most beacons are for things like doors (the rationale for this is a little complicated, but door properties are assigned to beacons that were placed near the door rather than the door itself), navigation points for the minimap, special markers for things like a minimap swap when going indoors, script markers, etc.

Basically beacons are things that the devs manually placed on a map. AI pathfinding was most likely done through automatic analysis of the geometry, perhaps using some of the manual beacons as hints in tricky places.

The exception there is that a few maps have waypoint beacons for patrol routes where enemy groups followed a fixed path. There are also traffic beacons for cars (though most of these are part of the library pieces so you won't see them in bindump, but you can see them in Icon). Both of those are close enough together that they could probably be joined into a path network. However, they would only work in the places where those beacons exist, and wouldn't be generally useful for a bot to find its way around.

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #401 on: November 09, 2017, 12:04:31 AM »
I guess that saves me worrying about parsing the map file. If I want a bot(s) to move around a map I'll have to map it out myself ahead of time and create the paths for them to move along. NPCs on rails, it is!






slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #402 on: November 10, 2017, 11:43:45 PM »
So, today I've been exploring Icon with /map_dev and /mission_control turned on.

Neuromancer. Woah!




I'm revising my opinion about parsing the map files. There may not be a specific AI pathing mechanism in the maps but there are definitely some paths built into the maps and there are lots of markers that can be abused as path markers, especially in the main city zones.

For instance:



The big arrows in the street are directing traffic. The green upside-down tetrahedrons are marking an obstacle-free path that a running NPC can follow. The thin green arrows that are hard to see in that pic are marking a path for pedestrians to follow and the purple thing with the number on marks where the pedestrians change direction.



Here you can see the path up the stairs laid out in steps that prevent a NPC from accidentally sinking into the stairs.



Here the tetrahedrons light the way around the sidewalk that runs around City Hall.

Even in places where there aren't any paths, per se, there are LOTS of encounter spawn points and those spawn points can be repurposed into being "beacons" that can lead a bot past most nearby obstacles.

So, it seems that if I want to have some kind of general way of navigating any map in the game, that I'm going to have to learn to read map files, after all. I'm also going to have to come up with some quick way to search for "beacons" that are not too close, and not too far, and to make that "best beacon to aim for" computation in a reasonably short amount of time.

That's not too much to ask Rover to do, right?  :o



Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #403 on: November 11, 2017, 04:02:32 AM »
I'm revising my opinion about parsing the map files. There may not be a specific AI pathing mechanism in the maps but there are definitely some paths built into the maps and there are lots of markers that can be abused as path markers, especially in the main city zones.

Just to give you a heads-up about what you're in for, to get everything that you see in Icon you'll have to parse a lot more than just the map files.

Map files in COH are made up of groups, and references to those groups, forming a tree (graph). But some groups in maps refer to things that aren't in the map file. They refer to groups defined in what's called the "object library", which is a repository of prebuilt objects. Things like buildings, or even chunks of city blocks that map designers could place.

Some beacons are defined in the map itself, but others are part of library pieces. Probably the best example of this is those big green arrows on the roads. Most of those do not actually exist in the map file. Instead, there is a library piece that's a chunk of road AND the beacons. So when designers built the roads by placing those chunks, the beacons are there by default.

The spawn clusters are another example. Those are typically only 1 point in the map (and do not look like a beacon), but it's actually a reference to the entire group of all the spawn points.

That means if you want a FULL scene graph, you have to chase references all the way into the object library. At a minimum, that means loading defnames.bin, building an index, then loading the appropriate library file (fortunately these are the same geobin format as the maps) when a ref to one of the library pieces mentioned there is found in a map.
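A minimal sketch of chasing references down the tree and accumulating relative offsets into absolute positions. The dict layout and the names `flatten`, `map_groups`, and `library_groups` are hypothetical, not anything from bindump or the game. Rotation is left out entirely (translation only), so this would only be right for unrotated groups.

```python
from typing import Dict, Iterator, List, Tuple

Vec3 = Tuple[float, float, float]
# Hypothetical in-memory form: group name -> list of (child name, offset
# relative to the parent group). Real groupdefs also carry rotation.
GroupTable = Dict[str, List[Tuple[str, Vec3]]]


def flatten(name: str, origin: Vec3, map_groups: GroupTable,
            library_groups: GroupTable) -> Iterator[Tuple[str, Vec3]]:
    """Yield (group name, absolute position) for a group and all descendants."""
    yield name, origin
    # Look in the map file first, then fall back to the object library.
    children = map_groups.get(name)
    if children is None:
        children = library_groups.get(name, [])
    for child, (dx, dy, dz) in children:
        child_pos = (origin[0] + dx, origin[1] + dy, origin[2] + dz)
        yield from flatten(child, child_pos, map_groups, library_groups)


# Made-up sample data: a map group that places a library road chunk,
# which in turn carries its own beacon.
map_groups = {"grp_block": [("road_chunk", (10.0, 0.0, 5.0))]}
library_groups = {"road_chunk": [("traffic_beacon", (1.0, 0.0, 2.0))]}

# The starting position stands in for one Refs entry (the only absolute one).
points = list(flatten("grp_block", (100.0, 0.0, 200.0),
                      map_groups, library_groups))
```

In a real loader the library table would presumably be filled lazily from the index built out of defnames.bin, rather than held in one dict up front.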

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #404 on: November 14, 2017, 08:43:15 PM »
Hey, Arcana! Any chance you've got a function laying around that returns a list of every pre-defined map coordinate in a given map file that's within X world-units of the "current position"? No? Well, it never hurts to ask, heh.

So, thinking out loud, and basing on the "beacon" situation (where I'm using "beacon" to mean any map entry at all that can be used as a reference point for a bot, rather than meaning the things inside the map file that are explicitly labeled as beacons)  in a city map (Atlas Park) and a hybrid city/outdoor map (Croatoa) here's what I'm thinking I need to do:

Dump the scene graph into a SQL database of reference points. These would be the computed absolute positions of each point.
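A rough sketch of what that database and the "everything within X world-units" lookup could look like, using SQLite from Python. The table name `ref_points`, its columns, and the function `points_within` are all made up for illustration. The query first narrows by a bounding box in SQL, then applies the exact distance check in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ref_points (name TEXT, kind TEXT, x REAL, y REAL, z REAL)"
)
# Made-up sample rows; real rows would come from the flattened scene graph.
conn.executemany(
    "INSERT INTO ref_points VALUES (?, ?, ?, ?, ?)",
    [
        ("Omni/NAV", "beacon", 10.0, 0.0, 10.0),
        ("tree", "obstacle", 12.0, 0.0, 9.0),
        ("far", "beacon", 500.0, 0.0, 500.0),
    ],
)


def points_within(conn, pos, radius):
    """Return rows whose absolute position lies within `radius` of `pos`."""
    px, py, pz = pos
    rows = conn.execute(
        "SELECT name, kind, x, y, z FROM ref_points "
        "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? AND z BETWEEN ? AND ?",
        (px - radius, px + radius, py - radius, py + radius,
         pz - radius, pz + radius),
    ).fetchall()
    r2 = radius * radius
    # The box is a cheap pre-filter; this is the exact spherical test.
    return [r for r in rows
            if (r[2] - px) ** 2 + (r[3] - py) ** 2 + (r[4] - pz) ** 2 <= r2]


nearby = points_within(conn, (11.0, 0.0, 10.0), 5.0)
```

An index on the coordinate columns would likely help on a full city map, but whether it matters depends on how many points there turn out to be.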

Label each reference point as a "beacon" or an "obstacle". For now, I'm not futzing over the collision box of the obstacle but if the label of the obstacle is helpful enough to include a measurement in its name then I might save that data as well. Probably I'll want to label each reference point with a type that indicates the originally intended purpose of the reference point, to be used to "weight" actual Omni/NAV reference points higher than others.

Is it useful to know the facing (yaw) of obstacles? Maybe. Walls, certainly; not so much other kinds of obstacles. Hmmm... I think I might need to know the size of the collision boxes after all. How hard is it to pull that information out of the .geo files?

A tick of movement conceptually would go like this:

Compute a "movement unit vector" that is the unit vector of destination - current_position.
Compute a "tick vector" that is Ux,Uy,Uz + Vx,Vy,Vz
Select a list of all NAV points that lie along the "tick vector".
For each NAV point, compute a seek steering force that moves toward that NAV point. The result is a "tick vector" that lies between all nearby NAV points.
If there are no nearby NAV points (that is, map objects named "Omni/NAV") run the check again for non-NAV "beacons" - spawn points, mainly, but possibly other kinds of ref points as well.
Select a list of all obstacles that lie along the current "tick vector". These are primarily objects like bushes, trees, rocks, walls, fences, and buildings.
For each obstacle along the "tick vector", compute an avoidance steering force. This is where collision boxes might be necessary, to modify the magnitude of the steering force in order to clear the obstacle. The result should be a "tick vector" that follows an "optimal" path between all nearby reference points.

In an ideal world, this means that NAV points (or any point being utilized as a "movement beacon") are "pulling" Rover towards themselves while obstacles are "pushing" Rover away from themselves.
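The pull/push idea above can be sketched roughly like this. Everything here is an assumption for illustration: the function name `tick_direction`, the equal weighting of NAV points, the `avoid_radius` cutoff, and the linear falloff of the avoidance force. It treats obstacles as points, so collision boxes are not modelled.

```python
import math
from typing import Iterable, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def length(a: Vec3) -> float:
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def unit(a: Vec3) -> Vec3:
    n = length(a)
    return (0.0, 0.0, 0.0) if n == 0.0 else scale(a, 1.0 / n)


def tick_direction(pos: Vec3, dest: Vec3, nav_points: Iterable[Vec3],
                   obstacles: Iterable[Vec3],
                   avoid_radius: float = 10.0) -> Vec3:
    """Blend seek forces (destination, NAV points) with avoidance forces."""
    force = unit(sub(dest, pos))              # movement unit vector
    for nav in nav_points:                    # each NAV point "pulls"
        force = add(force, unit(sub(nav, pos)))
    for obs in obstacles:                     # each nearby obstacle "pushes"
        away = sub(pos, obs)
        dist = length(away)
        if 0.0 < dist < avoid_radius:
            # Push harder the closer the obstacle is (linear falloff).
            weight = (avoid_radius - dist) / avoid_radius
            force = add(force, scale(unit(away), weight))
    return unit(force)


heading = tick_direction((0.0, 0.0, 0.0), (10.0, 0.0, 0.0),
                         nav_points=[(5.0, 0.0, 1.0)],
                         obstacles=[(2.0, 0.0, -1.0)])
```

Multiplying `heading` by the distance covered per tick would give the actual step. Weighting Omni/NAV points above other reference points would just mean scaling their seek term.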

This still doesn't entirely solve the problem of sloping terrain but the ref points in the map files are mostly at ground level already and Rover can include some filters that drop ref points that are clearly set at a higher delta-Y than, say, the height of a small hill or a flight of stairs. I think we can pretty much ignore the idea of Rover ever navigating a nightmare like The Hollows.


« Last Edit: November 14, 2017, 08:50:30 PM by slickriptide »

Arcana

  • Sultaness of Stats
  • Elite Boss
  • *****
  • Posts: 3,672
Re: Technical side discussion
« Reply #405 on: November 15, 2017, 04:08:15 AM »
Hey, Arcana! Any chance you've got a function laying around that returns a list of every pre-defined map coordinate in a given map file that's within X world-units of the "current position"? No?

No.

It is fair to say that I got very far in parsing files related to powers.  I got pretty far in parsing files related to animations.  I got a headache trying to parse geometry, and it wasn't helpful for the parts of the game I was actively contributing my expertise to.

Geometry is not intrinsically hard.  In fact, I can recognize what is going on in broad strokes, and I'm familiar enough with the principles of 3D computer geometry to not be totally lost.  But it is a Matryoshka doll of things within things within things that would be extremely time consuming to untangle.

Quote
So, thinking out loud, and basing on the "beacon" situation (where I'm using "beacon" to mean any map entry at all that can be used as a reference point for a bot, rather than meaning the things inside the map file that are explicitly labeled as beacons)  in a city map (Atlas Park) and a hybrid city/outdoor map (Croatoa) here's what I'm thinking I need to do:

I see two solutions to your problem: you can parse geometry, or you can cheat.

Parsing geometry involves creating a space-map from the geometry and object files.  Straightforward in principle, extremely tedious in detail.

Cheating involves predetermining where you want your bot to be able to travel, then walking the terrain in Icon or something and using demorecord to capture your path.  Then feed that data to the bot and program the bot to interpolate between your recorded coordinates rather than attempt to figure out where the ground and obstacles are.  You would be in effect tracing the ground with your feet and recording where the ground is using a demorecord.
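For the "cheat" route, interpolating along a recorded path might look like the sketch below. It assumes the recording has already been reduced to a plain list of (x, y, z) positions; parsing an actual demorecord file is not shown, and `position_along` is a made-up name. Needs Python 3.8+ for `math.dist`.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def position_along(path: List[Vec3], distance: float) -> Vec3:
    """Return the point `distance` units along a recorded polyline."""
    if not path:
        raise ValueError("path is empty")
    remaining = max(distance, 0.0)
    for a, b in zip(path, path[1:]):
        seg = math.dist(a, b)
        if seg > 0.0 and remaining <= seg:
            t = remaining / seg   # fraction of the way from a to b
            return (a[0] + (b[0] - a[0]) * t,
                    a[1] + (b[1] - a[1]) * t,
                    a[2] + (b[2] - a[2]) * t)
        remaining -= seg
    return path[-1]               # ran past the end: stay at the last point


# Made-up recording: walk 10 units along x, then up a ramp.
recorded = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 2.0, 10.0)]
halfway_first_leg = position_along(recorded, 5.0)
```

Because the Y values come from where the recorder actually stood, the bot follows the ground without knowing anything about the geometry.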

If you're going with option one, I would suggest ignoring all of the detail and just starting with the ground.  You're a ghost bot, and will walk through walls and assorted shrubbery.  But just getting the ground right is a huge step forward.

Plus, I don't think City of Heroes itself consistently got this right either.  Some of the moving background objects did do crazy things like walk up nonexistent stairs or pass through mailboxes.  Collision detection is expensive and I don't think the game itself did it perfectly all the time for every object.  Sometimes it just did what it thought it was supposed to do, which wasn't quite right in the actual game world.

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #406 on: November 15, 2017, 03:02:02 PM »
For city zones, the list of Omni/NAV beacons is a pre-existing "cheat sheet" where the bread crumbs track the paths around all of the sidewalks and the absolute position of each beacon is easily computed. I've hand-computed a few and double-checked the results in Icon.

I suppose for starters what I should do is just load up Atlas Park's nav beacons and instruct Rover to execute a loop where he perpetually runs from one beacon to another and see where he goes. Connect the dots, in essence. That would be pretty straightforward.

If I want to lay out a pre-determined path it's even easier than using Icon - I tell Rover to heel, then I walk the path in Paragon Chat and at every "checkpoint" give Rover a command to record his current position. Voila!

That sounds like a plan for "next steps".

  • Create a database of sidewalk beacons for Atlas Park
  • Teach Rover to "move forward to the next nearest beacon"
  • Define a set of "path beacons" that define a complete hard-coded path.
  • Teach Rover to create, edit, and delete paths.
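The second item, "move forward to the next nearest beacon", could be sketched as a greedy connect-the-dots walk. This is only an illustration: `next_beacon`, `walk`, and the `min_dist` threshold are invented here, and beacons are assumed to be plain (x, y, z) tuples of absolute positions. Needs Python 3.8+.

```python
import math


def next_beacon(pos, beacons, visited, min_dist=1.0):
    """Pick the closest beacon not yet visited and not already underfoot."""
    candidates = [b for b in beacons
                  if b not in visited and math.dist(pos, b) >= min_dist]
    return min(candidates, key=lambda b: math.dist(pos, b), default=None)


def walk(start, beacons):
    """Greedy loop: keep running to the nearest unvisited beacon."""
    route, pos, visited = [], start, set()
    while (nxt := next_beacon(pos, beacons, visited)) is not None:
        route.append(nxt)
        visited.add(nxt)
        pos = nxt
    return route


# Made-up beacon positions, not real Atlas Park data.
beacons = [(0.0, 0.0, 5.0), (0.0, 0.0, 12.0), (30.0, 0.0, 0.0)]
route = walk((0.0, 0.0, 0.0), beacons)
```

A greedy walk like this can cut corners through buildings when the nearest beacon is on the far side of a wall, so it is more of a "see where he goes" experiment than a pathfinder.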

That will be good enough movement functionality to move on to creating a simple scripting engine for Rover.
« Last Edit: November 15, 2017, 03:42:28 PM by slickriptide »

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #407 on: November 15, 2017, 05:00:18 PM »
If you want just a very rough idea of the terrain and are willing to build the whole scene graph (including the object library), one thing you could do is load the .bounds files instead of trying to parse geometry.

Those are in Parse6 (bin) format as well and contain the bounding boxes for library objects.

The min and max corners of the bounding box are u001 and u004 in bindump, respectively. Other interesting fields are u011, a (calculated) spherical radius which might be useful if you're using an avoidance algorithm rather than collision, and u015, the flags bitfield. The interesting flags there are bit 25 (0x2000000), which indicates an object that is invisible except in developer mode and should be ignored, and bit 23 (0x800000), which indicates an entire object set to not collide.
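Testing those two flag bits is simple enough to sketch. The constant names and the `Bounds` record are invented for this example; only the bit positions and the field numbers in the comments come from the post above.

```python
from dataclasses import dataclass
from typing import Tuple

FLAG_EDITOR_ONLY = 1 << 25   # 0x2000000: invisible except in developer mode
FLAG_NO_COLLIDE = 1 << 23    # 0x800000: whole object set to not collide


@dataclass
class Bounds:
    """Hypothetical holder for the fields of interest from a .bounds entry."""
    min_corner: Tuple[float, float, float]   # u001
    max_corner: Tuple[float, float, float]   # u004
    radius: float                            # u011
    flags: int                               # u015


def is_obstacle(b: Bounds) -> bool:
    """True if the object should take part in avoidance or collision."""
    return not (b.flags & (FLAG_EDITOR_ONLY | FLAG_NO_COLLIDE))


# Made-up sample entries.
crate = Bounds((0.0, 0.0, 0.0), (2.0, 2.0, 2.0), 1.8, 0)
marker = Bounds((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.9, FLAG_EDITOR_ONLY)
solid = [b for b in (crate, marker) if is_obstacle(b)]
```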

As every map is made out of pieces from the library, you could build a rough collision model from the bounding boxes. For slanted surfaces it would work if the object was rotated by the map designer (you'd just rotate the box), but it wouldn't be able to handle it if the slant is built into the geometry itself. i.e. things like the tech ramps, or the block around city hall, where the slant is part of a larger object.

For that you'd have to open the can of worms of loading the geos (iirc there are 6 different 'versions' of the geo format that need to be supported), and then loading the tricks to get texture flags and figure out which surfaces are collidable, and so on...

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #408 on: November 16, 2017, 12:01:11 AM »
Oh, bounds boxes in a readable bin format could make things doable!

Going back to the whole "there'll never be an Issue 25" thing, the "rendering" of the scene graph only has to be done once and stored in a database. It's not something that Rover has to know how to do. All he needs to know is how to analyze the portions of the current map's scene graph that are nearby at any given moment.

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #409 on: November 16, 2017, 07:49:35 PM »
Wellllll... given the Pocket D AE wing expansion and the (seasonal) mysterious floating island in Croatoa, I'm not sure that's the best assumption to make.

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #410 on: November 17, 2017, 04:22:06 PM »
Wellllll... given the Pocket D AE wing expansion and the (seasonal) mysterious floating island in Croatoa, I'm not sure that's the best assumption to make.

Heh. Well, given that pre-rendering will work (assuming it works) for 99.9% of the maps in the game, I'm willing for now to deal with the risk that a map might have to be re-rendered once in a while.

I want to keep Rover's overhead as light-weight as possible, given that he's running in an interpreted environment and he already has a slow clock. (Though, I'm not certain he SHOULD have a slow clock - I might have to investigate whether that's correctable. Arcanabot was running in Python and happily doing 30fps.)

Then again, it's possible that I'm exaggerating the extent of the memory footprint of Rover carrying all of the path nodes around in his head. Hmm... I suppose I should at least experiment with that approach rather than dismiss it outright.


Golden Aurora

  • Boss
  • ****
  • Posts: 108
Re: Technical side discussion
« Reply #411 on: November 17, 2017, 08:10:05 PM »
Perhaps it should be based on how intensive (long) the process is for processing the maps.
If it's only a few seconds, that could plausibly be done once on bot load (and cached for some short duration).
If it's a few minutes, that probably should be done once and exported to a file for the bot to use.
Maybe make a util that updates the file after execution.

But either way, the code for processing all that has to be made first.
The best use might be a bit hard to discover until after the fact.

Just the thoughts off the top of my head. All this stuff really interests me.

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #412 on: November 20, 2017, 03:39:18 PM »
Perhaps it should be based on how intensive (long) the process is for processing the maps.

It will take some experimenting to determine the best approach. I may have to table a lot of the testing until after the holidays.

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #413 on: August 21, 2018, 03:16:59 PM »
I prefer to avoid re-inventing the wheel when a perfectly good wheel already exists, so I'm just going to ask - Is there an algorithm that illustrates how to apply the schemas to the game data files?

For instance - the geobin.xml schema shows the data structures, without acknowledging the file header and its various "magic" values or any mechanism for determining the length of the dynamic arrays in the structures. It sort of looks like you process a def by loading a four-byte value for an <entry/> and if the value is non-zero then you treat it as the beginning of an array element and you proceed to load array elements until you load a zero-filled element; or at least one where the first four bytes are zero-filled. It kind of looks like there are some filler bytes in there, so the length of the array may be encoded and I'm just not seeing it yet.

So, I figure I can spend a couple of days futzing around with Kaitai or I can just ask the expert.

And, if this ends up boiling down to "maybe you should just parse the output of bindump because it understands ALL of the old formats and not just parse6" then the next question will be, "Can bindump have an option added to output xml?"

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #414 on: August 23, 2018, 06:00:59 PM »
After some time spent with the geobin files and geobin.xml, I've concluded the following:

There's a magic header.

The data is stored "little endian".

An array of structs begins with a four-byte index showing the number of records. If the index is zero, the array is empty. Each individual struct in the array begins with a four-byte size followed by the data for that struct.

Strings, aside from the magic values in the file header, are stored as a two-byte length parameter, the string value, AND a null-char terminator (even though you don't strictly need the null terminator given that you know the length already).  This might mean an empty string would be three bytes: 2 for the length of zero, and another for the requisite terminator.

INT is a four-byte integer. U8 and F32 indicate the number of bits respectively. Enum values are U16, though there's no reason they couldn't be signed.

Unless I made an error in the above, that looks like enough info to read a geobin file (and probably any other file that Codewalker has provided a schema for). I'm going to feed the schema plus the various extra bits noted above into a Kaitai schema and see if it spits out a parser that produces the same info as bindump. If it does, then I've got my data extractor and the potential for the bot itself to read a map file if that's a useful thing to do.

***EDIT***

There might be some question as to how DEGREES is represented. I'd assumed it was F32 but if it's a series of bit-coded quaternions or something arcane like that then it might actually be something that needs to be examined closer. (That almost sounds like I actually understand what quaternions are, heh.)

***EDIT MORE***
I'm not sure what's going on with the Omni's. Sample dump:
Code: [Select]
A0 00 00 00 0D 00 67 72 70 6C 69 74 65 31 34 32
31 32 39 00 01 00 00 00 2C 00 00 00 0E 00 4F 6D
6E 69 2F 5F 4C 69 67 68 74 44 69 72 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 01 00 00 00 14 00 00 00 3A 00 00 00
3A 00 00 00 3A 00 00 00 6C E0 EE 41 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The omni in question is from grplite142129 in City_01_01.bin.
I read this as:

"01 00 00 00" is one omni record.
"14 00 00 00" is a record size of 20 bytes.
"3A 00 00 00" is fixed array element 0 aka u060
"3A 00 00 00" is fixed array element 1 aka u061 (I can see this might not have been the best example to use)
"3A 00 00 00" is fixed array element 2 aka u062
"6C E0 EE 41" is float value aka u063
"00 00 00 00" following all that is presumably the flag enum, aka Flags

I've got some issues here.
First, is that geobin.xml specs those array elements as U8 when they appear to actually be stored as U32.
Second, that "fixed" arrays don't seem to have a size element. Since they're "fixed",  it seems to be presumed that anyone reading it already knows its size.
Finally, even though Flags enumerations appear to be specced as U16, it must also be stored as U32 because otherwise the record size doesn't pan out.
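As a sanity check on that reading, here are the same 28 bytes from the dump unpacked with Python's `struct` module, treating every integer as a little-endian U32. The variable names are only labels for this example.

```python
import struct

# The tail of the sample dump, starting at the omni record count.
raw = bytes.fromhex(
    "01000000"   # one omni record
    "14000000"   # record size
    "3A000000"   # u060
    "3A000000"   # u061
    "3A000000"   # u062
    "6CE0EE41"   # u063 (float)
    "00000000"   # Flags
)

count, size = struct.unpack_from("<II", raw, 0)
u060, u061, u062, u063, flags = struct.unpack_from("<IIIfI", raw, 8)

# Three U32s, one F32 and one U32 add up to the stated record size.
payload_size = struct.calcsize("<IIIfI")
```

The payload works out to 20 bytes, matching the size field, which is consistent with the U8-specced elements and the Flags enum each taking a full four bytes on disk.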

I guess I need to take a close look at the raw file to see if all of the U8-specced values in the schema are actually stored as full-size U32 values in the data file.

« Last Edit: August 23, 2018, 07:00:44 PM by slickriptide »

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #415 on: August 24, 2018, 02:40:27 AM »
I prefer to avoid re-inventing the wheel when a perfectly good wheel already exists, so I'm just going to ask - Is there an algorithm that illustrates how to apply the schemas to the game data files?

I don't think there's one. From time to time I've considered tossing bindump up on github or something, but tbh it's a huge pain in the rear to build. Have to find the right boost version for starters, and the python-based code generator for the schema readers needs some extra dependencies to run. It needs a little work just to compile on something like Linux and a LOT of work to compile on something like Windows.

There's a magic header.

The data is stored "little endian".

An array of structs begins with a four-byte index showing the number of records. If the index is zero, the array is empty. Each individual struct in the array begins with a four-byte size followed by the data for that struct.

Strings, aside from the magic values in the file header, are stored as a two-byte length parameter, the string value, AND a null-char terminator (even though you don't strictly need the null terminator given that you know the length already).  This might mean an empty string would be three bytes: 2 for the length of zero, and another for the requisite terminator.

Sounds about right.

There might be some question as to how DEGREES is represented. I'd assumed it was F32 but if it's a series of bit-coded quaternions or something arcane like that then it might actually be something that needs to be examined closer. (That almost sounds like I actually understand what quaternions are, heh.)

It's just an F32, but it's stored in radians. So multiply by 180/π.
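Converting a stored radian value to degrees is a multiply by 180/π. A tiny sketch (the function name is made up):

```python
import math


def stored_to_degrees(raw_value: float) -> float:
    """Convert a radian value read from a bin file to degrees."""
    return raw_value * 180.0 / math.pi


quarter_turn = stored_to_degrees(math.pi / 2)
```

`math.degrees` from the standard library does the same conversion.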

Quaternions are used in the network protocol over the wire, but not in any data files. Well, they're in .anim files, but nothing in Parse6 bin format.

First, is that geobin.xml specs those array elements as U8 when they appear to actually be stored as U32.

All integers in bin files are stored as 32-bit integers. The U8 and INT16 describe the *in-memory* structure and also give a hint as to the maximum range, but when serialized to bin files are always 32 bits.

Everything in a bin file should be 4-byte aligned. Even strings need to be padded if they don't end on a 4-byte boundary.
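Putting those rules together, a primitive reader might look something like this sketch. Assumptions: the class name `BinReader` is invented, string bytes are decoded as Latin-1 (the actual encoding is not stated anywhere in this thread), and padding is computed relative to the start of the buffer, which only works if the buffer begins at the start of the file or at some other 4-byte boundary.

```python
import struct


class BinReader:
    """Minimal little-endian reader for 4-byte-aligned bin data."""

    def __init__(self, data: bytes, pos: int = 0):
        self.data = data
        self.pos = pos

    def u32(self) -> int:
        # Every integer is serialized as 32 bits, whatever the schema type.
        (value,) = struct.unpack_from("<I", self.data, self.pos)
        self.pos += 4
        return value

    def f32(self) -> float:
        (value,) = struct.unpack_from("<f", self.data, self.pos)
        self.pos += 4
        return value

    def string(self) -> str:
        # Two-byte length, then the bytes, then padding to a 4-byte boundary.
        (length,) = struct.unpack_from("<H", self.data, self.pos)
        start = self.pos + 2
        text = self.data[start:start + length].decode("latin-1")
        self.pos = (start + length + 3) & ~3
        return text


# Sample bytes shaped like the dump above: a 13-char name, one pad byte,
# then a U32 count.
sample = b"\x0d\x00grplite142129\x00" + struct.pack("<I", 1)
reader = BinReader(sample)
name = reader.string()
count = reader.u32()
```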

Second, that "fixed" arrays don't seem to have a size element. Since they're "fixed",  it seems to be presumed that anyone reading it already knows its size.

Yes, fixed arrays have a set number of elements. It should be in the XML file, right?

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #416 on: August 24, 2018, 03:00:55 AM »
Bit of context here, the bin format is not the "true" format, but is the one we have easy access to. The xml files I posted are extracted and converted from binary tables in the client exe -- tables that are used by what's called the textparser.

The game is natively designed to load 95% of its data from a bunch of plain text files (hence textparser) on a developer's machine. This is a structured format that can vary a bit depending on the particular schema, but generally has a consistent set of rules. A good example is costume files. These are stored in the text format and the textparser is used to load and save them. Take a look at the costume.xml schema file while viewing a saved costume in a text editor.

Loading thousands of text files isn't particularly conducive to a published game (and is rather slow), so whenever a text file is loaded by textparser, it can optionally be persisted in a binary format -- the latest and last version of that being Parse6. The internal layout of these binary files didn't even stay the same from version to version; they're just temporary storage generated by textparser as a byproduct and were not the "main" data format -- but ARE what got published and the form that we have as a result.

So the binary tables the schema XML was built from actually describe 3 things simultaneously, in order of descending importance:

  • The format of the master text files for all of the game data.
  • The in-memory layout of structures that those text files get loaded into. This is especially handy for me because it makes hot-patching stuff in memory easier if this is known.
  • The format of the binary files that the text data gets persisted into.

As a result of this, the textparser is also used to load bin files directly when the text files aren't available (or are not any newer than the cached bins). So it may sound like a misnomer when I talk about the textparser tables being used for loading bin files, but it actually makes sense in the full context.

Codewalker

  • Hero of the City
  • Titan Network Admin
  • Elite Boss
  • *****
  • Posts: 2,740
  • Moar Dots!
Re: Technical side discussion
« Reply #417 on: August 24, 2018, 03:18:07 AM »
The "magic" header:

8 bytes: "CrypticS"
4 bytes: CRC32 of the textparser table [1]
2 byte string length + string: bin file version (i.e. "Parse6")

Then the filelist[2] header:
2 byte string length + string: "Files1"
4 bytes filelist size
4 bytes number of filelist entries
Each entry:
    2 byte string length + string: filename
    4 bytes: 32-bit unix timestamp - file modify time

----- Actual data starts here -----


[1] This is the CRC32 of the binary table used by textparser to describe the data structure. It's used to determine when the table has changed from one version to the next and a bin file needs to be regenerated. However, if the text file isn't present, it will still attempt to load a bin with an incorrect value here - which may result in a crash if the format isn't really what the code is expecting.

[2] The filelist is just a list of text files that this particular bin file was built from. This isn't needed at all for loading data from bins and can be skipped. If both text and bin versions of the data are available, textparser will use this to determine if the bin file is up to date, or if it needs to be regenerated from the text files because some of them have changed.
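The layout above could be read with something like the following sketch. Assumptions are labelled in the comments: strings are taken to be padded to a 4-byte boundary and decoded as Latin-1, and since it isn't stated exactly which bytes the "filelist size" field covers, the sketch walks the entries by count instead of trusting the size. `read_string`, `pack_string`, and `read_header` are invented names, and `pack_string` exists only to build sample data.

```python
import struct
from typing import Tuple


def read_string(data: bytes, pos: int) -> Tuple[str, int]:
    """Read a length-prefixed string; return it and the next aligned offset."""
    (length,) = struct.unpack_from("<H", data, pos)
    start = pos + 2
    text = data[start:start + length].decode("latin-1")   # assumed encoding
    return text, (start + length + 3) & ~3                # assumed padding


def pack_string(text: str) -> bytes:
    """Build a padded string (used only to fabricate the sample below)."""
    raw = struct.pack("<H", len(text)) + text.encode("latin-1")
    return raw + b"\x00" * (-len(raw) % 4)


def read_header(data: bytes) -> dict:
    if data[:8] != b"CrypticS":
        raise ValueError("missing CrypticS magic")
    (crc,) = struct.unpack_from("<I", data, 8)
    version, pos = read_string(data, 12)
    tag, pos = read_string(data, pos)
    if tag != "Files1":
        raise ValueError("missing Files1 filelist header")
    _size, count = struct.unpack_from("<II", data, pos)   # size is ignored
    pos += 8
    files = []
    for _ in range(count):
        name, pos = read_string(data, pos)
        (mtime,) = struct.unpack_from("<I", data, pos)    # unix timestamp
        pos += 4
        files.append((name, mtime))
    return {"crc": crc, "version": version, "files": files, "data_start": pos}


# Fabricated sample: the CRC, filename and timestamp are made up, and the
# filelist size is a placeholder 0 because read_header never uses it.
sample = (b"CrypticS" + struct.pack("<I", 0x12345678)
          + pack_string("Parse6") + pack_string("Files1")
          + struct.pack("<II", 0, 1)
          + pack_string("maps/a.txt") + struct.pack("<I", 1510000000)
          + b"DATA")
header = read_header(sample)
```

`header["data_start"]` is then the offset where the schema-described data begins.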
« Last Edit: August 24, 2018, 03:29:04 AM by Codewalker »

Tahquitz

  • Titan Staff
  • Elite Boss
  • ****
  • Posts: 1,859
Re: Technical side discussion
« Reply #418 on: August 24, 2018, 04:36:16 AM »
 ??? (Tahquitz is now dizzy and goes back to his PHP work...)
"Work is love made visible." -- Khalil Gibran

slickriptide

  • Elite Boss
  • *****
  • Posts: 356
Re: Technical side discussion
« Reply #419 on: August 24, 2018, 03:43:17 PM »
??? (Tahquitz is now dizzy and goes back to his PHP work...)

Heh. Trust me, I feel the same way. This project has frequently put me in the position of the competent-but-not-particularly-gifted carpenter who says, "I need a house to live in, but there's nobody here to build it for me. How hard can it be?" without really considering the ramifications of wiring, plumbing, and building codes. Thankfully, people like Codewalker and Arcana are around. If I see further, it's because I've stood on the shoulders of giants and all that.

Everything in a bin file should be 4-byte aligned. Even strings need to be padded if they don't end on a 4-byte boundary.

Thanks for that confirmation. I'd begun to suspect that was the case after I ran into a whole series of property entries that seemed to have unexpected null bytes in between the strings and the following data. After seeing some entries where a number immediately followed a string without any null bytes at all, I'm thinking now that the "unnecessary" null terminator I mentioned on the strings is actually a case of byte padding and not a "terminator" at all.

Likewise - Thanks for the file header format. I'm especially happy to learn that the file list is basically noise. I saw one costume bin, I think it was, that had a big file list in the header and I wasn't at all sure what to make of it in comparison to something like the Atlas Park map bin.

That all makes me curious, though - Does the textparser still look for those original text files when the client launches? If someone was able to recreate a developer working directory, could you load your own versions of those text files?

I've mentioned Kaitai a couple of times. It's a parser-builder that sounds like maybe it's similar to whatever Boost is. I'm taking the geobin schema plus whatever extra details I've learned from examining the files and the info you've provided to write a schema that Kaitai uses to spit out a parser in any of a dozen or so different programming languages/environments. Analyzing game assets seems to be one of the bigger uses of it. The website is https://kaitai.io if you're curious.

One thing I'm NOT doing is writing a virtual file system. I just dumped everything out of the pigg files into a corresponding directory structure on disk. Most of the data that interests me from the bin files is stuff I intend to put into a SQL database anyway, so I don't see any mileage in trying to act like a pigg viewer or the ACTUAL game client on top of everything else.