Author Topic: Large map performance bottleneck  (Read 18416 times)
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« on: November 24, 2009, 10:42:27 AM »

I played around with the map editor yesterday to create a template for the ground level map of the techdemo. The map was meant to be 250*250m (= 250*250 tiles) in size. That makes 62500 ground tiles. Thanks to beliar's little script, the tiles themselves were easy to add.

When I tested the map in the editor and ingame I found that performance was really sluggish. It seems that such a large number of map instances seriously affects the performance of PARPG. The bottleneck seems to be FIFE-related, but if we can provide some profiling data, chances are good that a FIFE developer will take a look at it and improve the performance.

For details on how to test large map performance yourself and how to profile FIFE, check out this ticket:
http://parpg-trac.cvsdude.com/parpg/ticket/197

Does anyone have profiling experience on Linux and would like to take a look into it?

It's highly unlikely that a FIFE dev will have the time to look into it before we want to ship techdemo 1 but if we provide the profiling data, performance should get improved in the long run nevertheless.

For the techdemo itself we might need to downscale the map size for now to avoid sluggish performance. Maybe a 125*125 ground tiles map would do it as well? That would be 1/4 of the number of instances of the 250*250m map. What do you think?
Logged
amo-ej1
Community member

Posts: 80


elie@de-brauwer.be
View Profile
« Reply #1 on: November 24, 2009, 03:11:53 PM »

I have some experience with this stuff. I'll take a look at it soon and post my findings in trac and here.
Logged
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #2 on: November 24, 2009, 03:12:35 PM »

Thanks a bunch amo-ej1 :-) That's really appreciated.
Logged
Kaydeth
Community member

Posts: 185



View Profile Email
« Reply #3 on: November 24, 2009, 11:21:05 PM »

Could you post the map file (and maybe some instructions on how to swap it into the game) so that we can see how this affects other hardware setups?
Logged
zenbitz
Community member

Posts: 1164



View Profile
« Reply #4 on: November 25, 2009, 01:28:24 AM »

just replace the map name in settings.xml

I did a quick and dirty hack of the 250 x 250 into a 125 x 250 and FPS improved from ~20 to ~40.  FPS on the map.xml file is roughly 80 on my MacBookPro.

Seems to be scaling roughly linearly with number of tiles... which is probably a bug.
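
A quick way to check the "roughly linear" reading is to turn the FPS figures quoted in this thread into a cost per tile. This is only a sketch: the function names are made up for illustration, and the FPS values are the approximate ones reported above, not fresh measurements.

```python
# Figures quoted in this thread: (tile count, approximate FPS).
measurements = [
    (250 * 250, 20.0),  # full 250 x 250 map
    (125 * 250, 40.0),  # halved 125 x 250 map
]


def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def cost_per_tile_us(tiles, fps):
    """Frame time spread evenly over all tiles, in microseconds."""
    return frame_time_ms(fps) * 1000.0 / tiles


for tiles, fps in measurements:
    # If the per-tile cost stays constant, frame time grows linearly with tiles.
    print(tiles, frame_time_ms(fps), cost_per_tile_us(tiles, fps))
```

If the per-tile cost comes out about the same for both maps, frame time is proportional to the total number of tiles on the map rather than to the number of tiles on screen.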

Logged

We are not denying them an ending...
We are denying them a DISNEY ending - Icelus
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #5 on: November 25, 2009, 02:15:48 AM »

Quote from: Kaydeth
could you post the map file (and maybe some instructions on how to swap it into the game?) so that we can see how this affects other hardware setups?

Instructions can be found here:
http://parpg-trac.cvsdude.com/parpg/ticket/197
Logged
amo-ej1
Community member

Posts: 80


elie@de-brauwer.be
View Profile
« Reply #6 on: November 25, 2009, 08:51:22 AM »

For my setup:
* Original map file loads in 500 msec, runs at >150 fps @1024x768 (windowed and fullscreen)
* Large map file loads in 4500 msec, runs at 20 fps @1024x768 (windowed and fullscreen)

(While looking at it I'll also add a profiling article in the wiki.)
Logged
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #7 on: November 25, 2009, 02:25:04 PM »

Coolio, that's appreciated amo-ej1. Keep us updated about your findings.
Logged
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #8 on: November 25, 2009, 10:45:13 PM »

Python script to convert the output of a bunch of different profiling tools into a dot graph:
http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Needs Python and graphviz installed. Might actually be useful for our purposes; kudos to prock who came up with the pointer.
Logged
shevegen
Community member

Posts: 705



View Profile
« Reply #9 on: November 26, 2009, 01:07:41 PM »

Has any FIFE dev had a look into this already, by the way?
Logged

Cleaning away the bureaucracy in PARPG to make our life easier.
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #10 on: November 27, 2009, 12:21:08 PM »

As said above: I wouldn't pester them before we can provide any profiling information.
Logged
amo-ej1
Community member

Posts: 80


elie@de-brauwer.be
View Profile
« Reply #11 on: November 28, 2009, 04:46:59 PM »

Okay, step one has been taken care of. I've taken the liberty of writing a technical document on when/how/why to profile, applied to PARPG. It can be found on the wiki at http://wiki.parpg.net/Profiling_PARPG (feedback is welcome, patches are something you can do yourself ;-) ).

All that remains now is to actually use this information to do some profiling ;-)
Logged
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #12 on: November 28, 2009, 06:53:22 PM »

Wow, great guide! Nice work amo-ej1. Added a link to it at the ticket.
Logged
amo-ej1
Community member

Posts: 80


elie@de-brauwer.be
View Profile
« Reply #13 on: November 28, 2009, 07:17:20 PM »

Well, I've done some measurements. Basically I ran oprofile for 20 seconds on a small map and for 20 seconds on a large map and looked at the differences between the two.

The first thing I compared was the framerate: the small map renders at 140 FPS, the large map at 23 FPS.

(Logfiles can be seen in trac  http://parpg-trac.cvsdude.com/parpg/attachment/ticket/197/bench_pre.txt and http://parpg-trac.cvsdude.com/parpg/attachment/ticket/197/bench_post.txt ).

When I look at the fat functions in the output I note the following:
* 11% CPU: FIFE::Camera::render()
* 9.8% CPU: FIFE::Instance::update()
* 5% CPU: FIFE::Visual2DGfx::isVisible() (inline function used only in FIFE::Camera::render(), and only used to conclude that everything is visible?)
* 4.2% CPU: FIFE::LogManager::instance() (the logging; this one can easily be disabled through a macro, but it gets called once or twice per instance in FIFE::Camera::render(); starting to see a pattern?)
* 3% CPU: FIFE::Instance::getVisual() (called in Camera::render() for, well, each visual ...)

Next I added some prints within FIFE::Camera::render(). Basically it iterates over all layers, within each layer iterates over all instances and collects the objects to render (taking care of animations etc.), then sorts the list of instances to render and renders them. The prints counted the number of objects per layer, among other things.
The conclusion was that one layer contained 62500 instances and the other 697, which means that for each frame it iterates over roughly 65000 elements and decides whether they should be rendered or need some form of update. In the end it decides to render about 700 objects (the number of floor tiles is almost constant; the number of things on the other layer may differ). It also means that at a framerate of 25 FPS, the logmanager will be asked 25*65000 = 1.625 million times per second whether to log 'action' or 'no action'.

Another way to look at it: given a 2GHz PC, say it can execute 2.000.000.000 instructions a second, then each instance has only 2.000.000.000/(65000*25)=1230 instructions available per frame. That really isn't much, since this loop isn't the only part of the game; it still has to do the rendering etc.
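
The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is just the arithmetic from this post, using the same simplifying assumption of one instruction per clock cycle; the function and constant names are only illustrative.

```python
def per_instance_budget(clock_hz, instances, fps):
    """Instructions available per instance per frame.

    Assumes one instruction per clock cycle, which is a simplification.
    """
    return clock_hz / (instances * fps)


CLOCK_HZ = 2_000_000_000  # 2 GHz, as in the post
INSTANCES = 65000         # rounded up from 62500 + 697
FPS = 25

# How often per second the per-instance work (e.g. the log check) runs.
checks_per_second = INSTANCES * FPS
budget = per_instance_budget(CLOCK_HZ, INSTANCES, FPS)

print(checks_per_second)  # per-instance visits each second
print(int(budget))        # instructions per instance per frame
```

Plugging in the exact count of 62500 + 697 instances instead of the rounded 65000 only shifts the budget slightly, so the conclusion stays the same.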

So my initial conclusion would be:
a) we added way too many instances to one map (there is an early opt-out for invisible instances, but in our case everything is visible)
b) the FIFE::Camera::render() loop isn't optimal.


Some wild ideas popping to my head:
-> the data structures used in the render loop could be more efficient: instead of a plain vector, instances could be sorted in a way that lets the loop stop early, so that only 'likely to require work' instances are checked
-> remove the logmanager calls from all critical paths (they will probably be compiled out in production, at least I hope ;-) they already use macros for it, but a regular scons build still includes them)
-> the big pain will be the number of floor tiles; perhaps we should think about a method to make them visible/invisible en masse. Are there mechanisms for this already?
...
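
To make the first idea a bit more concrete, here is a minimal sketch of one possible data structure: bucketing instances into a coarse grid so that a frame only visits the buckets overlapping the viewport. This is not FIFE's API and not how Camera::render() works today; `CELL`, `build_grid` and `visible_instances` are hypothetical names, and the viewport is simplified to an axis-aligned rectangle in tile coordinates (an isometric camera would need a different shape test).

```python
from collections import defaultdict

CELL = 16  # hypothetical bucket size, in tiles


def build_grid(instances):
    """Bucket (x, y) tile positions by coarse grid cell."""
    grid = defaultdict(list)
    for x, y in instances:
        grid[(x // CELL, y // CELL)].append((x, y))
    return grid


def visible_instances(grid, x0, y0, x1, y1):
    """Return instances inside the inclusive tile rectangle (x0, y0)-(x1, y1)."""
    found = []
    # Only the cells overlapping the rectangle are visited at all.
    for cx in range(x0 // CELL, x1 // CELL + 1):
        for cy in range(y0 // CELL, y1 // CELL + 1):
            for x, y in grid.get((cx, cy), ()):
                if x0 <= x <= x1 and y0 <= y <= y1:
                    found.append((x, y))
    return found


# A 250 x 250 floor layer, as on the large test map.
tiles = [(x, y) for x in range(250) for y in range(250)]
grid = build_grid(tiles)

# A made-up 26 x 26 tile viewport somewhere in the middle of the map.
on_screen = visible_instances(grid, 100, 100, 125, 125)
print(len(tiles), len(on_screen))
```

With a structure like this, the per-frame work depends on the size of the viewport rather than on the size of the map, at the cost of keeping the buckets up to date when instances move.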

I'd really appreciate it if somebody could do some double checking and confirm what I just said here (or call me a liar if I'm telling lies ;-) ).

But at this point my conclusion w
Logged
mvBarracuda
Admin
Community member

Posts: 1308



View Profile Email
« Reply #14 on: November 28, 2009, 07:38:42 PM »

Thanks for the detailed report amo-ej1! Could any other programmer try to verify this by profiling on his system?

I would like to forward the information to the FIFE devs but amo-ej1 and I think that some verification by a 2nd dev would be great first.
Logged