Post-Apocalyptic RPG forums

Development => Programming => Topic started by: mvBarracuda on November 24, 2009, 10:42:27 AM



Title: Large map performance bottleneck
Post by: mvBarracuda on November 24, 2009, 10:42:27 AM
I played around with the map editor yesterday to create a template for the ground level map of the techdemo. The map was meant to be 250*250m (= 250*250 tiles) in size. That makes 62500 ground tiles. Thanks to beliar's little script, the tiles themselves were easy to add.

When I tested the map in the editor and ingame I found that performance was really sluggish. It seems that such a large number of map instances seriously affects the performance of PARPG. The bottleneck seems to be FIFE-related, but if we can provide some profiling data, chances are good that a FIFE developer will take a look at it and improve the performance.

For details on how to test large map performance yourself and how to profile FIFE, check out this ticket:
http://parpg-trac.cvsdude.com/parpg/ticket/197

Does anyone have profiling experience on Linux and would like to take a look into it?

It's highly unlikely that a FIFE dev will have the time to look into it before we want to ship techdemo 1 but if we provide the profiling data, performance should get improved in the long run nevertheless.

For the techdemo itself we might need to downscale the map size for now to avoid sluggish performance. Maybe a 125*125 ground tiles map would do it as well? That would be 1/4 of the number of instances of the 250*250m map. What do you think?


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on November 24, 2009, 03:11:53 PM
I have some experience with this stuff. I'll take a look at it soon and post my findings in trac/here.


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 24, 2009, 03:12:35 PM
Thanks a bunch amo-ej1 :-) That's really appreciated.


Title: Re: Large map performance bottleneck
Post by: Kaydeth on November 24, 2009, 11:21:05 PM
Could you post the map file (and maybe some instructions on how to swap it into the game?) so that we can see how this affects other hardware setups?


Title: Re: Large map performance bottleneck
Post by: zenbitz on November 25, 2009, 01:28:24 AM
just replace the map name in settings.xml

I did a quick and dirty hack of the 250 x 250 into a 125 x 250 and FPS improved from ~20 to ~40.  FPS on the map.xml file is roughly 80 on my MacBookPro.

Frame time seems to be scaling roughly linearly with the number of tiles... which is probably a bug.
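
To make the scaling observation concrete, here is a small Python sketch that converts the FPS figures quoted above into time per frame. The tile counts are simply derived from the stated map dimensions; the tile count of the default map.xml isn't given in this thread, so it is left out.

```python
# Frame time derived from the FPS figures quoted above; tile counts come
# from the stated map dimensions (an assumption that every cell has a tile).
measurements = {
    "250 x 250": {"tiles": 250 * 250, "fps": 20},
    "125 x 250": {"tiles": 125 * 250, "fps": 40},
}


def frame_time_ms(fps):
    """Milliseconds spent per frame at a given frame rate."""
    return 1000.0 / fps


def microseconds_per_tile(tiles, fps):
    """Frame time divided by the tile count, in microseconds."""
    return frame_time_ms(fps) * 1000.0 / tiles


for name, m in measurements.items():
    print(name, frame_time_ms(m["fps"]), "ms/frame,",
          microseconds_per_tile(m["tiles"], m["fps"]), "us/tile")
```

If the cost per tile comes out the same for both maps, the per-frame work is proportional to the total number of tiles rather than to what is on screen.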



Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 25, 2009, 02:15:48 AM
Quote
could you post the map file (and maybe some instructions on how to swap it into the game?) so that we can see how this affects other hardware setups?
Instructions can be found here:
http://parpg-trac.cvsdude.com/parpg/ticket/197


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on November 25, 2009, 08:51:22 AM
For my setup:
* Original map file loads in 500 msec, runs at >150fps @1024x768 (windowed and fullscreen)
* Large map file loads in 4500 msec, runs at 20fps @1024x768 (windowed and fullscreen)

(While looking at it I'll also add a profiling article in the wiki.)


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 25, 2009, 02:25:04 PM
Coolio, that's appreciated amo-ej1. Keep us updated about your findings.


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 25, 2009, 10:45:13 PM
Python script to convert the output of a bunch of different profiling tools into a dot graph:
http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

Needs Python and graphviz installed. Might be actually useful for our purposes; kudos to prock who came up with the pointer.
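
For the Python side of PARPG, one of the formats such a tool can read is the stats file written by the standard library's cProfile module. The sketch below is only an illustration: `busy_loop` is a made-up stand-in for the code being profiled, and the output filename is hypothetical. It does not cover the C++ side of FIFE.

```python
import cProfile
import os
import pstats
import tempfile


def busy_loop(n):
    """Hypothetical stand-in for the code to be profiled."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_to_file(func, path, *args):
    """Run func(*args) under cProfile and dump the raw stats to path."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    profiler.dump_stats(path)
    return result


# Hypothetical output location for the example.
stats_path = os.path.join(tempfile.gettempdir(), "parpg_example.pstats")
result = profile_to_file(busy_loop, stats_path, 100000)

# Print the five most expensive entries by cumulative time.
pstats.Stats(stats_path).sort_stats("cumulative").print_stats(5)
```

The dumped file can then be handed to the converter (using its pstats input format) and the resulting dot graph rendered with graphviz.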


Title: Re: Large map performance bottleneck
Post by: shevegen on November 26, 2009, 01:07:41 PM
Has any FIFE dev had a look into this already, by the way?


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 27, 2009, 12:21:08 PM
As said above: I wouldn't pester them before we can provide any profiling information.


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on November 28, 2009, 04:46:59 PM
Okay, step one has been taken care of: I've taken the liberty of writing a technical document on when/how/why to profile, applied to PARPG. It can be found on the wiki at http://wiki.parpg.net/Profiling_PARPG (feedback is welcome, patches are something you can do yourself ;-) ).

All that remains now is to actually use this information to do some profiling ;)


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 28, 2009, 06:53:22 PM
Wow, great guide! Nice work amo-ej1. Added a link to it at the ticket.


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on November 28, 2009, 07:17:20 PM
Well, I've done some measurements. Basically I ran oprofile for 20 seconds on a small map and for 20 seconds on a large map, and looked at the differences between the two.

The first thing I compared was the framerate: the small map renders at 140 FPS, the large map at 23 FPS.

(Logfiles can be seen in trac  http://parpg-trac.cvsdude.com/parpg/attachment/ticket/197/bench_pre.txt and http://parpg-trac.cvsdude.com/parpg/attachment/ticket/197/bench_post.txt ).

When I look at the fat functions in the output I note the following:
* 11% CPU: FIFE::Camera::render()
* 9.8% CPU: FIFE::Instance::update()
* 5% CPU: FIFE::Visual2DGfx::isVisible() (inline function used only in FIFE::Camera::render(), and only to conclude that everything is visible?)
* 4.2% CPU: FIFE::LogManager::instance() (the logging; this one can easily be disabled through a macro, but it gets called once or twice per instance in FIFE::Camera::render(). Starting to see a pattern?)
* 3% CPU: FIFE::Instance::getVisual() (well, called in Camera::render() for, well, each visual...)

Now I added some prints within FIFE::Camera::render(). Basically it iterates over all layers, then over all instances on each layer, collects the objects to render (taking care of animations etc.), then sorts the list of instances to render and renders them. The counters I added count the number of objects per layer etc.
The conclusion was that one layer contained 62500 instances and the other 697, which means that for each frame it iterates over roughly 65000 elements (62500 + 697 = 63197, rounded up) and decides whether they should be rendered or need some form of update. In the end it decides to render about 700 objects (the number of floor tiles is almost constant; the number of things on the other layer might differ). It also means that at a framerate of 25 FPS, the logmanager will be asked 25*65000 = 1.625 million times per second whether to log 'action' or 'no action'.
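
The loop described above can be sketched roughly as follows. This is a Python illustration of the description in this post, not FIFE's actual C++ code: `Instance`, `collect_renderable` and the `on_screen` flag are made-up names, and which instances count as on screen is invented so that about 700 end up being rendered.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for a map instance."""
    x: int
    y: int
    on_screen: bool


def collect_renderable(layers):
    """Visit every instance on every layer and keep those on screen.

    Returns (sorted render list, number of instances visited).
    """
    visited = 0
    to_render = []
    for layer in layers:
        for inst in layer:
            visited += 1  # this cost is paid for every instance, every frame
            if inst.on_screen:
                to_render.append(inst)
    # Sort before drawing (simplified ordering for the example).
    to_render.sort(key=lambda inst: (inst.y, inst.x))
    return to_render, visited


# Layer sizes taken from the counts in the post; the on-screen subsets
# are assumptions made up for this example.
ground = [Instance(x, y, on_screen=(x < 25 and y < 25))
          for y in range(250) for x in range(250)]
objects = [Instance(i % 250, i // 250, on_screen=(i < 75))
           for i in range(697)]

render_list, visited = collect_renderable([ground, objects])
print(visited, "instances visited,", len(render_list), "rendered")
```

The point of the sketch is the ratio: every instance is visited each frame, while only a small fraction ends up in the render list.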

As another example, take a 2GHz PC and say it can execute 2.000.000.000 instructions a second. This would mean that each instance only has 2.000.000.000/(65000*25) = 1230 instructions available. That really isn't much, since this loop isn't the only part of the game... it will still require rendering etc.
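
The back-of-the-envelope numbers above can be reproduced with a few lines of Python. The one-instruction-per-cycle figure is the post's own simplifying assumption; the exact instance count (62500 + 697) is included next to the rounded 65000 for comparison.

```python
GROUND_INSTANCES = 62500
OBJECT_INSTANCES = 697
ROUNDED_INSTANCES = 65000                 # round figure used in the post
FPS = 25
INSTRUCTIONS_PER_SECOND = 2_000_000_000   # assumes 1 instruction per cycle at 2 GHz


def checks_per_second(instances, fps):
    """How many per-instance checks the render loop performs each second."""
    return instances * fps


def instruction_budget(instances, fps, ips=INSTRUCTIONS_PER_SECOND):
    """Whole instructions available per instance check."""
    return ips // checks_per_second(instances, fps)


exact_instances = GROUND_INSTANCES + OBJECT_INSTANCES

print(checks_per_second(ROUNDED_INSTANCES, FPS))    # 1625000
print(instruction_budget(ROUNDED_INSTANCES, FPS))   # 1230
print(instruction_budget(exact_instances, FPS))     # 1265
```

Using the exact count instead of the rounded one changes the budget only slightly, so the conclusion stays the same.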

So my initial conclusion would be:
a) we added way too many instances to one map (there is an early opt-out for invisible instances, but in our case everything is visible)
b) the FIFE::Camera::render() loop isn't optimal.


Some wild ideas popping to my head:
-> the data structures used in the render loop might be more efficient: instead of a plain vector, instances might be sorted in an efficient way so that the loop can be cut short (so that only 'likely to require work' things are checked)
-> remove the logmanager calls from all critical paths (they will probably be compiled out in production, I assume, at least I hope ;) they already use macros for it, but a regular scons build still includes them)
-> the big pain will be the number of floor tiles; perhaps we should envisage a method to make them visible/invisible en masse. Are there mechanisms for this already?
...

I'd really appreciate it if somebody could do some double checking and confirm what I just said here (or call me a liar if I'm telling lies ;-) ).

But at this point my conclusion w


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on November 28, 2009, 07:38:42 PM
Thanks for the detailed report amo-ej1! Could any other programmer try to verify this by profiling on his system?

I would like to forward the information to the FIFE devs but amo-ej1 and I think that some verification by a 2nd dev would be great first.


Title: Re: Large map performance bottleneck
Post by: shihonage on November 30, 2009, 09:44:32 PM
Yar, we had a similar problem with Shelter. The solution was to only draw what will actually be displayed on the screen, making the renderer's performance viewarea-dependent instead of mapsize-dependent.


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on December 01, 2009, 12:05:09 AM
Coolio shihonage :-)

It looks like the main problem is the rather messy FIFE view code so it's a bit hard to see where we actually should implement such an optimization and which consequences it will cause.

FIFE developer prock created a ticket for it and they'll try to look into it for their 0.3.2 release:
http://fife.trac.cvsdude.com/engine/ticket/419


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on December 01, 2009, 08:05:04 AM
shihonage, I think that FIFE already renders only what is supposed to be on the screen. The bottleneck, however, is figuring out _what_ should be rendered. That means iterating over a vector, which is basically O(n), and our n is getting very large. So what's needed is a more efficient implementation of that vector (e.g. early abort, sorting on spatial data, ...)
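
One way to picture "sorting on spatial data" with an early abort: keep the instances sorted by one coordinate and binary-search the visible range instead of scanning the whole vector. The sketch below is only a Python illustration with hypothetical names (`SpatialIndex`, `linear_query`, `sorted_query`) and a made-up viewport; it is not how FIFE does it.

```python
from bisect import bisect_left, bisect_right


class SpatialIndex:
    """Hypothetical index: instance positions kept sorted by x, then y."""

    def __init__(self, positions):
        self._sorted = sorted(positions)
        self._xs = [x for x, _ in self._sorted]  # sort keys for bisect

    def linear_query(self, x0, x1, y0, y1):
        """O(n): test every instance, like a plain scan over a vector."""
        hits = [(x, y) for x, y in self._sorted
                if x0 <= x <= x1 and y0 <= y <= y1]
        return hits, len(self._sorted)

    def sorted_query(self, x0, x1, y0, y1):
        """Binary-search the x range, then filter only that slice on y."""
        lo = bisect_left(self._xs, x0)
        hi = bisect_right(self._xs, x1)
        candidates = self._sorted[lo:hi]
        hits = [(x, y) for x, y in candidates if y0 <= y <= y1]
        return hits, len(candidates)


index = SpatialIndex([(x, y) for x in range(250) for y in range(250)])
view = (100, 124, 100, 124)  # made-up 25 x 25 tile viewport

slow_hits, slow_visited = index.linear_query(*view)
fast_hits, fast_visited = index.sorted_query(*view)
print(slow_visited, fast_visited, len(fast_hits))
```

Both queries return the same instances; the difference is how many candidates each one has to look at. Sorting on a single axis is the simplest possible variant, and a grid or tree would narrow the candidates further.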

edit: the UH people came to the same conclusion as we did: http://logs.unknown-horizons.org/%23fife/%23fife.2009-Wed-22.log


Title: Re: Large map performance bottleneck
Post by: shihonage on December 01, 2009, 09:54:45 PM
Our map data is a 2D array with a struct for each cell. The renderer merely goes through the "viewable area", checking each cell for graphics or index of Actor standing on it. When our actors move, they erase the index from the old cell and put it in the new cell on which they stepped.

It's not perfect, and we have our glitches, but this approach minimizes CPU load.
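
A rough Python sketch of the approach described above, for readers who want to see its shape. All names (`Cell`, `GridMap`, `move_actor`, `visible_cells`) and the viewport size are made up for the illustration; this is a guess at the general idea, not Shelter's actual code.

```python
class Cell:
    """One map cell: a tile graphic id and an optional actor index."""
    __slots__ = ("tile", "actor")

    def __init__(self, tile):
        self.tile = tile
        self.actor = None


class GridMap:
    """Hypothetical 2D array of cells, indexed as cells[y][x]."""

    def __init__(self, width, height, default_tile=0):
        self.width = width
        self.height = height
        self.cells = [[Cell(default_tile) for _ in range(width)]
                      for _ in range(height)]

    def place_actor(self, actor_index, x, y):
        self.cells[y][x].actor = actor_index

    def move_actor(self, old_pos, new_pos):
        """Erase the actor index from the old cell, store it in the new one."""
        ox, oy = old_pos
        nx, ny = new_pos
        actor_index = self.cells[oy][ox].actor
        self.cells[oy][ox].actor = None
        self.cells[ny][nx].actor = actor_index

    def visible_cells(self, left, top, view_w, view_h):
        """Yield (x, y, cell) for the viewable area only, clamped to the map."""
        for y in range(max(top, 0), min(top + view_h, self.height)):
            for x in range(max(left, 0), min(left + view_w, self.width)):
                yield x, y, self.cells[y][x]


world = GridMap(250, 250)
world.place_actor(7, 10, 10)
world.move_actor((10, 10), (11, 10))

seen = list(world.visible_cells(left=0, top=0, view_w=32, view_h=24))
actors = [(x, y, cell.actor) for x, y, cell in seen if cell.actor is not None]
print(len(seen), actors)
```

The number of cells visited per frame depends only on the viewport size, not on the size of the map.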


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on December 02, 2009, 04:11:22 PM
FIFE dev phoku is currently working on improving the view performance in a branch. See:
http://fife.trac.cvsdude.com/engine/changeset/3098
http://fife.trac.cvsdude.com/engine/browser/branches/active/view_performance


Title: Re: Large map performance bottleneck
Post by: mvBarracuda on December 04, 2009, 04:08:04 PM
Alrighty, there's news from this front!

Phoku recently continued to work on his view_performance branch and it's basically ready for a larger scale test at this point. There are still bugs here and there to iron out, so it won't be merged to trunk right away. As four eyes see more than two, the FIFE devs would appreciate if devs of FIFE-based games could test the branch in combination with their game to see if they can find any bugs.

Here's how you can test it yourself: follow the instructions outlined here:
http://wiki.parpg.net/Download

However don't check out the FIFE code from:
Quote
http://fife.svn.cvsdude.com/engine/trunk/

But from:
Quote
http://fife.svn.cvsdude.com/engine/branches/active/view_performance/

Obviously you have to build the view_performance branch of FIFE as you would build the trunk (win32 users will need to move the win32 devkit files into the right folder). Furthermore you'll also need to check out the PARPG files into <view_performance>/clients/parpg/ so that PARPG actually runs with this branch of FIFE and not trunk.

I had the chance to test the branch last night and results vary quite a lot from system to system.

On my win32 desktop system, I had 20-40fps with the techdemo1_ground_level.xml profiling map with FIFE's trunk. The same map runs much smoother with the view_performance branch, somewhere between 80-160fps.

On my win32 notebook, performance has "only" slightly improved though. The view_performance branch runs about 10-25% faster than FIFE's trunk.

So the purpose of this post is twofold: to encourage you to test the view_performance branch to see if you can find any bugs and to hear about your performance reports. In case there are quite a number of devs who can test the branch and provide feedback if there are any obvious bugs left, the code could find its way into the FIFE trunk before we release techdemo 1 of PARPG.


Title: Re: Large map performance bottleneck
Post by: amo-ej1 on December 04, 2009, 07:12:31 PM
When I run the large map on FIFE trunk I get 20-30 fps; with the changes in the view_performance branch I get about 200 fps (but then again we might wish to add the possibility to render the fps string somewhere inside the HUD for easier reference). This is at 1024x768, not fullscreen.

The only remark I have is that I now see some artifacts (apart from the PC and NPCs starting to dance at the same location). These artifacts are not obviously reproducible and not screenshotable :( so you may either believe me or think that I was drunk while seeing them.

What I see is the following: when the PC is moving, I think mainly when starting or ending a movement, I sometimes see black lines which outline a horizontal/vertical/diagonal cut through the scene but still follow the tile outlines. This occurs about once a minute, but I haven't found a reliable way to trigger it... I don't believe I ever saw this on trunk (not in the past, and not on the current trunk of PARPG and FIFE).