Page 1 of 1

"Server lag"

Posted: Thu Nov 04, 2010 5:26 am
by Grans Remedy
I'm trying to figure out a lag issue on my server.

When connected to my server, in game lag for me is anywhere from 150 - 400 ms.

When I ping the server from my pc, response times are around 11-20 ms.

When I ssh into the server and check what processes are running (eg top) the server is mostly idle and bzfs shows up at less than 0.5% usage every now and then (no one else on the server).

So, why would the in-game lag be so much higher (than the ping times) and what can I do to lower it?

Any ideas anyone?

Cheers
Grans

Re: "Server lag"

Posted: Thu Nov 04, 2010 12:12 pm
by blast
The issue is probably your computer. If your computer runs the game slowly, this will introduce "artificial" lag. Our networking code runs between drawing frames. So if your computer spends too long drawing each frame, there is less time for everything else (simulation code, network code, input code, etc).

Re: "Server lag"

Posted: Thu Nov 04, 2010 9:57 pm
by Grans Remedy
Interesting idea - so presumably I could confirm this by turning off all effects in the display and there should be an improvement?

Re: "Server lag"

Posted: Fri Nov 05, 2010 9:11 am
by Grans Remedy
I'm getting some crazy results here.
bzfi0030.png
crazy lag
(47.37 KiB) Not downloaded yet
I turned off all the 3d display settings and effects (as you can see in the attached), but it really didn't have much effect.

Here is a few lines of pings to the server from my pc, while bzflag is running (just after taking the screenshot above)

Code: Select all

64 bytes from 60.234.72.60: icmp_seq=31 ttl=58 time=13.0 ms
64 bytes from 60.234.72.60: icmp_seq=32 ttl=58 time=40.5 ms
64 bytes from 60.234.72.60: icmp_seq=33 ttl=58 time=29.9 ms
64 bytes from 60.234.72.60: icmp_seq=34 ttl=58 time=18.1 ms
64 bytes from 60.234.72.60: icmp_seq=35 ttl=58 time=11.2 ms
64 bytes from 60.234.72.60: icmp_seq=36 ttl=58 time=11.2 ms
64 bytes from 60.234.72.60: icmp_seq=37 ttl=58 time=11.0 ms
64 bytes from 60.234.72.60: icmp_seq=38 ttl=58 time=11.1 ms
64 bytes from 60.234.72.60: icmp_seq=39 ttl=58 time=11.6 ms
64 bytes from 60.234.72.60: icmp_seq=40 ttl=58 time=12.2 ms
64 bytes from 60.234.72.60: icmp_seq=41 ttl=58 time=22.3 ms
64 bytes from 60.234.72.60: icmp_seq=42 ttl=58 time=31.3 ms
64 bytes from 60.234.72.60: icmp_seq=43 ttl=58 time=20.7 ms
64 bytes from 60.234.72.60: icmp_seq=44 ttl=58 time=11.1 ms
64 bytes from 60.234.72.60: icmp_seq=45 ttl=58 time=47.4 ms
64 bytes from 60.234.72.60: icmp_seq=46 ttl=58 time=11.3 ms
64 bytes from 60.234.72.60: icmp_seq=47 ttl=58 time=11.0 ms
64 bytes from 60.234.72.60: icmp_seq=48 ttl=58 time=11.0 ms
64 bytes from 60.234.72.60: icmp_seq=49 ttl=58 time=19.2 ms
64 bytes from 60.234.72.60: icmp_seq=50 ttl=58 time=42.4 ms
64 bytes from 60.234.72.60: icmp_seq=51 ttl=58 time=13.5 ms
64 bytes from 60.234.72.60: icmp_seq=52 ttl=58 time=78.3 ms
The only other symptom I've noticed is that my servers (3x bzf servers on one host) are periodically dropping off the public server list. If I restart them, they show up for a while then drop off again. When they've dropped off, players can still connect, but because I require registration the players cant spawn. So when the server is not on the list, global authentication seems to be broken. But they will sporadically appear on the list server again for a while, then drop off again. Weird.

The only recent change has been a change in IP address for the server, and a resulting change in the DNS record. But that was a couple of days ago so all upstream DNS servers should have the new iP by now.

I'm at a loss as to where to start looking for this - any ideas?

Thanks

Re: "Server lag"

Posted: Fri Nov 05, 2010 11:23 am
by ts

Code: Select all

mtr 60.234.72.60
                             My traceroute  [v0.75]
Neo.local (0.0.0.0)                                    Fri Nov  5 12:04:11 2010
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. router                          0.0%    63    0.3   0.5   0.3   0.8   0.1
I'm hiding my provider :p
 5. 10gigabitethernet1-1.core1.lon1.  0.0%    63   31.5  33.3  30.7  42.4   3.3
 6. 10gigabitethernet4-4.core1.nyc4.  0.0%    62   99.3 100.8  98.3 109.1   3.1
 7. 10gigabitethernet5-3.core1.lax1.  0.0%    62  168.0 163.7 159.8 171.5   3.8
 8. pacnet.10gigabitethernet2-3.core  0.0%    62  170.3 170.1 169.5 170.9   0.3
 9. po6-0.gw1.akl1.asianetcom.net     0.0%    62  293.2 293.3 292.7 294.6   0.4
10. ORC-0014.GW1.AKL1.asianetcom.net  0.0%    62  287.3 288.1 286.9 318.8   4.0
11. 121.98.9.141                      0.0%    62  288.2 288.5 287.3 299.1   1.7
12. hd-bdr2.hostingdirect.co.nz      98.4%    62  306.9 306.9 306.9 306.9   0.0
13. 60.234.72.60                      0.0%    62  287.6 287.7 287.0 289.6   0.5
You may check your routing - perhaps the problem is not at your point. My routing could be better at least, considering it's going over London to NYC.


Let's flood ICMP-packets to the destination IP-address

Code: Select all

bash-3.2$ sudo ping -f -c 100 60.234.72.60
Password:
...
--- 60.234.72.60 ping statistics ---
100 packets transmitted, 54 packets received, 46.0% packet loss
round-trip min/avg/max/stddev = 286.940/287.884/289.620/0.606 ms
Flooding has the problem of probably using more bandwidth than available. Try it slowly.

Code: Select all

Neo:~ ts$ ping -c 100 60.234.72.60
...
--- 60.234.72.60 ping statistics ---
100 packets transmitted, 100 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 286.726/287.658/288.789/0.389 ms
You could also check if you have unwanted connections at your home box using lsof -i for instance. The packet loss in the screenshot (BTW: your colors are way different than mine) could be a sign of bandwidth saturation (see my flood ping example, the raised lag at that time in-game could come from TCP/IP's error correction).

I'd suggest to find out what is going on step by step. Don't combine things like running the client and pinging at the same time. Also keep in mind the client uses UDP most of the time instead of ICMP.

Of course do not use WLAN during the tests. ;)

Re: "Server lag"

Posted: Fri Nov 05, 2010 2:14 pm
by blast
Dropping all of the settings to the lowest will generally make the game run even slower. The low options are optimized for software rendering and many modern hardware/drivers no longer optimize for those drawing methods. You should check your FPS in the game by pressing 't' and looking just above the mouse box. You should aim for at least 60 FPS (but less than 200).

Re: "Server lag"

Posted: Sun Nov 07, 2010 9:07 am
by Grans Remedy
Ok thanks Blast,
I updated the server to 2.0.16 to see if that made any difference - which it didn't. I've had a couple of lag spikes. Keeping the 3d effects at a medium level does give me between 120 - 190 fps without affecting my jitter to much (but its difficult to tell).

Tracing from my pc to my server I get a similar percentage of lost packets from hostingdirect.co.nz as you - I forwarded that info onto my hosting provider but they pointed out that it is harmless as there is no loss at my server. So not a lot of progress there.

Do you have any idea why my bzf servers are popping onto the list serv and off again?

Re: "Server lag"

Posted: Sun Nov 07, 2010 10:06 pm
by blast
If your server is dropping off the list, it's probably issues with the network between the two servers. If the list server cannot connect to your server, it assumes it is not online. And if your server is not very busy, there may only be a single 'ADD' message sent before the removal timeout, and if that fails, it won't be 'renewed'.

Re: "Server lag"

Posted: Mon Nov 08, 2010 7:19 am
by Grans Remedy
The weird thing is that it does this even mid-game, when there are five or six players on. All of a sudden players cant join because they cant authenticate. Can you tell me the details of how the list server connects (special port or something?) - I'd like to pass that onto my hosting provider.
Thanks

Re: "Server lag"

Posted: Mon Nov 08, 2010 12:14 pm
by macsforme
Do you have lag problems on other servers farther away from you? If not, then I doubt your client is at fault...

Re: "Server lag"

Posted: Mon Nov 08, 2010 3:21 pm
by Bullet Catcher
The list server connects via TCP to your game server's advertised port, the same as any client. On my server the interval is irregular, ranging from 30 minutes to 6 hours. For example (times are UTC):

Code: Select all

2010-11-07 04:18:07: Player [0] resolved to BZ.BZFLAG.BZ
2010-11-07 05:18:10: Player [0] resolved to BZ.BZFLAG.BZ
2010-11-07 11:18:26: Player [0] resolved to BZ.BZFLAG.BZ
2010-11-07 11:48:29: Player [0] resolved to BZ.BZFLAG.BZ
This suggests that there may be a problem closer to the list server, but if there is any type of firewall or NAT near the game server it is the first place to look.

Re: "Server lag"

Posted: Mon Nov 08, 2010 6:27 pm
by Bullet Catcher
Bah. Searching for 63.246.136.16 instead of BZ.BZFLAG.BZ in my game server log shows that the list server connects every half hour like clockwork. The small number of successful "resolved to" entries just means that DNS takes longer than it does to check the state of the game server.

Grans, how regular are the connections from 63.246.136.16 in your server log?

Re: "Server lag"

Posted: Mon Nov 08, 2010 8:36 pm
by Grans Remedy
@const - no I haven't experienced this sort of issue on any of the servers I play on.

@BC - seems irregular, sometimes every few minutes, other times 30 - 60 mins or more. See below.

Code: Select all

grep -A 1 63.246.136.16 guhixofi.log 
Player [0] accept() from 63.246.136.16:53213 on 6
Player  [0] removed at 2010-11-07 15:24:44: Disconnected
--
Player [0] accept() from 63.246.136.16:57540 on 6
Player  [0] removed at 2010-11-07 16:54:47: Disconnected
--
Player [0] accept() from 63.246.136.16:58797 on 6
Player  [0] removed at 2010-11-07 17:24:47: Disconnected
--
Player [0] accept() from 63.246.136.16:59885 on 6
Player  [0] removed at 2010-11-07 17:54:50: Disconnected
--
Player [0] accept() from 63.246.136.16:62045 on 6
Player  [0] removed at 2010-11-07 18:54:53: Disconnected
--
Player [1] accept() from 63.246.136.16:62587 on 7
Player  [1] removed at 2010-11-07 19:10:07: Disconnected
--
Player [1] accept() from 63.246.136.16:62588 on 7
Player  [1] removed at 2010-11-07 19:10:09: Disconnected
Player [1] accept() from 63.246.136.16:62648 on 6
Player  [1] removed at 2010-11-07 19:11:21: Disconnected
--
Player [2] accept() from 63.246.136.16:62666 on 8
Player  [2] removed at 2010-11-07 19:12:09: Disconnected
--
Player [2] accept() from 63.246.136.16:62668 on 8
Player  [2] removed at 2010-11-07 19:12:09: Disconnected
--
Player [1] accept() from 63.246.136.16:62687 on 7
Player  [1] removed at 2010-11-07 19:13:05: Disconnected
--
Player [2] accept() from 63.246.136.16:62797 on 8
Player  [2] removed at 2010-11-07 19:17:03: Disconnected
--
Player [2] accept() from 63.246.136.16:62800 on 8
Player  [2] removed at 2010-11-07 19:17:10: Disconnected
--
Player [3] accept() from 63.246.136.16:62823 on 9
Player  [3] removed at 2010-11-07 19:17:48: Disconnected
Player [3] accept() from 63.246.136.16:62824 on 9
Player  [3] removed at 2010-11-07 19:17:50: Disconnected
--
Player [2] accept() from 63.246.136.16:62826 on 8
Player  [2] removed at 2010-11-07 19:17:51: Disconnected
--
Player [3] accept() from 63.246.136.16:62829 on 9
Player  [3] removed at 2010-11-07 19:17:59: Disconnected
Player [3] accept() from 63.246.136.16:62830 on 9
Player  [3] removed at 2010-11-07 19:17:59: Disconnected
--
Player [2] accept() from 63.246.136.16:62833 on 8
Player  [2] removed at 2010-11-07 19:18:01: Disconnected
--
Player [1] accept() from 63.246.136.16:62837 on 7
Player  [1] removed at 2010-11-07 19:18:08: Disconnected
--
Player [0] accept() from 63.246.136.16:62849 on 6
Player  [0] removed at 2010-11-07 19:18:25: Disconnected
Player [0] accept() from 63.246.136.16:62853 on 6
Player  [0] removed at 2010-11-07 19:18:27: Disconnected
--
Player [0] accept() from 63.246.136.16:50241 on 6
Player  [0] removed at 2010-11-07 21:18:31: Disconnected
--
Player [0] accept() from 63.246.136.16:51288 on 6
Player  [0] removed at 2010-11-07 21:48:31: Disconnected
--
Player [0] accept() from 63.246.136.16:52355 on 6
Player  [0] removed at 2010-11-07 22:18:35: Disconnected
--
Player [0] accept() from 63.246.136.16:55988 on 6
Player  [0] removed at 2010-11-07 23:48:36: Disconnected
--
Player [0] accept() from 63.246.136.16:57145 on 6
Player  [0] removed at 2010-11-08 00:18:37: Disconnected
--
Player [0] accept() from 63.246.136.16:58529 on 6
Player  [0] removed at 2010-11-08 00:48:36: Disconnected
--
Player [0] accept() from 63.246.136.16:59889 on 6
Player  [0] removed at 2010-11-08 01:18:38: Disconnected
--
Player [0] accept() from 63.246.136.16:61465 on 6
Player  [0] removed at 2010-11-08 01:48:42: Disconnected
--
Player [0] accept() from 63.246.136.16:63154 on 6
Player  [0] removed at 2010-11-08 02:18:38: Disconnected
--
Player [0] accept() from 63.246.136.16:65075 on 6
Player  [0] removed at 2010-11-08 02:48:38: Disconnected
--
Player [0] accept() from 63.246.136.16:50488 on 6
Player  [0] removed at 2010-11-08 03:18:41: Disconnected
--
Player [0] accept() from 63.246.136.16:52061 on 6
Player  [0] removed at 2010-11-08 03:48:43: Disconnected
--
Player [0] accept() from 63.246.136.16:53954 on 6
Player  [0] removed at 2010-11-08 04:18:45: Disconnected
--
Player [0] accept() from 63.246.136.16:56091 on 6
Player  [0] removed at 2010-11-08 04:48:47: Disconnected
--
Player [0] accept() from 63.246.136.16:58242 on 6
Player  [0] removed at 2010-11-08 05:18:47: Disconnected
--
Player [0] accept() from 63.246.136.16:62877 on 6
Player  [0] removed at 2010-11-08 06:18:54: Disconnected
--
Player [0] accept() from 63.246.136.16:65096 on 6
Player  [0] removed at 2010-11-08 06:48:53: Disconnected
--
Player [0] accept() from 63.246.136.16:55332 on 6
Player  [0] removed at 2010-11-08 08:18:59: Disconnected
--
Player [0] accept() from 63.246.136.16:59000 on 6
Player  [0] removed at 2010-11-08 09:19:02: Disconnected
--
Player [0] accept() from 63.246.136.16:60806 on 6
Player  [0] removed at 2010-11-08 09:49:03: Disconnected
--
Player [0] accept() from 63.246.136.16:62399 on 6
Player  [0] removed at 2010-11-08 10:19:06: Disconnected
--
Player [0] accept() from 63.246.136.16:49638 on 6
Player  [0] removed at 2010-11-08 11:19:09: Disconnected
--
Player [0] accept() from 63.246.136.16:53080 on 6
Player  [0] removed at 2010-11-08 12:19:12: Disconnected
--
Player [0] accept() from 63.246.136.16:57864 on 6
Player  [0] removed at 2010-11-08 13:49:16: Disconnected
--
Player [0] accept() from 63.246.136.16:59304 on 6
Player  [0] removed at 2010-11-08 14:19:18: Disconnected
--
Player [0] accept() from 63.246.136.16:62704 on 6
Player  [0] removed at 2010-11-08 15:19:23: Disconnected
--
Player [0] accept() from 63.246.136.16:64242 on 6
Player  [0] removed at 2010-11-08 15:49:21: Disconnected
--
Player [0] accept() from 63.246.136.16:65492 on 6
Player  [0] removed at 2010-11-08 16:19:23: Disconnected
--
Player [0] accept() from 63.246.136.16:50203 on 6
Player  [0] removed at 2010-11-08 16:49:26: Disconnected
--
Player [0] accept() from 63.246.136.16:51359 on 6
Player  [0] removed at 2010-11-08 17:19:27: Disconnected
--
Player [0] accept() from 63.246.136.16:52407 on 6
Player  [0] removed at 2010-11-08 17:49:28: Disconnected
--
Player [0] accept() from 63.246.136.16:53429 on 6
Player  [0] removed at 2010-11-08 18:19:30: Disconnected
--
Player [0] accept() from 63.246.136.16:54508 on 6
Player  [0] removed at 2010-11-08 18:49:31: Disconnected
--
Player [0] accept() from 63.246.136.16:57737 on 6
Player  [0] removed at 2010-11-08 20:19:34: Disconnected
--
Player [0] accept() from 63.246.136.16:58824 on 6
Player  [0] removed at 2010-11-08 20:49:35: Disconnected
--
Player [0] accept() from 63.246.136.16:60936 on 6
Player  [0] removed at 2010-11-08 21:49:39: Disconnected
--
Player [0] accept() from 63.246.136.16:61818 on 6
Player  [0] removed at 2010-11-08 22:19:40: Disconnected
--
Player [0] accept() from 63.246.136.16:62825 on 6
Player  [0] removed at 2010-11-08 22:49:41: Disconnected
--
Player [0] accept() from 63.246.136.16:63785 on 6
Player  [0] removed at 2010-11-08 23:19:41: Disconnected
--
Player [0] accept() from 63.246.136.16:64845 on 6
Player  [0] removed at 2010-11-08 23:49:43: Disconnected
--
Player [0] accept() from 63.246.136.16:49401 on 6
Player  [0] removed at 2010-11-09 00:19:43: Disconnected
--
Player [0] accept() from 63.246.136.16:51298 on 6
Player  [0] removed at 2010-11-09 01:19:47: Disconnected
--
Player [0] accept() from 63.246.136.16:54738 on 6
Player  [0] removed at 2010-11-09 02:49:51: Disconnected
--
Player [0] accept() from 63.246.136.16:61334 on 6
Player  [0] removed at 2010-11-09 04:49:54: Disconnected
--
Player [0] accept() from 63.246.136.16:62758 on 6
Player  [0] removed at 2010-11-09 05:19:55: Disconnected
--
Player [0] accept() from 63.246.136.16:51354 on 6
Player  [0] removed at 2010-11-09 06:50:01: Disconnected
--
Player [0] accept() from 63.246.136.16:53472 on 6
Player  [0] removed at 2010-11-09 07:20:03: Disconnected
--
Player [0] accept() from 63.246.136.16:55257 on 6
Player  [0] removed at 2010-11-09 07:50:04: Disconnected
--
Player [0] accept() from 63.246.136.16:58832 on 6
Player  [0] removed at 2010-11-09 08:50:09: Disconnected
I've also noticed a whole pile of connections from 188.72.225.189 is that some other list server?
My hosting provider has pretty much given up - they say the network connection to the server is fine (its in their data center).

Re: "Server lag"

Posted: Mon Nov 08, 2010 9:37 pm
by ts
Grans Remedy wrote:Tracing from my pc to my server I get a similar percentage of lost packets from hostingdirect.co.nz as you - I forwarded that info onto my hosting provider but they pointed out that it is harmless as there is no loss at my server.
They are right, the loss at the second last hop is not important in this case. However it would be interesting to see the result of mtr here.
Grans Remedy wrote:Do you have any idea why my bzf servers are popping onto the list serv and off again?
That might be a different problem.
Grans Remedy wrote:I've also noticed a whole pile of connections from 188.72.225.189 is that some other list server?
The GU league website has a built-in servertracker that queries the official GU league match servers. The server that hosts the website has this IPv4-address. If you wish I could remove your server entry from the tracker.


I suggest to follow Constitution's advice and find out if you only have problems on your own server, e.g. are connections to other servers ok?

Re: "Server lag"

Posted: Mon Nov 08, 2010 11:27 pm
by blast
The game server sends out an 'ADD' every 30 minutes at a minimum (each player join/part also sends one). Every 'ADD' message sent to the list results in the list connecting back to your server. If the list server has not had a successful 'ADD' for a server in 35 minutes, it will prune it from the list. So if your server is having a 60 minute gap between successful connects from the list, this is why it is not showing up anymore.

Re: "Server lag"

Posted: Mon Nov 15, 2010 5:22 pm
by Bullet Catcher
Grans, the my.bzflag.org server logs show that your ADD messages are not always getting through. If you are still having this problem, try using debug level 3 ("bzfs -ddd") to enable "Queuing ADD message to list server" messages in your logs. If you get those every 30 minutes, but not always an immediate incoming connection from the list server at 63.246.136.16, then your outgoing connection to the list server is failing for some reason.

DNS trouble is a likely cause. Check your /etc/resolv.conf file, which may need to be updated for your server's new IP address.

Re: "Server lag"

Posted: Mon Jan 10, 2011 11:04 pm
by Grans Remedy
Grans Remedy wrote: The only recent change has been a change in IP address for the server, and a resulting change in the DNS record. But that was a couple of days ago so all upstream DNS servers should have the new iP by now.
I'm pleased to say that my servers are back to their usual stability.
Although my hosting provider did not agree with me on this, the fact is that when they removed the secondary IP address for the host server everything went back to normal.
Thanks for helping me chase this down.
Cheers
Grans

Re: "Server lag"

Posted: Fri Jan 14, 2011 5:09 am
by Bullet Catcher
With two network interfaces on the local machine, bzfs will sometimes use one and other times use the other. Use the -i command line option to specify the local network address you want it to use all the time.