Checkpoint Edge VPN – IPSec Tunnel not coming up properly

Just yesterday we encountered a problem with one of our VPN Sites that lost its VPN connectivity right when we wanted to go to lunch >;(

As we encounter glitches with Edges often, we suspected the problem was on the Edge’s end of the VPN tunnel and not on our central Checkpoint VPN firewall cluster. After an hour of fruitless Edge rebooting, vpn tu resets, and removing and re-adding the Edge to VPN communities, followed by endless policy installations, we noticed a lot of “Unknown SPI” entries in the logs when filtering on the public IPs of both VPN peers:

[Screenshot, 2013-01-16 at 00.02.25]

Querying the Checkpoint KB we found sk36375 which states:

Repeated logs may indicate that the relevant kernel tables are full and new VPN-related data cannot be recorded.

However, the SK is not really helpful in explaining how to check whether this is really the case. But the solution at least names the two FW kernel tables in question: ‘vpn_queues’ and ‘IKE_SA_table’.

So we queried these tables using fw tab (which you might know from ‘fw tab -t connections -s’):

[Expert@GW:5]# fw tab -t IKE_SA_table
-------- IKE_SA_table --------
dynamic, id 367, attributes: keep, sync, kbuf 1, expires 3600, , hashsize 8192, implies 366, limit 1200
<00000000, 701daac2, c194d907, 1f72a622, e6c6630c; 031cf004, 00000000, 50f7818c, 00000004; 29233/86400>
<00000000, 23335609, b9cda941, 00425d38, d69deaba; 03b42806, 00000002, 50f7de56, 00000004; 52987/86400>
<00000000, 347a9a25, 29eb6b33, ac789b72, 89ceb056; 03fa0803, 00000002, 50f8100c, 00000004; 65713/86395>
<00000000, d5578d0d, 4519d3b5, 530c0906, d04d4a17; 033fd803, 00000000, 50f7144d, 083cfce0; 1266/1440>
<00000000, 96287d30, 1bf5a67c, ac804d62, a1317827; 03b18803, 00000002, 50f8286d, 00000004; 71954/86394>
<00000000, 96c6dfbd, b91ca581, 3a3bfedf, 84600a18; 03423806, 00000002, 50f80f86, 00000004; 65579/86400>
<00000000, f1d14c1b, 7e634b84, 259f54ad, 202aeabe; 03917006, 00000004, 50f7aed1, 00000004; 40822/86399>
<00000000, 913506bb, c25f8baf, e501cec6, 2b52666f; 03ebc006, 00000004, 50f74909, 00000004; 14766/86399>
<00000000, 2860a0ba, c3570502, 015a820f, 4b3862bb; 03528003, 00000004, 50f7ab03, 00000000; 39848/39905>
<00000000, dcd49307, 9f90921d, 2d67e09a, 13f68898; 03365800, 00000004, 50f7d238, e9f16320; 49885/86396>
<00000000, 0c7dc142, 5796eb81, 553d4a2b, e81efcd8; 03733800, 00000004, 50f7dd32, 00000004; 52695/86399>
<00000000, 3af2a649, 8983c6dd, ba01fdab, 5bd6112b; 03426800, 00000002, 50f7cc02, 00000004; 48295/86400>
<00000000, 55158ac9, c2cbc5f4, 0a742c4d, d69e2e94; 038e8003, 00000004, 50f851c6, 00000004; 82539/86399>
<00000000, bea80f86, b826724a, 7bb0ab12, a01d02c1; 032c9803, 00000004, 50f8529f, 00000004; 82756/86400>
<00000000, 8678ce33, 62c4180e, bd851e3e, 00742416; 03459803, 00000004, 50f82cf4, 00000004; 73113/86399>
<00000000, 272318b6, a718b7ad, 40b73a5b, adf3ba45; 036be003, 00000004, 50f83bfa, 00000004; 76959/86399>
…(1183 More)

As you can see in the header line, the table is limited to 1200 rows, and we had 16 rows shown plus 1183 more, which brought us to 1199 table entries. So our problem really was a filled-up kernel table.
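The same arithmetic can be sketched in a few lines of Python if you save the `fw tab -t IKE_SA_table` output to a string. This is only a rough sketch: the function name `table_usage` is made up for illustration, and it assumes the output looks like the listing above (a `limit N` attribute in the header, one entry per line in angle brackets, and a trailing `(N More)` line). The sample below is shortened to two rows from that listing, with the "More" count adjusted so the total stays at 1199.

```python
import re


def table_usage(fw_tab_output: str) -> dict:
    """Estimate how full a kernel table is from `fw tab -t <table>` text output."""
    limit_match = re.search(r"limit (\d+)", fw_tab_output)
    if limit_match is None:
        raise ValueError("no 'limit N' attribute found in the table header")
    limit = int(limit_match.group(1))

    # Each displayed entry is one line wrapped in angle brackets.
    shown = sum(
        1
        for line in fw_tab_output.splitlines()
        if line.strip().startswith("<") and line.strip().endswith(">")
    )

    # Entries that were not printed are summarised as "(N More)".
    more_match = re.search(r"\((\d+) More\)", fw_tab_output)
    more = int(more_match.group(1)) if more_match else 0

    entries = shown + more
    return {
        "limit": limit,
        "entries": entries,
        "percent": round(100 * entries / limit, 1),
    }


# Shortened sample: two real rows from the listing, "More" count adjusted
# for illustration so that the total is still 1199.
sample = """\
-------- IKE_SA_table --------
dynamic, id 367, attributes: keep, sync, kbuf 1, expires 3600, , hashsize 8192, implies 366, limit 1200
<00000000, 701daac2, c194d907, 1f72a622, e6c6630c; 031cf004, 00000000, 50f7818c, 00000004; 29233/86400>
<00000000, 23335609, b9cda941, 00425d38, d69deaba; 03b42806, 00000002, 50f7de56, 00000004; 52987/86400>
...(1197 More)
"""

print(table_usage(sample))
```

Anything close to 100 percent suggests the table is (almost) full, which matches what we saw.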

Doesn’t happen often that you search the Checkpoint KB for a Tracker Message and there is one hit and it addresses exactly your problem, does it?

So now we were left with figuring out what to do next. As we have maybe 40-50 site-to-site VPNs, I could not imagine why we would need 1200 IPSec SAs for those. If you really have that many VPNs terminated on one GW, you can go to the Cluster object in SmartDashboard and raise the simultaneous IPSec limit under Capacity Optimization. The value entered there does not map to the table limit 1:1, but the limit is derived from what you enter, so just play around and see what fits for you.

But for us the issue was somewhere else. Running “vpn tu” and listing all IPSec and IKE SAs showed that another Edge was constantly building up new SAs and was in fact flooding our table. After removing the Edge from the encryption domain and pushing policy to disable VPN with that device, the table entries dropped to around 200.

So we exchanged the faulty Edge, which was long overdue to be replaced by an N model anyhow, and our problem was fixed.
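If the SA list is too long to eyeball, a small tally per peer makes a flooding device stand out. The following is a minimal sketch, not what we actually ran: it assumes you have already extracted one peer IP per SA (how to do that depends on the “vpn tu” output of your version and is not shown), and the function name `noisy_peers`, the threshold and the addresses are all hypothetical.

```python
from collections import Counter


def noisy_peers(sa_peers, share_threshold=0.5):
    """Return (peer, count) pairs for peers owning more than share_threshold of all SAs."""
    counts = Counter(sa_peers)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (peer, n)
        for peer, n in counts.most_common()
        if n / total > share_threshold
    ]


# Hypothetical input: one list element per SA, documentation-range addresses.
peers = ["192.0.2.10"] * 1000 + ["192.0.2.20"] * 4 + ["192.0.2.30"] * 3
print(noisy_peers(peers))
```

A single peer holding most of the SAs, as in this made-up input, is the kind of pattern that pointed us to the faulty Edge.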

I hope this will help someone out there!



Posted by SebastianB in Checkpoint.

2 Responses to Checkpoint Edge VPN – IPSec Tunnel not coming up properly

  1. Cheshire712 says:

    Thank you for your article. You saved the day. It's people like you that make the tubes great :D
