Optimizing cache infrastructure

I love it when engineering teams share their tricks of the trade for other organizations to benefit from. While this might seem counter-intuitive, sharing knowledge makes the entire ecosystem better.

Etsy's engineering team does a great job of publishing their architecture, methodologies and code at https://codeascraft.com.

This particular article on how they optimize their caching infrastructure (https://codeascraft.com/2017/11/30/how-etsy-caches/) is pretty enlightening. I always thought the best method to load balance objects (app hits, cache requests, queues etc.) across hosts was to use mod operations. In this blog post, Etsy's team talks about using consistent hashing instead of modulo hashing.

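At a high level, consistent hashing lets cache nodes fail without drastically impacting the overall performance of the application, because only the keys that lived on the failed node have to move. With modulo hashing, changing the node count remaps most of the keys. It also makes it easy to scale the number of nodes. This method is useful when you have a large number of cache nodes.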
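To make the difference concrete, here is a minimal Python sketch that compares how many keys change owner when one node out of ten disappears. It is not Etsy's implementation. The node names, the key names, the MD5-based hash and the choice of 100 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib
from typing import Callable, Sequence


def stable_hash(value: str) -> int:
    """Hash a string to an integer that is stable across runs (unlike built-in hash())."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_node(key: str, nodes: Sequence[str]) -> str:
    """Modulo hashing: the owner depends directly on the number of nodes."""
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Consistent hash ring with several virtual points per physical node."""

    def __init__(self, nodes: Sequence[str], replicas: int = 100) -> None:
        # Place every node on the ring at `replicas` pseudo-random points.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]


def moved_fraction(
    before: Callable[[str], str], after: Callable[[str], str], keys: Sequence[str]
) -> float:
    """Fraction of keys whose owning node differs between two lookup functions."""
    moved = sum(1 for key in keys if before(key) != after(key))
    return moved / len(keys)


keys = [f"object-{i}" for i in range(10_000)]   # hypothetical cache keys
all_nodes = [f"cache{i}" for i in range(10)]    # hypothetical node names
survivors = all_nodes[:-1]                      # simulate losing cache9

modulo_moved = moved_fraction(
    lambda k: modulo_node(k, all_nodes),
    lambda k: modulo_node(k, survivors),
    keys,
)
ring_before, ring_after = HashRing(all_nodes), HashRing(survivors)
ring_moved = moved_fraction(ring_before.node_for, ring_after.node_for, keys)

print(f"modulo hashing:     {modulo_moved:.0%} of keys moved")
print(f"consistent hashing: {ring_moved:.0%} of keys moved")
```

With modulo hashing, most keys should land on a different node after the failure, while the ring should only move the keys that were on the lost node (roughly one tenth of them in this setup).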

More reference links

  • http://www.tom-e-white.com/2007/11/consistent-hashing.html
  • https://www.toptal.com/big-data/consistent-hashing
  • https://en.wikipedia.org/wiki/Consistent_hashing

 

HOW TO : Configure nginx for WordPress permalinks

Over the last week, I moved this blog from a LAMP (Linux, Apache, MySQL, PHP) stack to a LEMP (Linux, Nginx, MySQL, PHP) stack. I have a blog post in the works with all the gory details, but wanted to quickly document a quirk in the WordPress + Nginx combination that broke permalinks on this site.

Permalinks are user-friendly, permanent static URLs for a blog post. So, for example, this particular blog post's URL is

https://kudithipudi.org/2017/02/24/how-to-configure…press-permalinks/

instead of

https://kudithipudi.org/?p=1762

This works by default in Apache because WordPress puts in the required rewrite rules.

To get it to work in Nginx, you have to add the following to the Nginx site configuration.

Under the / location context, add the following

try_files $uri $uri/ /index.php?$args;

This essentially tells Nginx to try serving the URI as a file, then as a directory, and if both fail, to hand the request to index.php along with the original query string ($args).

HOW TO : Query varnishlogs for requests with 404 responses

varnishlog, one of the tools provided with varnish cache, uses VSL Query Expressions (https://www.varnish-cache.org/docs/trunk/reference/vsl-query.html) to provide some powerful insights into the requests and responses.

Here is how you can use varnishlog to show all client requests that are ending up with a 404 response.

sudo varnishlog -g request -i ReqURL -q "BerespStatus != 200"

Technically, this particular query shows all client requests whose backend response status is anything other than 200. To match only 404s, change the query to "BerespStatus == 404".

Breaking down the commands

-g request : shows all entries related to the request

-i ReqURL : forces varnishlog to only display the request URL

-q "BerespStatus != 200" : query filter to only match non-200 responses. Note that the query has to be enclosed in double quotes.

HOW TO : Restrict access to proxied content in Apache

If you are using the mod_proxy feature in Apache to forward requests for certain content to a backend server, but want to restrict access to that content to clients originating from certain IP addresses, you can use the Location directive in Apache.

The Location directive limits the scope of the enclosed directives by URL. This is very similar to the Directory directive, but the difference is that you can put controls based on the URL rather than the location of the content.

In this example, I am forwarding requests destined for http://kudithipudi.org/testLocation to an internal server at http://127.0.0.1:8080/testLocation. I am going to use the Location directive to restrict access to just requests originating from IP address 10.10.10.10. The Order, Deny and Allow directives used here are Apache 2.2 syntax; on Apache 2.4 the equivalent is Require ip 10.10.10.10.

[code]

<Location /testLocation>
Order Deny,Allow
Deny from all
Allow from 10.10.10.10
</Location>

ProxyPass /testLocation http://127.0.0.1:8080/testLocation
ProxyPassReverse /testLocation http://127.0.0.1:8080/testLocation [/code]

 

HOW TO : Use curl to check the impact of DNS changes

Ran into an interesting scenario at work today. We had to check the impact of a DNS change on a certain hostname. Normally, you would edit your hosts file to reflect the DNS change and do your testing. Here is another way you can do it using cURL. In this particular example, I am checking the SSL certificate details of the hostname.

[code]curl --insecure --trace-ascii debug.txt https://HOSTNAME:PORT --resolve HOSTNAME:PORT:IP_ADDRESS [/code]

That’s a pretty convoluted command :). Let’s try to break it down

[code]--insecure [/code]

: tells cURL to ignore certificate warnings. This is helpful if you are using self-signed certs

[code]--trace-ascii [/code]

: tells cURL to save a full trace of the connection (including the SSL handshake details) to a file, here called debug.txt

[code]--resolve [/code]

: tells cURL to use the options mentioned after it to resolve the hostname, rather than using DNS. The format for resolve is <host:port:address>

NOTE: You need to have version 7.21.3 or higher of cURL to use this option

Here’s a real-world example. Say I want to see how the IP address 72.30.38.140 would react if www.google.com requests are routed to it.

[code]

$ curl --insecure --trace-ascii debug.txt https://www.google.com --resolve www.google.com:443:72.30.38.140
The document has moved <A HREF="http://www.google.com/?s=https">here</A>.<P>
<!-- ir2.fp.sp2.yahoo.com uncompressed/chunked Mon Nov 12 22:44:41 UTC 2012 -->
$ more debug.txt
== Info: Added www.google.com:443:72.30.38.140 to DNS cache
== Info: About to connect() to www.google.com port 443 (#0)
== Info: Trying 72.30.38.140… == Info: connected
== Info: Connected to www.google.com (72.30.38.140) port 443 (#0)
== Info: successfully set certificate verify locations:
== Info: CAfile: none
CApath: /etc/ssl/certs
== Info: SSLv3, TLS handshake, Client hello (1):
=> Send SSL data, 223 bytes (0xdf)
0000: ……P.|v..1..kA…….=J.xr.=ft.3.|…Z…..9.8………5…..
0040: …………….3.2…..E.D…../…A………………………
0080: …….W………www.google.com………..4.2……………….
00c0: ………………………….
== Info: SSLv3, TLS handshake, Server hello (2):
<= Recv SSL data, 42 bytes (0x2a)
0000: …&..P.{.I"L….3x..N…9…./<n….A..5.
== Info: SSLv3, TLS handshake, CERT (11):
<= Recv SSL data, 1272 bytes (0x4f8)
0000: ……….0…0..S……….0…*.H……..0N1.0…U….US1.0…
0040: U….Equifax1-0+..U…$Equifax Secure Certificate Authority0…1
0080: 00401230014Z..150703045000Z0..1)0′..U… 2g8aO5wI1bKJ2ZD588UsLvD
00c0: e3gTbg8DU1.0…U….US1.0…U….California1.0…U….Sunnyvale1
0100: .0…U….Yahoo Inc.1.0…U….www.yahoo.com0.."0…*.H……..
0140: …..0……….5.p./……..O…k.C…9E+.J..H.s….Bm.T.E.-..<
0180: ^…m…r.v<\…&Qq..l………. @'(q.m..ZJ.*kt…!.AWU…….M.
01c0: …n…O….0.._…H….4……>.m..K…….Z…:.Df%.lR.!…(!.
0200: .FV.dQ…f.V….P,.J9.c..dM.s>C=….Y..#…47#2…..cP.{….g.rU
0240: .d…P……………..0…0…U………..0…U………….t5.
0280:……U..0:..U…3010/.-.+.)http://crl.geotrust.com/crls/secure
02c0: ca.crl0..[..U…..R0..N..www.yahoo.com..yahoo.com..us.yahoo.com.
0300: .kr.yahoo.com..uk.yahoo.com..ie.yahoo.com..fr.yahoo.com..in.yaho
0340: o.com..ca.yahoo.com..br.yahoo.com..de.yahoo.com..es.yahoo.com..m
0380: x.yahoo.com..it.yahoo.com..sg.yahoo.com..id.yahoo.com..ph.yahoo.
03c0: com..qc.yahoo.com..tw.yahoo.com..hk.yahoo.com..cn.yahoo.com..au.
0400: yahoo.com..ar.yahoo.com..vn.yahoo.com0…U.#..0…H.h.+….G.# .
0440: O3….0…U.%..0…+………+…….0…*.H……………2..0.
0480: S.’.y….GD.Q…=…K+..q..kv…….<h…….ZLE.h$..M2^.C..IT..
04c0: ".5j….Vc7.4……1.Wu.[.a>+………9..{.a:………
== Info: SSLv3, TLS handshake, Server finished (14):
<= Recv SSL data, 4 bytes (0x4)
0000: ….
== Info: SSLv3, TLS handshake, Client key exchange (16):
=> Send SSL data, 262 bytes (0x106)
0000: …….R…..b.,.&.. s.Ob;.E_.EnSw../D…’…..(aB<<……F..]..
0040: o………~…*..r?.C..%..22…J.bu&.x(j|…….>A5..OF.G…C.$.
0080: .9u9n.z…K…..u…..~:W.{Sii.{2..6……..<…..i…8y$y…..6
00c0: …1.(M…fx….#k..r….47..t.q…..A.?.0. .D…..~…G+.,….~
0100: ..=.#y
== Info: SSLv3, TLS change cipher, Client hello (1):
=> Send SSL data, 1 bytes (0x1)
0000: .
== Info: SSLv3, TLS handshake, Finished (20):
=> Send SSL data, 16 bytes (0x10)
0000: ….!9)…6…+.
== Info: SSLv3, TLS change cipher, Client hello (1):
<= Recv SSL data, 1 bytes (0x1)
0000: .
== Info: SSLv3, TLS handshake, Finished (20):
<= Recv SSL data, 16 bytes (0x10)
0000: …..(qN..l.]…
== Info: SSL connection using AES256-SHA
== Info: Server certificate:
== Info: subject: serialNumber=2g8aO5wI1bKJ2ZD588UsLvDe3gTbg8DU; C=US; ST=California; L=Sunnyvale; O=Yahoo Inc.; CN=www.yahoo.com
== Info: start date: 2010-04-01 23:00:14 GMT
== Info: expire date: 2015-07-03 04:50:00 GMT
== Info: subjectAltName does not match www.google.com
=> Send header, 167 bytes (0xa7)
0000: GET / HTTP/1.1
0010: User-Agent: curl/7.21.6 (x86_64-pc-linux-gnu) libcurl/7.21.6 Ope
0050: nSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
0082: Host: www.google.com
0098: Accept: */*
00a5:
<= Recv header, 32 bytes (0x20)
0000: HTTP/1.1 301 Moved Permanently
<= Recv header, 37 bytes (0x25)
0000: Date: Mon, 12 Nov 2012 22:44:41 GMT
<= Recv header, 42 bytes (0x2a)
0000: Location: http://www.google.com/?s=https
<= Recv header, 23 bytes (0x17)
0000: Vary: Accept-Encoding
<= Recv header, 19 bytes (0x13)
0000: Connection: close
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 40 bytes (0x28)
0000: Content-Type: text/html; charset=utf-8
<= Recv header, 24 bytes (0x18)
0000: Cache-Control: private
<= Recv header, 2 bytes (0x2)
0000:
<= Recv data, 173 bytes (0xad)
0000: 000009d
0009: The document has moved <A HREF="http://www.google.com/?s=https">
0049: here</A>.<P>.<!-- ir2.fp.sp2.yahoo.com uncompressed/chunked Mon
0089: Nov 12 22:44:41 UTC 2012 -->.
00a8: 0
00ab:
== Info: Closing connection #0
== Info: SSLv3, TLS alert, Client hello (1):
=> Send SSL data, 2 bytes (0x2)
0000: ..

[/code]

Interesting (infrastructure) tidbits about Microsoft Azure

I attended a session organized by Aditi regarding Microsoft Azure and Windows 8, called “Go Cloud 8”, today. One of the speakers at the event was Deepak Rao, Microsoft's Director of Cloud Computing. He shared some interesting numbers about the infrastructure running Microsoft Azure:

  • 8 carrier grade data centers around the world. “Carrier” grade because of the sheer size of them.
  • The data center in Chicago houses more than 350,000 servers and is supported by only 30 FTEs (which makes me think about the number of contractors they have there 🙂 )
  • 1 in 4 x86 servers produced were bought by Microsoft. Not sure if it was in 2011 or 2012!!

Deepak also gave a real-world example of how one of their customers used Azure.

BPro Inc provides software to counties and states to help report election results. They run their backend on the Azure platform. During normal periods, they run ~10 compute node instances. But on election day (11/6) this week, BPro spun up 8600 compute nodes in less than 15 minutes at 4:00 PM EST to help support the load created by the demand for election results, and then shut all of them down at around 1:00 AM EST when the demand decreased. Using the “list” pricing of $0.12/hr/compute node, that massive increase in capacity cost them ~$8K!!

That is pretty impressive, and I usually don’t use the word impressive in the same sentence as Microsoft 🙂
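As a rough sanity check on the BPro cost figure, the arithmetic can be sketched in a few lines of Python. The node count, time window and price are the ones quoted above. The assumption that every node was billed for the entire 4:00 PM to 1:00 AM window is mine.

```python
# Figures quoted in the talk.
nodes = 8600
price_per_node_hour = 0.12  # USD, "list" price

# Assumption: every node ran (and was billed) for the whole window
# from 4:00 PM to 1:00 AM, i.e. 9 hours.
hours = 9

node_hours = nodes * hours
cost = node_hours * price_per_node_hour
print(f"{node_hours} node-hours, about ${cost:,.0f}")
```

Under that assumption the total comes out a little above the ~$8K quoted, which suggests that not every node was billed for the full nine hours. Either way, it is the same order of magnitude.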

HOW TO : Download SSL certificate using openssl and import it into a keystore

Following up on my earlier post about using keytool to import and export certificates into a keystore, here is some more information on using openssl to download the certificate from a remote server and then using keytool to import it into the keystore.

keytool needs just the X.509 certificate itself, so we will use sed to strip everything else out of the openssl output.

[code]echo -n | openssl s_client -connect HOST:PORTNUMBER | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /tmp/$SERVERNAME.cert [/code]

Breaking down the command

[code]echo -n[/code]

sends empty input to openssl, so it sees end-of-file on its standard input right away. This lets openssl close the session once the handshake is done instead of waiting for more input

[code]openssl s_client -connect HOST:PORTNUMBER[/code]

asks openssl to act as a client and connect to the HOST on the specified PORTNUMBER

[code]sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' [/code]

asks sed to take the input from openssl and only output the content between BEGIN CERTIFICATE and END CERTIFICATE.
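For anyone more comfortable in Python than sed, the same filtering step can be sketched as below. This is only an illustration of what the sed expression does. The sample text is a made-up placeholder, not real openssl output, and the function name is hypothetical.

```python
import re

# Match everything from a BEGIN CERTIFICATE marker to the next END CERTIFICATE
# marker, across line breaks (the same range the sed expression prints).
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def extract_certificates(text: str) -> list:
    """Return every PEM certificate block found in `text`, in order."""
    return PEM_BLOCK.findall(text)


# Placeholder input standing in for the output of `openssl s_client`.
sample_output = """(handshake details before the certificate)
-----BEGIN CERTIFICATE-----
PLACEHOLDERBASE64DATA
-----END CERTIFICATE-----
(session details after the certificate)
"""

certs = extract_certificates(sample_output)
print(certs[0])
```

If Python is available anyway, the standard library's ssl.get_server_certificate((host, port)) returns the server certificate as a PEM string directly, which may be simpler than piping through sed.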

NOTE: If you get an error like “SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert unexpected message”, it means the SSL negotiation with the server failed. Adding the -no_tls1 option helps work around this error; it tells openssl to disable TLS1 negotiation.