
How data streams work (AKA queue design)

Timothy Downs has a good blog post that explains how queues and data streams work using a layman's example: https://hackernoon.com/introduction-to-redis-streams-133f1c375cd3

Quoting the example here:

We have a very long book which we would like many people to read. Some can read during their lunch hour, some read on Monday nights, others take it home for the weekend. The book is so long that at any point in time, we have hundreds of people reading it.

Readers of our book need to keep track of where they are up to in our book, so they keep track of their location by putting a bookmark in the book. Some readers read very slow, leaving their bookmark close to the beginning. Other readers give up halfway, leaving theirs in the middle and never coming back to it.

To make matters even worse, we are adding pages to this book every day. Nobody can actually finish this book.

Eventually our book fills up with bookmarks, until finally one day it is too heavy to carry and nobody can read it any more.

A very clever person then decided that readers should not be allowed to place bookmarks inside the book, and must instead write down the page they are up to on their diary.

This is the design of Apache Kafka, and it is a very resilient design. Readers are often not responsible citizens and often will not clean up after themselves, and the book may be the log of all the important events that happen in our company.
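The analogy can be sketched in a few lines of Python. This is only an illustration of the idea (the log stores pages, each reader stores its own position), not how Kafka is actually implemented; the Book and Reader classes and their method names are made up for this example.

```python
class Book:
    """Append-only log: pages are only ever added, readers never mark it."""

    def __init__(self):
        self._pages = []

    def append(self, page):
        self._pages.append(page)
        return len(self._pages) - 1  # offset (page index) of the new page

    def read_from(self, offset, max_pages=10):
        # Reading does not change the book in any way.
        return self._pages[offset:offset + max_pages]


class Reader:
    """Each reader keeps its own bookmark (offset) in its own 'diary'."""

    def __init__(self, name):
        self.name = name
        self.offset = 0

    def poll(self, book, max_pages=10):
        pages = book.read_from(self.offset, max_pages)
        self.offset += len(pages)  # advance only this reader's bookmark
        return pages


book = Book()
for n in range(1, 6):
    book.append(f"page {n}")

fast = Reader("fast")
slow = Reader("slow")

fast_pages = fast.poll(book)               # reads all five pages
slow_pages = slow.poll(book, max_pages=2)  # reads two pages and stops

book.append("page 6")                      # the book keeps growing
more_pages = fast.poll(book)               # fast reader resumes at its own offset
```

A reader that gives up leaves nothing behind in the book, because its offset lives only in the Reader object.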

HOW TO : Search which package contains a filename

If you are using a Linux system that uses yum for package management (like Fedora, CentOS or RHEL), you can use the following command to find out which package contains a file. This is useful when you need to figure out which package to install. For example, dig (a DNS utility) doesn’t come pre-installed on the system, and running “sudo yum install dig” doesn’t install it because no package is named dig.

sudo yum whatprovides '*/dig'

This returns

Loaded plugins: fastestmirror
 Loading mirror speeds from cached hostfile
 32:bind-utils-9.8.2-0.47.rc1.el6.x86_64 : Utilities for querying DNS name servers
 Repo : base
 Matched from:
 Filename : /usr/bin/dig

Breaking down the command options

whatprovides : finds out which package provides a given feature or file. Pass either a specific name or a file glob (wildcard) pattern, and yum lists the available or installed packages that provide it. Here the pattern '*/dig' matches a file named dig in any directory.
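If you want to feed the result into a script, the package name can be pulled out of that output. The Python sketch below assumes the output looks like the sample above (a package line followed by a "Repo" line) and that the package line follows the usual epoch:name-version-release.arch layout; the function name packages_from_whatprovides is made up for this example, and other yum versions may format their output differently.

```python
def packages_from_whatprovides(output):
    """Return package names found in 'yum whatprovides' style output.

    Assumption: every package line is immediately followed by a
    line that starts with 'Repo', as in the sample output.
    """
    lines = [line.strip() for line in output.splitlines()]
    names = []
    for prev, cur in zip(lines, lines[1:]):
        if cur.startswith("Repo") and " : " in prev:
            nevra = prev.split(" : ", 1)[0]   # drop the description
            nevra = nevra.split(":", 1)[-1]   # drop the epoch ("32:") if present
            name = nevra.rsplit("-", 2)[0]    # drop version and release.arch
            if name not in names:
                names.append(name)
    return names


sample_output = """\
Loaded plugins: fastestmirror
 Loading mirror speeds from cached hostfile
 32:bind-utils-9.8.2-0.47.rc1.el6.x86_64 : Utilities for querying DNS name servers
 Repo : base
 Matched from:
 Filename : /usr/bin/dig
"""

packages = packages_from_whatprovides(sample_output)
print(packages)
```

For the sample above this yields bind-utils, which is the name to pass to "sudo yum install".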

HOW TO : Count lines in the Windows command line

Say you are using netstat to check all established network connections on a Windows machine (confirmed to work on Windows 7+ and Windows Server 2008+) and want to find out how many connections you have. You can use

netstat -an | find "ESTABLISHED" | find /v /c ""

Breaking down the command string

netstat -an : uses the netstat command to display all connections and listening ports (-a) and displays addresses and ports in numerical form instead of resolving DNS or using common names (-n)

| : pipes (passes) the output of one command to the next one

find "ESTABLISHED" : uses the find command to keep only the lines that contain the string "ESTABLISHED"

find /v /c "" : counts the lines. /v selects lines that do not contain the search string, and find treats the empty string "" as matching no line, so every line is selected; /c then prints the number of selected lines instead of the lines themselves

If you wanted to do something similar in Linux, you can use

netstat -an | grep "ESTABLISHED" | wc -l
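Both pipelines do the same two things: keep the lines that contain a string, then count them. As a rough cross-platform illustration, here is the same logic in Python. The function name count_matching_lines and the sample connection table are invented for this example; in practice the text would come from the real netstat output.

```python
def count_matching_lines(text, needle):
    """Count the lines of text that contain needle (like find /c or grep -c)."""
    return sum(1 for line in text.splitlines() if needle in line)


# Hypothetical netstat-style output, made up for illustration.
sample = """\
  TCP    0.0.0.0:135        0.0.0.0:0          LISTENING
  TCP    10.0.0.5:49712     93.184.216.34:443  ESTABLISHED
  TCP    10.0.0.5:49713     93.184.216.34:80   TIME_WAIT
  TCP    10.0.0.5:49720     10.0.0.9:22        ESTABLISHED
"""

established = count_matching_lines(sample, "ESTABLISHED")
print(established)
```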

HOW TO : Use grep to search for content at end of line

If you want to filter on a pattern at the end of a line, for example to show only the lines that do not end in 0, you can use

tail -f logfile | grep -v "0$"

Breaking down the commands

tail -f : standard tail command. Keeps printing new lines to the console as the file grows, until you interrupt it

grep -v : the -v option inverts the match, so grep shows only the lines that do not match the pattern

0$ : this regex matches a 0 at the end of the line; $ anchors the match to the end of the line. Drop -v to show only the lines that do end in 0.
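To see what the pattern keeps and drops, here is a small Python sketch of the same inverted match. The log lines are invented for this example, and the function name lines_not_ending_in_zero is just illustrative.

```python
import re

ENDS_WITH_ZERO = re.compile(r"0$")  # same pattern as in the grep command


def lines_not_ending_in_zero(lines):
    """Keep only lines that do NOT end in 0 (the equivalent of grep -v "0$")."""
    return [line for line in lines if not ENDS_WITH_ZERO.search(line)]


# Hypothetical log lines, made up for illustration.
sample = [
    "job-1 exit 0",
    "job-2 exit 1",
    "job-3 exit 10",
    "job-4 exit 2",
]

kept = lines_not_ending_in_zero(sample)
print(kept)
```

Note that "job-3 exit 10" is dropped as well: the pattern only looks at the last character, so any line ending in the digit 0 matches, not just lines whose last field is exactly 0.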

HOW TO : Query varnishlog for requests with 404 responses

varnishlog, one of the tools provided with Varnish Cache, uses VSL Query Expressions (https://www.varnish-cache.org/docs/trunk/reference/vsl-query.html) to provide powerful insight into requests and responses.

Here is how you can use varnishlog to show all client requests that end up with a 404 response.

sudo varnishlog -g request -i ReqURL -q "BerespStatus != 200"

Technically, this particular query shows all client requests whose backend response status is anything other than 200. To match only 404 responses, change the query to "BerespStatus == 404".

Breaking down the commands

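-g request : groups the log records by request, so all entries related to a request are shown together

-i ReqURL : makes varnishlog display only the ReqURL records (the requested URL)

-q "BerespStatus != 200" : query filter that matches only non-200 responses. Note that the query has to be enclosed in quotes so that the shell passes it to varnishlog as a single argument.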
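The quoting matters because the whole query must reach varnishlog as one argument. If you launch varnishlog from a script instead of a shell, you can pass the query as its own list element and skip the quoting altogether. The Python sketch below only builds and prints the command; the helper name varnishlog_command is made up, and the flags are the ones used above.

```python
import shlex


def varnishlog_command(query):
    """Build the argument list for the varnishlog call shown above.

    The query is a single list element, so no shell quoting is needed
    when the list is handed to something like subprocess.run().
    """
    return ["varnishlog", "-g", "request", "-i", "ReqURL", "-q", query]


cmd = varnishlog_command("BerespStatus == 404")

# shlex.join() shows the shell form, with the quotes added back for us.
printable = shlex.join(cmd)
print(printable)
```

Running the command still needs a host with Varnish installed and usually sudo, as in the original example.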