Performance engineering interview questions

Published on 2016-06-09 by John Collins.

Introduction

Following on from my previous articles on interview screening questions for a Java Engineer or a Security Engineer, in the last part of this series I will present a list of screening questions I use for interviewing a Performance Engineer candidate. Like the security role, the performance role has to include a full-stack view of the application, so broad knowledge is required apart from just programming. The programming-related questions here focus on Java and the JVM, but most of the questions in this article are also applicable to non-Java applications.

Interview questions

Q1: What is the difference between performance and scalability?

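Response flags: Often mixed up; a performance engineer should know better. Performance is how quickly the system handles a given unit of work, scalability is how well it keeps doing so as load grows or resources are added.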

Q2: What are the two "directions" in which to scale?

Response flags: Horizontal vs. vertical, pros and cons of each.

Q3: What is a VPS?

Response flags: Should explain what a Virtual Private Server is, chance to display knowledge of virtualization platforms (Xen, VMware etc.).

Q4: When would you use a VPS versus a physical server?

Response flags: Should explain some of the shortcomings of virtual servers.

Q5: What is a load balancer? What main types are there?

Response flags: Should know this, types are software and hardware.

Q6: In a load balancing strategy, what is the difference between "round robin" and "sticky sessions"?

Response flags: Client is sent to the same server each time (sticky), versus a different server each time (round robin). Bonus points for an example of how a central session store facilitates round robin, and why that can be faster.

Q7: What must a load balancer maintain in order to track session storage?

Response flags: Table mapping client IP to upstream server.
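
To make Q6 and Q7 concrete, here is a minimal Java sketch of both strategies. The class name UpstreamSelector, its methods and the server names are hypothetical, invented for illustration; a real load balancer would also handle health checks, table expiry and clients that share an IP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class for illustration; not taken from any real load balancer.
class UpstreamSelector {
    private final List<String> upstreams;
    private final AtomicInteger next = new AtomicInteger();
    // Sticky-session table: client IP -> upstream server it was first sent to.
    private final Map<String, String> stickyTable = new ConcurrentHashMap<>();

    UpstreamSelector(List<String> upstreams) {
        this.upstreams = new ArrayList<>(upstreams);
    }

    // Round robin: each call returns the next upstream in turn.
    String roundRobin() {
        int i = Math.floorMod(next.getAndIncrement(), upstreams.size());
        return upstreams.get(i);
    }

    // Sticky: the first request from an IP picks an upstream, later requests reuse it.
    String sticky(String clientIp) {
        return stickyTable.computeIfAbsent(clientIp, ip -> roundRobin());
    }

    public static void main(String[] args) {
        UpstreamSelector lb = new UpstreamSelector(Arrays.asList("app1", "app2", "app3"));
        System.out.println("round robin: " + lb.roundRobin() + ", " + lb.roundRobin());
        System.out.println("sticky: " + lb.sticky("10.0.0.1") + ", " + lb.sticky("10.0.0.1"));
    }
}
```

The sticky path pays for a table lookup on every request and ties each client to one server, which is the trade-off a good candidate should be able to talk through.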

Q8: What is a disadvantage of maintaining sticky sessions?

Response flags: IP table lookup is an overhead. Client is logged out if the single upstream server that holds their session goes offline.

Q9: What is the difference between a load balancer and a reverse proxy?

Response flags: Perf eng should be familiar with both, and give an explanation of reverse proxy usage.

Q10: In a web architecture, where would you typically decrypt SSL traffic?

Response flags: In load balancer, ideally with dedicated h/w due to overhead. Traffic inside datacenter can be plain.

Q11: During testing, if you needed to intercept traffic between an HTTP client and a server, what tools could you use? What if the traffic was encrypted?

Response flags: Wireshark, Fiddler, many other proxies. For SSL you would need a copy of the server's private key for the certificate.

Q12: When a thread running on a CPU needs to access some data, describe some of the physical locations where that data might be stored, ordered from the closest to the furthest away.

Response flags: CPU L1/2/3 cache, RAM, local disc (SSD, then RAID), LAN (data cache, then DB, also potentially file server), internet host (file server, remote API...).

Q13: Why does physical proximity to data matter?

Response flags: It loads faster!

Q14: In a web application, what is a CDN? What are the benefits?

Response flags: Network proximity, bonus points for mentioning higher probability of warm browser cache for widely-used CDNs like Google.

Q15: What are the two main types of data caches?

Response flags: Private vs. shared, explains the merits of both.

Q16: What is the classic problem involved with using caches?

Response flags: Cache invalidation, discuss approaches to this.

Q17: What is TTL of a cache entry? When should you use this?

Response flags: Time To Live, and it depends! Give examples.
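
As an illustration of what a TTL does, the following is a minimal sketch of a cache whose entries expire a fixed time after they are written. TtlCache is a hypothetical class, and passing the current time in as a parameter is my own simplification to keep the behaviour easy to follow; production code would more likely use an existing caching library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical minimal cache for illustration only.
class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Stores the value; it expires ttlMillis after nowMillis.
    void put(K key, V value, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis + ttlMillis));
    }

    // Returns the cached value, or null if it is absent or its TTL has elapsed.
    V get(K key, long nowMillis) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            entries.remove(key, e); // lazily evict the expired entry
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        TtlCache<String, String> cache = new TtlCache<>(1000);
        long start = System.currentTimeMillis();
        cache.put("price", "9.99", start);
        System.out.println("fresh: " + cache.get("price", start + 500));
        System.out.println("after TTL: " + cache.get("price", start + 1500));
    }
}
```

A short TTL suits data that may be slightly stale (e.g. a product listing), while data that must always be current needs explicit invalidation instead.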

Q18: Between a web browser and a web server responding with some data, how many layers of caching might exist?

Response flags: Browser cache, proxy cache (e.g. Squid), file cache (e.g. precompiled page template), shared data cache (e.g. Memcached, Redis), database query cache...

Q19: Rather than requesting the same full response each time, how would an HTTP client check whether the response to a previous request has since been updated on the HTTP server?

Response flags: Discuss HTTP caching headers like ETag/If-None-Match (an opaque validator, often a hash of the content such as MD5), or Last-Modified/If-Modified-Since (HTTP date timestamps), mention bodyless 304 responses (not modified).
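
The server side of an ETag check can be sketched in a few lines of Java. ConditionalGet and its methods are hypothetical names, and deriving the ETag from an MD5 digest of the body is an assumption for this sketch; HTTP treats the ETag as an opaque value, so any stable version identifier would do.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; no real HTTP server is involved.
class ConditionalGet {
    // Builds a quoted ETag from an MD5 hex digest of the body (an assumed scheme).
    static String etagFor(String body) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder("\"");
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is a required JDK algorithm
        }
    }

    // Returns 304 if the client's If-None-Match value matches the current ETag, else 200.
    static int statusFor(String ifNoneMatch, String currentBody) {
        return etagFor(currentBody).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        String cachedTag = etagFor("hello");
        System.out.println("unchanged: " + statusFor(cachedTag, "hello"));
        System.out.println("changed:   " + statusFor(cachedTag, "hello world"));
    }
}
```

On a 304 the server sends headers only, so the client reuses its cached body and saves the transfer.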

Q20: In relational databases, what are common "performance killer" queries?

Response flags: Too many joins, nested/sub-queries, querying on columns with no indexes, lack of partitions.

Q21: What can you do to improve query performance?

Response flags: See previous question, plus look at prepared statements and potentially stored procedures.

Q22: In performance testing, what is the difference between load, stress, and soak testing?

Response flags: Load - testing to a specified expected amount. Stress - burst traffic beyond expected amount. Soak - apply load testing for a prolonged period of time.

Q23: When conducting a Java code review, what are the common performance anti-patterns that you look out for?

Response flags: Should expect a broad list here.
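
One example I would expect on such a list (my own pick for illustration, there are many others) is building a String with += inside a loop. The class and method names below are hypothetical.

```java
// Hypothetical example class for illustration.
class StringJoinExample {
    // Anti-pattern: each += creates a new String and copies everything built so far.
    static String joinSlow(String[] parts) {
        String out = "";
        for (String p : parts) {
            out += p;
        }
        return out;
    }

    // Preferred: StringBuilder appends into one growing buffer.
    static String joinFast(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinSlow(parts) + " " + joinFast(parts));
    }
}
```

Both methods return the same result; the difference only shows up as the number of parts grows, which is why it tends to slip through functional testing.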

Q24: What is the Java heap?

Response flags: The heap stores all of the objects created by your Java program, should mention GC.

Q25: When debugging a Java app, how would you identify major and minor garbage collection?

Response flags: In the log - if garbage collection logging is enabled, a minor collection prints "GC" and a full collection prints "Full GC".
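
Besides reading the log, collection counts can also be read from inside the JVM through the standard java.lang.management API. This is a different technique from the log-based answer above, shown as a minimal sketch; GcCounters is a hypothetical class name, and the collector names printed depend on the JVM and the collector in use (on HotSpot with G1, for example, "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration.
class GcCounters {
    // One line per collector: its name, how often it has run and the total time spent.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

With most collectors the young-generation bean corresponds to minor collections and the old-generation bean to major ones, but that mapping is an assumption to verify for the collector you actually run.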

Q26: What is Perm Gen space in the Java memory heap?

Response flags: Used to store class metadata.

Q27: In Java 8 the Perm Gen space was replaced with what?

Response flags: Metaspace.

Q28: Explain the -Xmx and -Xms parameters of the JVM.

Response flags: JVM argument -Xmx defines the maximum heap size. The arg -Xms defines the initial heap size.
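
A quick way to see the effect of these two arguments is to ask the running JVM for its heap figures. The sketch below uses only the standard Runtime API; HeapSettings is a hypothetical class name and the sizes in the usage line are arbitrary.

```java
// Hypothetical helper for illustration.
class HeapSettings {
    // Upper bound the heap may grow to; roughly the -Xmx value.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM; at startup roughly the -Xms value.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap:       " + maxHeapBytes() / mb + " MB");
        System.out.println("committed heap: " + committedHeapBytes() / mb + " MB");
    }
}
```

Running it as `java -Xms256m -Xmx512m HeapSettings` should print figures close to those two sizes, though the exact numbers vary with the JVM and collector.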

Q29: When does an Object become eligible for garbage collection in Java?

Response flags: No more active references to the object, or not reachable by any live thread.
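
Reachability can be observed with a WeakReference, which does not keep its target alive. This is a minimal sketch with a hypothetical class name; note that System.gc() is only a hint to the JVM, so the second line printed by main is not guaranteed to report a collection.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class for illustration.
class GcEligibility {
    // While a strong reference exists, the object must survive a collection.
    static boolean survivesWhileStronglyReferenced() {
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);
        System.gc(); // a hint only
        return ref.get() == payload; // payload is still strongly reachable here
    }

    public static void main(String[] args) {
        System.out.println("survives while referenced: " + survivesWhileStronglyReferenced());

        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);
        payload = null; // drop the last strong reference: now eligible for collection
        System.gc();    // a hint only; the JVM may or may not collect right away
        System.out.println(ref.get() == null ? "collected" : "not collected yet");
    }
}
```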

Q30: If a production web app is not responding, how can you figure out what is wrong?

Response flags: Broad question to allow the candidate to display their investigation process: what logs to review (load balancers, app servers, DB slow query), what monitoring do we have (disc I/O, memory, paging to disc, CPU load), is there a spike in traffic due to an advertising campaign, or a DoS attack etc.?