秋来冬风的博客

Optimizing an HTTP/2 Web Server | Modifiable Source Code in the Go Standard Library and Runtime


This is the third post in my High-Throughput HTTP/2 Server Optimization Experience Sharing series.

Background

When optimizing my HTTP/2 server written in Go, I tried modifying Go’s source code to optimize the standard library and runtime. This article shares some parts of the source code that I found could be modified.

The code referenced here is from the Go 1.26 development branch.

net/http Package

Unless otherwise stated, the code to be modified is in h2_bundle.go.

Increase Buffer Size

Increase the constant http2bufWriterPoolBufferSize, which defaults to 4KB.

Adjust DATA Frame Size

Adjust the constant http2handlerChunkWriteSize, which defaults to 4KB.

Reduce Time String Allocation

The http2responseWriterState.writeChunk method formats the current time as an RFC 1123 string, and every such call allocates a new string on the heap. You can instead cache the formatted string and refresh it once per second.

Avoid Heap Allocation of Response Code Strings

The http2httpCodeString function returns constant strings only for response codes 200 and 404. Every other response code is converted to a new string on each call, which allocates on the heap. This can be avoided.
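A simple way to avoid those allocations is a table of preformatted strings built once at startup. The sketch below assumes this approach; codeStrings and codeString are hypothetical names, not identifiers from h2_bundle.go.

```go
package main

import (
	"fmt"
	"strconv"
)

// codeStrings holds a preformatted string for every three-digit code.
// Entries below 100 stay empty and are never read.
var codeStrings [1000]string

func init() {
	for code := 100; code < 1000; code++ {
		codeStrings[code] = strconv.Itoa(code)
	}
}

// codeString returns the decimal form of code. Three-digit codes come
// from the table; anything else falls back to strconv.Itoa.
func codeString(code int) string {
	if code >= 100 && code < 1000 {
		return codeStrings[code]
	}
	return strconv.Itoa(code)
}

func main() {
	fmt.Println(codeString(200), codeString(503))
}
```

The table costs a few kilobytes of memory in exchange for an allocation-free lookup on the hot path.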

Adjust Content-Type Sniffing Rules

The code to be modified is in sniff.go.

The sniffSignatures variable defines the rules used for Content-Type sniffing. You can trim or reorder these rules to suit the content your server actually returns.


runtime Package

Increase Maximum Stack Cache Size per P

In the malloc.go file, the constant _StackCacheSize caps how much stack memory each P may cache for each stack size class (e.g., 2KB, 4KB). The default is 32KB.

Adjust GC Target CPU Utilization Rate

In the mgcpacer.go file, the constant gcGoalUtilization defines the target CPU utilization rate of GC, with a default value of 0.25 (25%).

Note that setting this value too low can backfire: actual GC CPU usage may end up far above the target. For example, with a target of 1%, actual usage may exceed 10%.

Assist marking works as follows: during the concurrent marking phase, if GC progress falls behind the rate of memory allocation, user goroutines perform part of the marking work themselves whenever they allocate memory.

In the mgcpacer.go file, the constant gcCreditSlack is the threshold for locally accumulated scan work credit. Once local credit exceeds this threshold, it is flushed to the global gcController.heapScanWork and bgScanCredit.

  • A smaller value makes the assist ratio more accurate and makes it more likely that assists successfully “steal” background credit.

  • A larger value reduces memory contention, because the global counters are updated less often.

gcAssistTimeSlack is the threshold for “assist marking time” accumulated on each P (processor). Once the accumulated time exceeds this threshold, it is flushed to the global gcController.assistTime. In other words, it controls how often the assist time statistics are updated.

Adjust Minimum Heap Size

In the mgcpacer.go file, the constant defaultHeapMinimum defines the minimum heap size at which GC is triggered when GOGC=100. The default is 4MB, which keeps small heaps from triggering GC too frequently.

Adjust Minimum Stack Size

In the stack.go file, the constant stackMin controls the initial stack size when creating a goroutine, and it is also the minimum stack size.
