[Loki] 인덱스 빌더 크기 추정 최적화: 반복 계산 제거로 97% 성능 개선

2026년 1월 28일수정: 2026년 1월 28일

PR 링크: grafana/loki#20595 상태: Merged | 변경: +100 / -35

들어가며

Loki의 인덱스 빌더는 로그 라인이 추가될 때마다 현재 객체의 크기를 추정합니다. 문제는 이 추정이 매번 모든 테넌트의 빌더를 순회하며 합산하는 방식이었다는 점입니다. 테넌트가 많을수록 Append 한 번에 드는 비용이 선형으로 증가하여, 프로파일링 결과 대부분의 시간을 크기 계산에 소비하고 있었습니다. 이 PR은 증분 방식으로 변경하여 벤치마크 기준 97.57%의 실행 시간 감소를 달성했습니다.

핵심 코드 분석

기존: 모든 테넌트를 매번 순회

Before:

func (b *Builder) estimatedSize() int {
    var size int
    for _, tenantStreams := range b.streams {
        size += tenantStreams.EstimatedSize()
    }
    for _, tenantPointers := range b.pointers {
        size += tenantPointers.EstimatedSize()
    }
    for _, tenantIndexPointers := range b.indexPointers {
        size += tenantIndexPointers.EstimatedSize()
    }
    size += b.builder.Bytes()
    b.metrics.sizeEstimate.Set(float64(size))
    return size
}

변경: 증분 추적 필드 도입

After:

type Builder struct {
    // ...기존 필드...
    unflushedSizeEstimate int // 증분 추적 필드 추가
}

func (b *Builder) estimatedSize() int {
    var size int
    size += b.unflushedSizeEstimate
    size += b.builder.Bytes()
    b.metrics.sizeEstimate.Set(float64(size))
    return size
}

각 Append 메서드에서 변경 전후의 크기 차이만 누적합니다:

func (b *Builder) ObserveLogLine(...) error {
    tenantPointers := b.getPointersBuilderForTenant(tenantID)
    preAppendSizeEstimate := tenantPointers.EstimatedSize()

    tenantPointers.ObserveStream(path, section, streamIDInObject, streamIDInIndex, ts, uncompressedSize)

    postAppendSizeEstimate := tenantPointers.EstimatedSize()
    b.unflushedSizeEstimate += postAppendSizeEstimate - preAppendSizeEstimate
    // ...
}

왜 이게 좋은가

시간 복잡도 개선: O(테넌트 수)에서 O(1)로 크기 추정 비용이 감소합니다.
벤치마크 결과: 실행 시간 9233us에서 224us로, 메모리 할당 77KB에서 2KB로, allocs 40에서 1로 각각 97% 이상 감소했습니다.
정확성 유지: 각 Append에서 해당 테넌트의 크기 변화만 추적하므로 전체 합산 결과와 동일합니다.
리팩터링 보너스: 테넌트 빌더 조회 로직을 getStreamsBuilderForTenant, getPointersBuilderForTenant 등의 헬퍼 메서드로 추출하여 코드 중복을 제거했습니다.

참고 자료

PR Analysis 의 다른글

이전글 [pydantic-ai] DBOSAgent에서 병렬 도구 실행 지원 및 실행 모드 API 추가
현재글 : [Loki] 인덱스 빌더 크기 추정 최적화: 반복 계산 제거로 97% 성능 개선
다음글 [triton] NVIDIA TMA im2col 모드 드라이버 지원