#WebGPU

4 posts

[onnxruntime] WebGPU FlashAttention Optimization: Maximizing Performance with Kernel Fusion and Variable Sequence Length Support

An analysis of the performance gains from kernel fusion and variable sequence length support in WebGPU FlashAttention

#WebGPU #FlashAttention #ONNX Runtime #Optimization #Performance Improvement #AI Acceleration

June 11, 2026

[onnxruntime] WebGPU Performance Optimization: Introducing a Session-level Buffer Pool for Graph Capture Reuse

An analysis of a session-level buffer pooling technique for reducing the buffer reallocation overhead incurred during graph capture in the ONNX Runtime WebGPU EP

#WebGPU #ONNXRuntime #Performance #GraphCapture #GenAI

June 10, 2026

[onnxruntime] FlashAttention Optimization for Apple M4 Max: Analyzing a 20x Performance Improvement

An analysis of the technical approach used to tune WebGPU-based FlashAttention for Apple silicon, achieving up to a 20x performance improvement.

#ONNXRuntime #WebGPU #FlashAttention #AppleSilicon #PerformanceOptimization

May 14, 2026

[Paper Review] Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

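This paper aims to address the limitations of existing 3D Gaussian Splatting (3DGS) viewers: high deployment friction caused by fragmentation, heavyweight design, and legacy pipeline constraints, along with insufficient support for dynamic content and generative models.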

#Review #Neural Rendering #3D Gaussian Splatting #WebGPU #ONNX Inference #World Models #Real-time Rendering #Browser-based #Dynamic Scenes

December 9, 2025