본문으로 건너뛰기

#Vision-Language Model

31개의 포스트

[논문리뷰] MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

댓글 수 로딩 중

[논문리뷰] CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition

댓글 수 로딩 중

[논문리뷰] MedVista3D: Vision-Language Modeling for Reducing Diagnostic Errors in 3D CT Disease Detection, Understanding and Reporting

댓글 수 로딩 중