{"id":252,"date":"2026-04-17T09:02:41","date_gmt":"2026-04-17T01:02:41","guid":{"rendered":"https:\/\/chaoy.top\/publish\/252\/"},"modified":"2026-04-17T09:02:41","modified_gmt":"2026-04-17T01:02:41","slug":"%e7%94%a8-ollama-8gb-%e6%98%be%e5%8d%a1%e8%b7%91-32b-%e6%a8%a1%e5%9e%8b%ef%bc%9a%e5%ae%9e%e6%88%98%e9%87%8f%e5%8c%96%e6%8c%87%e5%8d%97","status":"publish","type":"post","link":"https:\/\/chaoy.top\/publish\/252\/","title":{"rendered":"Running a 32B Model on an 8GB GPU with Ollama: A Hands-On Quantization Guide"},"content":{"rendered":"<h3>\ud83d\udcf0 News Brief<\/h3>\n<p>Yesterday SitePoint published an updated article, \u201cOptimizing Local LLMs for Low-End Hardware: 8GB GPU Guide\u201d, showing that a home machine with an 8GB GPU can run a 32B model smoothly via Ollama + GGUF quantization. In its tests, an RTX 3060 8G running DeepSeek 32B in the Q4_K_M format reached 16\u201320 t\/s while using 6.9 GB of VRAM.<\/p>\n<p><strong>Source<\/strong>: SitePoint 2026-04-16 | <strong>Usefulness<\/strong>: \u2605\u2605\u2605\u2605\u2605 | <strong>Difficulty<\/strong>: Intermediate<\/p>\n<h3>\ud83e\udd90 Commentary<\/h3>\n<p><strong>1. What It Actually Solves<\/strong><br \/>\nMany people assume an 8GB GPU can only run 7B\/13B models, or that you have to fall back to the CPU. GGUF quantization compresses the 32B weights down to roughly 6 GB of VRAM, putting a \u201cflagship model on a home PC\u201d: on the same hardware it outperforms 13B, at zero extra cost.<\/p>\n<p><strong>2. 
Set Up the Environment in Three Minutes<\/strong><br \/>\n\u2460 Install Ollama:<br \/>\n<code>curl -fsSL https:\/\/ollama.ai\/install.sh | sh<\/code><br \/>\n\u2461 Pull the quantized 32B build:<br \/>\n<code>ollama pull deepseek-coder:32b-q4_K_M<\/code><br \/>\n\u2462 Start it:<br \/>\n<code>ollama run deepseek-coder:32b-q4_K_M<\/code><br \/>\nKey settings (to prevent OOM):<br \/>\n<code>OLLAMA_GPU_OVERHEAD=512 OLLAMA_NUM_PARALLEL=1<\/code>; measured VRAM stays at 6.9 GB.<\/p>\n<p><strong>3. Cost Showdown<\/strong><\/p>\n<table>\n<thead>\n<tr><th>Option<\/th><th>VRAM<\/th><th>Quantized size<\/th><th>Speed (t\/s)<\/th><th>Cost<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>Native 32B FP16<\/td><td>64 GB<\/td><td>64 GB<\/td><td>0<\/td><td>Unaffordable<\/td><\/tr>\n<tr><td>Native 32B INT8<\/td><td>32 GB<\/td><td>32 GB<\/td><td>0<\/td><td>Unaffordable<\/td><\/tr>\n<tr><td>GGUF Q4_K_M<\/td><td>6\u20137 GB<\/td><td>18 GB<\/td><td>16\u201320<\/td><td>$0<\/td><\/tr>\n<tr><td>Cloud A100 80G<\/td><td>\u2014<\/td><td>\u2014<\/td><td>60<\/td><td>$2.40\/hour<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>Bottom line: GGUF quantization = zero cost + 80% of the quality, by far the best value.<\/p>\n<p><strong>4. 
Who Should Use This?<\/strong><br \/>\n&#8211; Anyone who wants flagship-grade code completion without buying a new GPU.<br \/>\n&#8211; Users who are latency-tolerant but want long-context capability (32k tokens).<br \/>\nNot recommended for: streaming voice or high-concurrency API serving.<\/p>\n<blockquote>\n<p>\ud83d\udca1 What do you think? Leave a comment below \ud83d\udc47<\/p>\n<\/blockquote>\n<p style=\"font-size:0.85em; color:#aaa;\">\ud83d\udcce Reference: <a href=\"https:\/\/news.google.com\/rss\/articles\/CBMieEFVX3lxTE9xUzN4dHhDOFpINFVjMnFUakJ1dE5ZMXVwdzZfeDZSemZ6Q1FvOEFpNk43b0otUFRTYXRzd2daWmFxcV9xU2dZNGtJOS0zUGZTXy1oaHRJRm1jc0NrZFhjSmxwMVNWSHM5YlNLRjduWlYzUF9faUdEbA\" target=\"_blank\" rel=\"noopener noreferrer\">SitePoint original article<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\ud83d\udcf0 News Brief Yesterday SitePoint published an updated article, \u201cOptimizing Local LLMs for Low-End Hard 
\u00b7\u00b7\u00b7<\/p>","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[278,6,279],"tags":[474,472,470,19,473,471],"class_list":["post-252","post","type-post","status-publish","format-standard","hentry","category-ai","category-6","category-279","tag-32b","tag-8g","tag-llama-cpp","tag-ollama","tag-473","tag-471"],"_links":{"self":[{"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/posts\/252","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/comments?post=252"}],"version-history":[{"count":0,"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/posts\/252\/revisions"}],"wp:attachment":[{"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/media?parent=252"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/categories?post=252"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/chaoy.top\/publish\/wp-json\/wp\/v2\/tags?post=252"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}