llamacpp 跑100K起步能达到16t/s GTX1060G 6G
文章摘要:该批处理脚本用于配置和启动一个名为Qwen3.6-35B-A3B-Uncensored的AI模型服务器,基于llama.cpp框架。脚本包含以下内容:1. 设置终端颜色变量;2. 定义模型文件、服务器程序和目录路径;3. 进行必要的文件检查;4. 显示模型信息和配置参数(包括CUDA 12.8、999层GPU加速、100K上下文长度等);5. 启动服务器命令,包含详细的运行参数(端口80
@echo off
chcp 437 > nul
title Qwen3.6-35B-A3B-Uncensored
:: ========== Colors ==========
set "ESC=["
set "RED=%ESC%91m"
set "GREEN=%ESC%92m"
set "YELLOW=%ESC%93m"
set "BLUE=%ESC%94m"
set "PURPLE=%ESC%95m"
set "CYAN=%ESC%96m"
set "WHITE=%ESC%97m"
set "RESET=%ESC%0m"
:: ========== Paths ==========
set SERVER=C:\Users\AI\Llama Server\llama.cpp\bin\llama-server.exe
set MODEL=C:\Users\AI\Llama Server\llama.cpp\models\Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf
set MMPROJ=C:\Users\AI\Llama Server\llama.cpp\models\mmproj-Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-f16.gguf
set SLOTS=C:\Users\AI\Llama Server\llama.cpp\data\slots
:: ========== File Checks ==========
if not exist "%MODEL%" (echo %RED%[X] Model file missing%RESET% & pause & exit /b 1)
if not exist "%MMPROJ%" (echo %RED%[X] mmproj file missing%RESET% & pause & exit /b 1)
if not exist "%SERVER%" (echo %RED%[X] Server binary missing%RESET% & pause & exit /b 1)
if not exist "%SLOTS%" mkdir "%SLOTS%" 2>nul
cls
echo.
echo %PURPLE% +---------------------------------------------------+%RESET%
echo %PURPLE% ^|%RESET% %GREEN%Qwen3.6-35B-A3B-Uncensored%RESET% %PURPLE%^|%RESET%
echo %PURPLE% ^|%RESET% %CYAN%HauhauCS Aggressive ^| MoE ^| Q4_K_M%RESET% %PURPLE%^|%RESET%
echo %PURPLE% +---------------------------------------------------+%RESET%
echo.
echo %CYAN% CUDA 12.8 self-compiled -- GPU: 999L -- CPU MoE: 999 -- Context: 100K -- np: 2%RESET%
echo %CYAN% KV unified -- reasoning: off -- rope: 2.5 -- flash-attn%RESET%
echo %YELLOW% Port: 8080 -- Host: 0.0.0.0 -- Host: 127.0.0.1 %RESET%
echo.
echo %WHITE% Model: Q4_K_M (~20 GB) + mmproj F16 (~860 MB)%RESET%
echo %WHITE% Path: C:\Users\AI\Llama Server\llama.cpp\models\%RESET%
echo.
"%SERVER%" ^
-m "%MODEL%" ^
--mmproj "%MMPROJ%" ^
--host 0.0.0.0 --port 8080 ^
--n-gpu-layers 999 ^
--n-cpu-moe 999 ^
-t 8 ^
--ctx-size 200000 ^
-np 2 ^
--batch-size 512 ^
--ubatch-size 256 ^
--cache-type-k q4_0 ^
--cache-type-v q4_0 ^
--cache-ram 5120 ^
--rope-scale 2.5 ^
--reasoning on ^
--no-mmap ^
--slot-save-path "%SLOTS%" ^
--no-warmup --prio 2 ^
--temp 0.80 --top-k 100 --top-p 0.82 --min-p 0.12 --repeat-penalty 1.00 ^
--alias "Qwen3.6-35B-Uncensored" ^
--timeout 300 ^
--ui --metrics ^
--flash-attn auto
pause

opencode真的可用完整一套流畅,但是在openclaw里就会出现问题!
openEuler 是由开放原子开源基金会孵化的全场景开源操作系统项目,面向数字基础设施四大核心场景(服务器、云计算、边缘计算、嵌入式),全面支持 ARM、x86、RISC-V、loongArch、PowerPC、SW-64 等多样性计算架构
更多推荐

所有评论(0)