Qwen3-VL 本地视觉演示¶
这个 notebook 演示怎样调用本地 Ollama 中的 qwen3.5:0.8b 模型,让它读取一张图片并给出回答。
第 1 步:准备图片路径¶
默认示例图放在 examples/example.png。你也可以换成自己的图片。
In [ ]:
from pathlib import Path
import base64
import json
import mimetypes
import urllib.request
from IPython.display import Image, display
import argparse
import os
import sys
import urllib.error
image_path = Path('examples/example.png')
image_path.exists()
In [ ]:
display(Image(filename=str(image_path)))
第 2 步:把图片转成 base64¶
In [ ]:
mime_type, _ = mimetypes.guess_type(str(image_path))
if mime_type is None:
mime_type = 'image/png'
image_base64 = base64.b64encode(image_path.read_bytes()).decode('utf-8')
image_base64[:80]
第 3 步:构造请求体¶
这里的关键点是:图片和文字 prompt 一起进入 messages。
In [ ]:
payload = {
'model': 'qwen3.5:0.8b',
'stream': False,
'messages': [
{
'role': 'user',
'content': '请描述这张图片的主要内容,并指出你最有把握的三个视觉细节。',
'images': [image_base64],
}
],
}
payload
第 4 步:请求本地 Ollama¶
In [ ]:
request = urllib.request.Request(
'http://localhost:11434/api/chat',
data=json.dumps(payload).encode('utf-8'),
headers={'Content-Type': 'application/json'},
method='POST',
)
# Force direct local access. Some student environments export socks/http proxies,
# which can break requests to localhost.
for key in [
"http_proxy",
"https_proxy",
"HTTP_PROXY",
"HTTPS_PROXY",
"all_proxy",
"ALL_PROXY",
]:
os.environ.pop(key, None)
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
try:
with opener.open(request) as response:
result = json.loads(response.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
print(exc.read().decode("utf-8", errors="ignore"), file=sys.stderr)
except urllib.error.URLError as exc:
print(f"Failed to reach Ollama: {exc}", file=sys.stderr)
message = result.get("message", {})
print(message.get("content", "").strip())
第 5 步:查看模型回答¶
In [ ]:
result['message']['content']
第 6 步:改一个更有趣的 prompt¶
你可以尝试让模型解释图片中的幽默、情绪或场景。
In [ ]:
payload['messages'][0]['content'] = '请解释这张图的幽默点,并说明你的判断依据。'
request = urllib.request.Request(
'http://localhost:11434/api/chat',
data=json.dumps(payload).encode('utf-8'),
headers={'Content-Type': 'application/json'},
method='POST',
)
for key in [
'http_proxy',
'https_proxy',
'HTTP_PROXY',
'HTTPS_PROXY',
'all_proxy',
'ALL_PROXY',
]:
os.environ.pop(key, None)
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request) as response:
result = json.loads(response.read().decode('utf-8'))
result['message']['content']