
Several ways to extract Chinese text from images in Python

Published 2024-04-13

In Python, extracting Chinese text from an image generally relies on optical character recognition (OCR).

1. Call a cloud OCR service through an HTTP request library. The two mainstream services are:

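Baidu OCR API: Baidu offers an OCR API service that recognizes text in images, including Chinese. Register a Baidu developer account, obtain an API key, then use a Python HTTP request library to send the image and receive the recognition result.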
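The request flow described above can be sketched with the standard library alone. This is a minimal sketch, not Baidu's official client: the two endpoint URLs, the `image` form field, and the `words_result` / `error_code` reply keys are assumptions based on Baidu's general text recognition interface and should be checked against the current Baidu documentation. All function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed endpoints; confirm them in Baidu's current OCR documentation.
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"


def baidu_post(url, fields):
    """POST form-encoded fields and decode the JSON reply."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=data, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def baidu_access_token(api_key, secret_key):
    """Exchange the API key pair for an access token (parameters go in the URL)."""
    query = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return baidu_post(f"{TOKEN_URL}?{query}", {})["access_token"]


def baidu_image_fields(image_bytes):
    """The image is sent as a base64 string in a form field named 'image'."""
    return {"image": base64.b64encode(image_bytes).decode("ascii")}


def baidu_extract_lines(reply):
    """Return the recognized lines, or raise if the reply carries an error."""
    if "error_code" in reply:
        raise RuntimeError(f"Baidu OCR failed: {reply.get('error_msg')}")
    return [item["words"] for item in reply.get("words_result", [])]


def baidu_recognize(image_path, api_key, secret_key):
    """Full round trip: token, upload, parse."""
    token = baidu_access_token(api_key, secret_key)
    with open(image_path, "rb") as image_file:
        fields = baidu_image_fields(image_file.read())
    query = urllib.parse.urlencode({"access_token": token})
    return baidu_extract_lines(baidu_post(f"{OCR_URL}?{query}", fields))
```

A call would look like `baidu_recognize('your_image.png', 'your_api_key', 'your_secret_key')`, with both keys taken from your own Baidu application.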

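Microsoft Azure OCR service: Azure also provides an OCR service that can extract Chinese text. As with the Baidu API, register an Azure account, create an OCR resource, then use a Python HTTP request library to send the request and read the result.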
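For Azure, a request might look like the sketch below. It assumes the asynchronous Read operation of the Computer Vision REST API at version v3.2: the image is posted, the service answers with an `Operation-Location` header, and that URL is polled until the status is `succeeded`. The API version, path, and JSON field names are assumptions to verify against the Azure documentation for your resource. The function names and the polling limits are hypothetical.

```python
import json
import time
import urllib.request

# Assumed path and API version; check the documentation for your resource.
READ_PATH = "/vision/v3.2/read/analyze"


def azure_submit(endpoint, key, image_bytes):
    """Upload the raw image bytes and return the URL to poll for the result."""
    request = urllib.request.Request(
        endpoint.rstrip("/") + READ_PATH,
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return resp.headers["Operation-Location"]


def azure_fetch(operation_url, key):
    """Fetch the current state of the Read operation as a dict."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def azure_extract_lines(result):
    """Flatten every recognized line of every page into a list of strings."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


def azure_recognize(image_path, endpoint, key, poll_seconds=1.0, max_polls=30):
    """Submit the image, then poll until the service reports a final status."""
    with open(image_path, "rb") as image_file:
        operation_url = azure_submit(endpoint, key, image_file.read())
    for _ in range(max_polls):
        result = azure_fetch(operation_url, key)
        status = result.get("status")
        if status == "succeeded":
            return azure_extract_lines(result)
        if status == "failed":
            raise RuntimeError("Azure OCR reported a failure")
        time.sleep(poll_seconds)
    raise TimeoutError("Azure OCR did not finish in time")
```

Here `endpoint` is the base URL of your own Azure resource and `key` is one of its subscription keys.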

2. Use a third-party library. Four recommended libraries follow, each with sample code:

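Tesseract OCR (pytesseract):

pip install pytesseract

pytesseract is only a wrapper, so the Tesseract engine and its chi_sim (Simplified Chinese) language data must also be installed on the system.

from PIL import Image
import pytesseract

# Open the image
image = Image.open('your_image.png')

# Extract text with Tesseract, using the Simplified Chinese model
text = pytesseract.image_to_string(image, lang='chi_sim')

# Print the extracted Chinese text
print(text)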
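Text returned by Tesseract for Chinese can contain stray spaces between adjacent characters. A small post-processing sketch is shown here. The helper name `remove_cjk_spaces` is hypothetical, and the character range is an assumption that covers only the basic CJK Unified Ideographs block, so punctuation and rarer characters are left untouched.

```python
import re

# A run of spaces or tabs that sits directly between two CJK ideographs.
_CJK_GAP = re.compile(r"(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])")


def remove_cjk_spaces(text):
    """Drop spaces between Chinese characters; keep line breaks and Latin spacing."""
    return _CJK_GAP.sub("", text)


print(remove_cjk_spaces("你 好 世 界 hello world"))  # prints: 你好世界 hello world
```

Applied to the sample above, this would be `print(remove_cjk_spaces(text))` in place of `print(text)`.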

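EasyOCR:

pip install easyocr

import easyocr

# Create an EasyOCR Reader for Simplified Chinese (model files are downloaded on first use)
reader = easyocr.Reader(['ch_sim'])

# Path to the image
image = 'your_image.png'

# Extract text with EasyOCR; each result is (bounding box, text, confidence)
results = reader.readtext(image)

# Print the extracted Chinese text
for (bbox, text, prob) in results:
    print(text)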
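Since each EasyOCR result carries a confidence score, low-confidence fragments can be filtered out before use. The sketch below works on the `(bbox, text, prob)` tuples returned by `readtext`. The helper name `confident_text` and the 0.5 threshold are illustrative assumptions, not values recommended by EasyOCR.

```python
def confident_text(results, min_prob=0.5):
    """Keep the text of (bbox, text, prob) tuples whose confidence is at least min_prob."""
    return [text for _bbox, text, prob in results if prob >= min_prob]


# Hand-written stand-in for the list that reader.readtext(image) returns.
sample_results = [
    ([[0, 0], [40, 0], [40, 20], [0, 20]], "你好", 0.93),
    ([[0, 30], [40, 30], [40, 50], [0, 50]], "丗界", 0.21),
]
print(confident_text(sample_results))  # prints: ['你好']
```

If only the text is needed and no filtering is required, `reader.readtext(image, detail=0)` returns the strings directly.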

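PyOCR:

pip install pyocr

import pyocr
import pyocr.builders
from PIL import Image

# Get the available OCR tools and use the first one (typically Tesseract, which must be installed separately)
tools = pyocr.get_available_tools()
if not tools:
    raise RuntimeError('No OCR tool found')
tool = tools[0]

# Open the image
image = Image.open('your_image.png')

# Extract text with PyOCR
text = tool.image_to_string(
    image,
    lang='chi_sim',
    builder=pyocr.builders.TextBuilder()
)

# Print the extracted Chinese text
print(text)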
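Taking the first available tool fails at recognition time if that tool has no Chinese language data. A hedged alternative is to select the tool by language support, using the `get_available_languages()` function that PyOCR tools expose. The helper name `pick_tool` is hypothetical.

```python
def pick_tool(tools, lang="chi_sim"):
    """Return the first OCR tool that reports support for the requested language."""
    for tool in tools:
        if lang in tool.get_available_languages():
            return tool
    raise RuntimeError(f"No installed OCR tool supports {lang!r}")


# Intended use with PyOCR (requires pyocr and an OCR engine to be installed):
#   tool = pick_tool(pyocr.get_available_tools())
```

The error is then raised up front, with a message naming the missing language, instead of surfacing later from the OCR engine.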

Google Cloud Vision API:

pip install google-cloud-vision

from google.cloud import vision
from google.oauth2 import service_account

# Set up the credentials
credentials = service_account.Credentials.from_service_account_file(
    'your-service-account-key.json'
)

# Create the Vision API client
client = vision.ImageAnnotatorClient(credentials=credentials)

# Read the image bytes
with open('your_image.png', 'rb') as image_file:
    content = image_file.read()

# Create the image object
image = vision.Image(content=content)

# Run text detection with the Vision API
response = client.text_detection(image=image)
if response.error.message:
    raise RuntimeError(response.error.message)

# Print the extracted Chinese text. The first annotation holds the full text;
# the remaining annotations repeat it word by word.
if response.text_annotations:
    print(response.text_annotations[0].description)

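Note that for the Google Cloud Vision API you must replace 'your-service-account-key.json' with the path to your own service account key file. Before running any of these samples, make sure the corresponding libraries and services are correctly installed and configured.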
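In a Vision `text_detection` response, the first entry of `text_annotations` is the whole detected text, and the later entries are individual words with their bounding polygons. The sketch below collects those word-level entries. The helper name `words_with_boxes` is hypothetical, and the function only reads attributes (`error.message`, `description`, `bounding_poly.vertices`) of the response object passed to it.

```python
def words_with_boxes(response):
    """Return (word, [(x, y), ...]) pairs, skipping the leading full-text annotation."""
    if response.error.message:
        raise RuntimeError(response.error.message)
    pairs = []
    for index, annotation in enumerate(response.text_annotations):
        if index == 0:
            continue  # entry 0 is the complete text, not a single word
        corners = [(vertex.x, vertex.y) for vertex in annotation.bounding_poly.vertices]
        pairs.append((annotation.description, corners))
    return pairs
```

With the sample above this would be called as `words_with_boxes(response)`, for example to draw boxes around each recognized word.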


This article was posted by a user on 金钥匙 on 2024-04-13.
Article link: https://www.51969.com/post/17823110.html
