Preface:

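I recently needed to process docx files and show their content on a web page so that Word documents can be read online. I used Flask together with the Document class from python-docx to parse the docx file and render it. The code follows.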

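Processing with Document:

First install the library that provides Document. Try the latest version of python-docx first; if that does not work, switch to version 1.1.0: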

pip install python-docx

pip install python-docx==1.1.0
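To see which version actually ended up installed before deciding whether to fall back to 1.1.0, a small check like the following may help. It is only a sketch using the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("python-docx")
if found is None:
    print("python-docx is not installed")
else:
    print("python-docx", found)
```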

The code that processes the docx file is as follows:

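import base64
import re

from docx import Document

# vaReportDir (the root directory that holds the reports) is defined elsewhere in the project.


def ReadVADocx(ProjectName, DocxName):
    # Build the path <vaReportDir>\<ProjectName>\<DocxName> and parse that file.
    docxfilepath = vaReportDir + "\\" + ProjectName + "\\" + DocxName
    paragraphs = ReadDocx(docxfilepath)
    return paragraphs


def ReadDocx(docxfilepath):
    doc = Document(docxfilepath)
    paragraphs = list()
    # Relationship IDs such as "rId7" link a run to an embedded part, for example an image.
    pattern = re.compile(r'rId\d+')
    for graph in doc.paragraphs:
        # "Heading 1" -> "1", "Title" -> "Title"; body text styles get no level.
        level = graph.style.name.split(' ')[-1]
        if level == "Normal":
            level = None
        elif level == "Preformatted":
            level = None
        paragraph = {
            'text': graph.text,
            'level': level,
            'images': ""
        }
        paragraphs.append(paragraph)
        for run in graph.runs:
            # A run without text may hold an embedded object; look for its relationship ID.
            if run.text == '':
                contentID = pattern.search(run.element.xml)
                if contentID:
                    contentID = contentID.group(0)
                    try:
                        contentType = doc.part.related_parts[contentID].content_type
                    except KeyError as e:
                        print(e)
                        continue
                    # Skip related parts that are not images.
                    if not contentType.startswith('image'):
                        continue
                    imgData = doc.part.related_parts[contentID].blob
                    # Base64-encode the image bytes so they can be embedded in the page.
                    image_base64 = base64.b64encode(imgData).decode('utf-8')
                    paragraph = {
                        'text': run.text,
                        'level': run.style.name.split(' ')[-1] if run.style.name.startswith('Heading') else None,
                        'images': image_base64
                    }
                    paragraphs.append(paragraph)
    # Return the collected entries so that ReadVADocx can pass them on.
    return paragraphs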

The code above walks through the docx file and appends each piece of content, together with its level, to a list.
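The level rule used in ReadDocx is easy to try out on its own. The sketch below isolates it in a hypothetical helper, `style_to_level` (not part of the original code), so the mapping from style names to levels can be checked without opening a docx file.

```python
def style_to_level(style_name):
    """Map a Word style name to the 'level' value stored in each paragraph dict."""
    # Same rule as in ReadDocx: keep only the last word of the style name.
    level = style_name.split(' ')[-1]
    if level in ("Normal", "Preformatted"):
        return None
    return level


for name in ["Heading 1", "Heading 2", "Title", "Normal", "List Paragraph"]:
    print(name, "->", style_to_level(name))
```

Note that a style such as "List Paragraph" yields "Paragraph", which is neither None nor a heading number. If documents use such styles, it may be worth accepting only "Title" and numeric levels.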

Below is the calling code:

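from flask import render_template, request

# app, engine (the module containing ReadVADocx) and user are defined elsewhere in the application.


@app.route('/ViewVADocx', methods=['GET'])
def ViewVADocx():
    try:
        # Query string: ?name=<project name>&docx=<docx file name>
        DocxName = request.args.get('docx')
        ProjectName = request.args.get('name')
        paragraphs = engine.ReadVADocx(ProjectName, DocxName)
        return render_template("viewdocx.html", n_getname=ProjectName, n_user=user, paragraphs=paragraphs)
    except Exception as e:
        # Any failure (missing file, bad parameters, parse error) ends up on the generic error page.
        print(e)
        return render_template('error-500.html')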
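Because the route builds a file path directly from query parameters, it may be worth validating them before opening the file. The following is only a sketch under assumptions: the helper name `resolve_report_path` and the rule of accepting only `.docx` names are illustrative and not part of the original code.

```python
import os


def resolve_report_path(base_dir, project_name, docx_name):
    """Join base_dir/project_name/docx_name and reject paths that leave base_dir."""
    if not project_name or not docx_name:
        raise ValueError("both the project name and the docx name are required")
    if not docx_name.lower().endswith(".docx"):
        raise ValueError("only .docx files are supported")
    base = os.path.abspath(base_dir)
    candidate = os.path.abspath(os.path.join(base, project_name, docx_name))
    # commonpath equals base only when candidate is located inside base.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the report directory")
    return candidate


print(resolve_report_path("reports", "demo", "scan.docx"))
```

A request such as `?name=..&docx=secret.docx` would then raise ValueError, which the existing `except` branch in the route turns into the error page.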

Writing the HTML:

Next, the content has to be displayed on the page. The HTML code is listed below:

{% extends "mould.html" %}
{% block head %}
{% endblock %}
{% block body %}
↑ Back to top ↑
{{ n_getname }}: scan node line
Quick navigation:
{% for paragraph in paragraphs %}
    {% if paragraph.level == "1" %}
        {{ paragraph.text }}
    {% elif paragraph.level == "2" %}
        {{ paragraph.text }}
    {% endif %}
{% endfor %}
{% for paragraph in paragraphs %}
    {% if paragraph.level %}
        {% if paragraph.level == "Title" %}
        {% elif paragraph.level == "1" %}
            {{ paragraph.text }}
        {% else %}
            {{ paragraph.text }}
        {% endif %}
    {% else %}
        {% if paragraph.images %}
            {# <img> tag whose src is a data URI built from paragraph.images #}
        {% else %}
            {{ paragraph.text }}
        {% endif %}
    {% endif %}
{% endfor %}
{% endblock %}
{% block list %}
{% endblock %}
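The branching in the template can be easier to follow, and to unit test, when mirrored in plain Python. The sketch below assumes the paragraph dicts produced by ReadDocx. The helper names (`nav_entries`, `render_kind`, `image_data_uri`) are hypothetical, and the `image/png` default is an assumption, since the image markup of the template is not shown above.

```python
def nav_entries(paragraphs):
    """Collect level-1 and level-2 headings for the quick-navigation list."""
    return [(p['level'], p['text']) for p in paragraphs if p['level'] in ("1", "2")]


def render_kind(paragraph):
    """Name the branch a paragraph takes in the second loop of the template."""
    level = paragraph['level']
    if level:
        if level == "Title":
            return "title"
        return "heading-1" if level == "1" else "heading-other"
    return "image" if paragraph['images'] else "text"


def image_data_uri(image_base64, mime_type="image/png"):
    """Build a data URI usable as an <img> src (the PNG default is an assumption)."""
    return "data:" + mime_type + ";base64," + image_base64


sample = [
    {'text': 'Report', 'level': 'Title', 'images': ''},
    {'text': 'Overview', 'level': '1', 'images': ''},
    {'text': 'Details', 'level': '2', 'images': ''},
    {'text': 'Body text', 'level': None, 'images': ''},
    {'text': '', 'level': None, 'images': 'aGVsbG8='},
]
print(nav_entries(sample))
print([render_kind(p) for p in sample])
```

Keeping this logic in Python rather than in Jinja is a design choice; the template would then only loop over ready-made entries.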

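The template also adds styling and small conveniences such as a back-to-top link to make browsing easier. The final result looks like this: [screenshot of the rendered page]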

 

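Afterword:

This code only displays the content of a docx file, both text and images, and organizes it by heading level. It does not support editing the docx file; if you are interested, you can explore that yourself.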

 

 
