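Preface:
I recently needed to process docx files and show their content on a web page, so that Word documents can be read online. I used flask + Document (from python-docx) to parse the docx file and render it. Here is the code:
Processing with Document:
First install the library that provides Document. Install the latest version of python-docx first; if that does not work, switch to version 1.1.0: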
pip install python-docx
pip install python-docx==1.1.0
The code that processes the docx file is as follows:
import base64
import os
import re

from docx import Document

# vaReportDir (the root folder that holds the reports) is defined elsewhere in the project.


def ReadVADocx(ProjectName, DocxName):
    # Build <vaReportDir>/<ProjectName>/<DocxName> and parse that file
    docxfilepath = os.path.join(vaReportDir, ProjectName, DocxName)
    paragraphs = ReadDocx(docxfilepath)
    return paragraphs


def ReadDocx(docxfilepath):
    doc = Document(docxfilepath)
    paragraphs = list()
    # Relationship ids such as rId7 link a run to an embedded part (for example an image)
    pattern = re.compile(r'rId\d+')
    for graph in doc.paragraphs:
        # "Heading 1" -> "1", "Title" -> "Title"; body-text styles get no level
        level = graph.style.name.split(' ')[-1]
        if level == "Normal":
            level = None
        elif level == "Preformatted":
            level = None
        paragraph = {
            'text': graph.text,
            'level': level,
            'images': ""
        }
        paragraphs.append(paragraph)
        for run in graph.runs:
            # A run without text may hold an inline picture
            if run.text == '':
                contentID = pattern.search(run.element.xml)
                if contentID:
                    contentID = contentID.group(0)
                    try:
                        contentType = doc.part.related_parts[contentID].content_type
                    except KeyError as e:
                        print(e)
                        continue
                    if not contentType.startswith('image'):
                        continue
                    imgData = doc.part.related_parts[contentID].blob
                    image_base64 = base64.b64encode(imgData).decode('utf-8')
                    paragraph = {
                        'text': run.text,
                        'level': run.style.name.split(' ')[-1] if run.style.name.startswith('Heading') else None,
                        'images': image_base64
                    }
                    paragraphs.append(paragraph)
    # Hand the collected entries back to the caller
    return paragraphs
The code above walks through the docx file and appends each piece of content, together with its heading level, to a list.
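To make the level rule easier to see in isolation, here is a minimal sketch of the same logic as a standalone function. The name `level_from_style_name` is hypothetical and does not appear in the original code; the rule itself (take the last word of the style name, drop "Normal" and "Preformatted") is copied from the loop above.

```python
def level_from_style_name(style_name):
    """Return the last word of a style name, or None for body-text styles.

    Hypothetical helper: it mirrors the rule used inside ReadDocx.
    """
    level = style_name.split(' ')[-1]
    if level in ("Normal", "Preformatted"):
        return None
    return level


# Print the level for a few common Word style names
for name in ("Title", "Heading 1", "Heading 2", "Normal", "HTML Preformatted"):
    print(name, "->", level_from_style_name(name))
```

Note that any other style name still produces a level under this rule. For example, "List Paragraph" yields "Paragraph", so documents that use such styles may need extra cases.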
The calling code is as follows:
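from flask import render_template, request

# app, engine and user are defined elsewhere in the application.


@app.route('/ViewVADocx', methods=['GET'])
def ViewVADocx():
    try:
        # Both values come from the query string, e.g. /ViewVADocx?name=...&docx=...
        DocxName = request.args.get('docx')
        ProjectName = request.args.get('name')
        paragraphs = engine.ReadVADocx(ProjectName, DocxName)
        return render_template("viewdocx.html", n_getname=ProjectName, n_user=user, paragraphs=paragraphs)
    except Exception as e:
        # Any failure (missing file, bad parameters) falls back to the error page
        return render_template('error-500.html')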
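Because the route passes `name` and `docx` straight from the query string into a file path, it may be worth checking that the resulting path stays inside the report folder. The following is only a sketch under assumptions: `safe_report_path` and its `base_dir` parameter are hypothetical names, not part of the original project, and the `.docx` suffix check is an extra assumption.

```python
import os


def safe_report_path(base_dir, project_name, docx_name):
    """Return the joined path if it stays inside base_dir, otherwise None.

    Hypothetical helper; base_dir plays the role of vaReportDir.
    """
    base = os.path.abspath(base_dir)
    candidate = os.path.abspath(os.path.join(base, project_name, docx_name))
    try:
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives
        inside = False
    if not inside:
        return None
    # Assumption: only .docx files should ever be opened by this route
    if not candidate.lower().endswith(".docx"):
        return None
    return candidate
```

A caller could treat a `None` result the same way as any other failure and render the error page.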
Writing the HTML:
Next, the content needs to be displayed on the page. The HTML code is listed below:
{% extends "mould.html" %}
{% block head %}
{% endblock %}
{% block body %}
{{ n_getname }}: Scan node line
Quick navigation:
{% for paragraph in paragraphs %}
    {% if paragraph.level == "1" %}
    {% elif paragraph.level == "2" %}
    {% endif %}
{% endfor %}
{% for paragraph in paragraphs %}
    {% if paragraph.level %}
        {% if paragraph.level == "Title" %}
        {% elif paragraph.level == "1" %}
        {% else %}
        {% endif %}
    {% else %}
        {% if paragraph.images %}
        {% else %}
            {{ paragraph.text }}
        {% endif %}
    {% endif %}
{% endfor %}
{% endblock %}
{% block list %}
.hover-link {
    font-size: 20px;
}
.hover-link:hover {
    color: red;
    font-size: 30px;
}
.hover-link2 {
    font-size: 15px;
}
.hover-link2:hover {
    color: red;
    font-size: 20px;
}
/* CSS styles that define the appearance of the floating box */
.floating-box {
    position: fixed;
    bottom: 20px;
    right: 20px;
    width: 80px;
    height: 50px;
    background-color: #ff9900;
    color: #fff;
    text-align: center;
    line-height: 50px;
    cursor: pointer;
}
// JavaScript code
var floatingBox = document.getElementById('floatingBox');
// Click event listener: scroll smoothly back to the top of the page
floatingBox.addEventListener('click', function() {
    window.scrollTo({ top: 0, behavior: 'smooth' });
});
{% endblock %}
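The template receives each picture as a base64 string in `paragraph.images`. One common way to show such a string in an `<img>` tag is a data URI. The sketch below illustrates that format only and is not necessarily what the original markup does. The helper `to_data_uri` and its `content_type` parameter are hypothetical: the paragraph dictionaries built by ReadDocx do not store the content type, so a default of `image/png` is assumed here.

```python
import base64


def to_data_uri(image_base64, content_type="image/png"):
    """Build a data URI that can be used as the src of an <img> tag.

    Hypothetical helper; content_type is an assumption because the
    paragraph dictionaries only carry the base64 string.
    """
    return f"data:{content_type};base64,{image_base64}"


# Encode the 8-byte PNG signature just to have a small sample string
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("utf-8")
print(to_data_uri(sample))
```

If the real content type matters, ReadDocx already has it in `contentType` and could store it next to `images` in each dictionary.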
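The template also adds styles and small conveniences such as a back-to-top button to make browsing easier. The final result looks like this: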
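Afterword:
The code only displays the content of a docx file, text and images included, and groups it by heading level. It does not support editing the docx file; if you are interested, you can explore that yourself.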