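Contents

Comparison with GPT prompting
Setup
As a baseline, we'll transcribe an NPR podcast segment
Transcripts follow the style of the prompt
Pass names in the prompt to prevent misspellings
GPT can generate fictitious prompts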

OpenAI's audio transcription API has an optional parameter called prompt.

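The prompt is intended to help stitch together multiple audio segments. By submitting the transcript of the previous segment as the prompt, the Whisper model can use that context to better understand the speech and maintain a consistent writing style.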
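The stitching idea can be sketched as a small loop in which each segment is prompted with the transcript of the segment before it. This is a minimal sketch, not the API's own mechanism: `transcribe_in_sequence` and `transcribe_fn` are hypothetical names, and `transcribe_fn` is assumed to be any callable that takes `(audio_filepath, prompt)` and returns text, like the `transcribe` wrapper defined later in this notebook.

```python
from typing import Callable, List


def transcribe_in_sequence(
    segment_paths: List[str],
    transcribe_fn: Callable[[str, str], str],
) -> List[str]:
    """Transcribe segments in order, prompting each with the previous transcript."""
    transcripts: List[str] = []
    previous_text = ""  # the first segment has no preceding context
    for path in segment_paths:
        # transcribe_fn is a hypothetical stand-in for a real transcription call
        text = transcribe_fn(path, previous_text)
        transcripts.append(text)
        previous_text = text  # becomes the prompt for the next segment
    return transcripts
```

Passing the transcription function in as an argument keeps the loop easy to try out with a stub before spending API calls.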

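However, the prompt does not need to be a genuine transcript from a previous audio segment. A fictitious prompt can be submitted to steer the model toward particular spellings or styles.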

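This notebook shares two techniques for using fictitious prompts to steer the model's output:

Transcript generation: GPT can convert instructions into fictitious transcripts for Whisper to emulate.
Spelling guide: a spelling guide can tell the model how to spell the names of people, products, companies, and so on.

These techniques are not especially reliable, but they can be useful in some situations.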

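Comparison with GPT prompting

Prompting Whisper is not the same as prompting GPT. For example, if you submit an attempted instruction such as "Format lists in Markdown format", the model will not comply, because it follows the style of the prompt rather than any instructions contained in it.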

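In addition, the prompt is limited to 224 tokens. If the prompt is longer than 224 tokens, only the final 224 tokens are considered; all earlier tokens are silently ignored. The tokenizer used is the multilingual Whisper tokenizer.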
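The truncation rule can be illustrated on a plain list of token ids. This is a minimal sketch under stated assumptions: `tokens_seen_by_whisper` is a hypothetical helper, and the token ids are assumed to have been produced elsewhere by the multilingual Whisper tokenizer, which is not loaded here.

```python
from typing import List, Sequence

WHISPER_PROMPT_TOKEN_LIMIT = 224  # limit stated in the text above


def tokens_seen_by_whisper(
    token_ids: Sequence[int],
    limit: int = WHISPER_PROMPT_TOKEN_LIMIT,
) -> List[int]:
    """Return the trailing tokens that remain once the limit is applied."""
    if limit <= 0:
        return []  # guard: token_ids[-0:] would otherwise return everything
    # Only the final `limit` tokens survive; earlier ones are dropped
    return list(token_ids[-limit:])
```

Because it is the start of a long prompt that gets dropped, the most important examples are best placed at the end.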

To get good results, craft examples that portray the style you want.

Setup

To get started, do the following:

Import the OpenAI Python library (if you don't have it, install it with pip install openai)
Download a few example audio files

# Import the required libraries
import os  # for creating the local data directory
import urllib.request  # for downloading example audio files

import openai  # for making OpenAI API calls

# Remote paths of the example audio files
up_first_remote_filepath = "https://cdn.openai.com/API/examples/data/upfirstpodcastchunkthree.wav"
bbq_plans_remote_filepath = "https://cdn.openai.com/API/examples/data/bbq_plans.wav"
product_names_remote_filepath = "https://cdn.openai.com/API/examples/data/product_names.wav"

# Local paths where the files will be saved
up_first_filepath = "data/upfirstpodcastchunkthree.wav"
bbq_plans_filepath = "data/bbq_plans.wav"
product_names_filepath = "data/product_names.wav"

# Create the local directory if needed, then download the example audio files
os.makedirs("data", exist_ok=True)
urllib.request.urlretrieve(up_first_remote_filepath, up_first_filepath)
urllib.request.urlretrieve(bbq_plans_remote_filepath, bbq_plans_filepath)
urllib.request.urlretrieve(product_names_remote_filepath, product_names_filepath)

('data/product_names.wav', )

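As a baseline, we'll transcribe an NPR podcast segment

The audio file for this example is a segment of the NPR podcast Up First.

Let's get our baseline transcription first, then introduce prompts.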

# Define a wrapper function for seeing how prompts affect transcriptions
def transcribe(audio_filepath, prompt: str) -> str:
    """Given a prompt, transcribe the audio file."""
    # Open the audio file in binary mode and send it to the transcription API
    with open(audio_filepath, "rb") as audio_file:
        transcript = openai.audio.transcriptions.create(
            file=audio_file,
            model="whisper-1",
            prompt=prompt,
        )
    # Return the text of the transcription
    return transcript.text

# Baseline transcription: pass an empty prompt
transcribe(up_first_filepath, prompt="")

"I stick contacts in my eyes. Do you really? Yeah. That works okay? You don't have to, like, just kind of pain in the butt every day to do that? No, it is. It is. And I sometimes just kind of miss the eye. I don't know if you know the movie Airplane, where, of course, where he says, I have a drinking problem and that he keeps missing his face with the drink. That's me and the contact lens. Surely, you must know that I know the movie Airplane. I do. I do know that. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts. But he is meeting today with House Speaker Kevin McCarthy. Other leaders of Congress will also attend. So how much progress can they make? I'm E. Martinez with Steve Inskeep, and this is Up First from NPR News. Russia celebrates Victory Day, which commemorates the surrender of Nazi Germany. Soldiers marched across Red Square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war Russia is fighting right now?"

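Transcripts follow the style of the prompt

In the unprompted transcript, "President Biden" is capitalized. If we pass in the fictitious prompt "president biden" in lowercase, Whisper is expected to match the style and generate an all-lowercase transcript, though as the output below shows, a prompt this short does not always have that effect.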
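To judge whether a transcript actually picked up a lowercase style, one option is to measure the share of letters that are uppercase rather than reading the whole output by eye. The helper below is a minimal sketch; `uppercase_ratio` is a hypothetical name and not part of any library.

```python
def uppercase_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are uppercase."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0  # no letters at all, so nothing is uppercase
    return sum(ch.isupper() for ch in letters) / len(letters)
```

A ratio of exactly 0.0 means the transcript is fully lowercase. Any value above that means at least some capitalization survived.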

# Transcribe the same audio file, this time with the short lowercase
# prompt "president biden"

transcribe(up_first_filepath, prompt="president biden")

"I stick contacts in my eyes. Do you really? Yeah. That works okay? You don't have to, like, just kind of pain in the butt every day to do that? No, it is. It is. And I sometimes just kind of miss the eye. I don't know if you know the movie Airplane? Yes. Of course. Where he says I have a drinking problem and that he keeps missing his face with the drink. That's me and the contact lens. Surely, you must know that I know the movie Airplane. I do. I do know that. Don't call me Shirley. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts. But he is meeting today with House Speaker Kevin McCarthy. Other leaders of Congress will also attend. So how much progress can they make? I'm E. Martinez with Steve Inskeep and this is Up First from NPR News. Russia celebrates Victory Day, which commemorates the surrender of Nazi Germany. Soldiers marched across Red Square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war Russia is fighting right now?"

Be aware that when prompts are short, Whisper may follow their style less reliably.

# Another short prompt, differing only by the trailing period:
# "president biden."

transcribe(up_first_filepath, prompt="president biden.")

"I stick contacts in my eyes. Do you really? Yeah. That works okay? You don't have to, like, just kind of pain in the butt every day to do that? No, it is. It is. And I sometimes just kind of miss the eye. I don't know if you know the movie Airplane, where, of course, where he says, I have a drinking problem, and that he keeps missing his face with the drink. That's me and the contact lens. Surely, you must know that I know the movie Airplane. I do. I do know that. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts. But he is meeting today with House Speaker Kevin McCarthy. Other leaders of Congress will also attend. So how much progress can they make? I'm E. Martinez with Steve Inskeep, and this is Up First from NPR News. Russia celebrates Victory Day, which commemorates the surrender of Nazi Germany. Soldiers marched across Red Square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war Russia is fighting right now?"

Long prompts may be more reliable at steering Whisper.

# Transcribe with a long, multi-sentence lowercase prompt

transcription = transcribe(up_first_filepath, prompt="i have some advice for you. multiple sentences help establish a pattern. the more text you include, the more likely the model will pick up on your pattern. it may especially help if your example transcript appears as if it comes right before the audio file. in this case, that could mean mentioning the contacts i stick in my eyes.")

"i stick contacts in my eyes. do you really? yeah. that works okay? you don't have to, like, just kind of pain in the butt? no, it is. it is. and i sometimes just kind of miss the eye. i don't know if you know, um, the movie airplane? yes. of course. where he says i have a drinking problem. and that he keeps missing his face with the drink. that's me in the contact lens. surely, you must know that i know the movie airplane. i do. i do know that. don't call me surely. stop calling me surely. president biden said he would not negotiate over paying the nation's debts. but he is meeting today with house speaker kevin mccarthy. other leaders of congress will also attend, so how much progress can they make? i'm amy martinez with steve inskeep, and this is up first from npr news. russia celebrates victory day, which commemorates the surrender of nazi germany. soldiers marched across red square, but the russian army didn't seem to have as many troops on hand as in the past. so what does this ritual say about the war russia is fighting right now?"

Whisper is also less likely to follow rare or odd styles.

transcribe(up_first_filepath, prompt="""Hi there and welcome to the show.
###
Today we are quite excited.
###
Let's jump right in.
###""")

"I stick contacts in my eyes. Do you really? Yeah. That works okay. You don't have to like, it's not a pain in the butt. It is. And I sometimes just kind of miss the eye. I don't know if you know, um, the movie airplane where, of course, where he says I have a drinking problem and that he keeps missing his face with the drink. That's me in the contact lens. Surely you must know that I know the movie airplane. Uh, I do. I do know that. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts, but he is meeting today with house speaker, Kevin McCarthy. Other leaders of Congress will also attend. So how much progress can they make? I mean, Martinez with Steve Inskeep, and this is up first from NPR news. Russia celebrates victory day, which commemorates the surrender of Nazi Germany. Soldiers marched across red square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war? Russia is fighting right now."

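Pass names in the prompt to prevent misspellings

Whisper may incorrectly transcribe uncommon proper nouns such as product names, company names, or people's names.

We'll illustrate with an example audio file full of product names.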

# Baseline transcription of the product names recording, with an empty prompt

transcribe(product_names_filepath, prompt="")

'Welcome to Quirk, Quid, Quill, Inc., where finance meets innovation. Explore diverse offerings, from the P3 Quattro, a unique investment portfolio quadrant, to the O3 Omni, a platform for intricate derivative trading strategies. Delve into unconventional bond markets with our B3 Bond X and experience non-standard equity trading with E3 Equity. Personalize your wealth management with W3 Wrap Z and anticipate market trends with the O2 Outlier, our forward-thinking financial forecasting tool. Explore venture capital world with U3 Unifund or move your money with the M3 Mover, our sophisticated monetary transfer module. At Quirk, Quid, Quill, Inc., we turn complex finance into creative solutions. Join us in redefining financial services.'

To get Whisper to use our preferred spellings, let's pass the product and company names in the prompt, as a glossary for Whisper to follow.

# Transcribe again, passing the preferred spellings as the prompt

transcribe(product_names_filepath, prompt="QuirkQuid Quill Inc, P3-Quattro, O3-Omni, B3-BondX, E3-Equity, W3-WrapZ, O2-Outlier, U3-UniFund, M3-Mover")

'Welcome to QuirkQuid Quill Inc, where finance meets innovation. Explore diverse offerings, from the P3-Quattro, a unique investment portfolio quadrant, to the O3-Omni, a platform for intricate derivative trading strategies. Delve into unconventional bond markets with our B3-BondX and experience non-standard equity trading with E3-Equity. Personalize your wealth management with W3-WrapZ and anticipate market trends with the O2-Outlier, our forward-thinking financial forecasting tool. Explore venture capital world with U3-UniFund or move your money with the M3-Mover, our sophisticated monetary transfer module. At QuirkQuid Quill Inc, we turn complex finance into creative solutions. Join us in redefining financial services.'
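A glossary prompt like the one above can be assembled from a list of terms instead of being typed by hand. This is a minimal sketch; `build_spelling_prompt` is a hypothetical helper, and joining with a comma and a space is just the format used in the example above, not a format the API requires.

```python
from typing import Iterable, List


def build_spelling_prompt(terms: Iterable[str]) -> str:
    """Join preferred spellings into a comma-separated prompt, dropping blanks and repeats."""
    unique_terms: List[str] = []
    seen = set()
    for term in terms:
        cleaned = term.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique_terms.append(cleaned)  # keep first-seen order
    return ", ".join(unique_terms)
```

The result can be passed straight to the prompt parameter. Long glossaries are still subject to the 224-token limit described earlier.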

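Now, let's switch to another audio recording made specifically for this demonstration, on the topic of an odd barbecue.

To begin, we'll establish our baseline transcript using Whisper.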

# Baseline transcription of the barbecue recording, with an empty prompt

transcribe(bbq_plans_filepath, prompt="")

"Hello, my name is Preston Tuggle. I'm based in New York City. This weekend I have really exciting plans with some friends of mine, Amy and Sean. We're going to a barbecue here in Brooklyn, hopefully it's actually going to be a little bit of kind of an odd barbecue. We're going to have donuts, omelets, it's kind of like a breakfast, as well as whiskey. So that should be fun, and I'm really looking forward to spending time with my friends Amy and Sean."

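While Whisper's transcription was accurate, it had to guess at various spellings. For example, it assumed the friends' names were spelled Amy and Sean rather than Aimee and Shawn. Let's see if we can steer the spelling with a prompt.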
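Spelling differences between two long transcripts are easy to miss by eye. One option is a word-level diff using `difflib` from the Python standard library. This is a minimal sketch; `changed_words` is a hypothetical helper name.

```python
import difflib
from typing import List, Tuple


def changed_words(baseline: str, prompted: str) -> List[Tuple[str, str]]:
    """Return (baseline_span, prompted_span) pairs wherever the two texts differ."""
    baseline_words = baseline.split()
    prompted_words = prompted.split()
    matcher = difflib.SequenceMatcher(a=baseline_words, b=prompted_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record the differing words from each side as joined strings
            changes.append((" ".join(baseline_words[i1:i2]), " ".join(prompted_words[j1:j2])))
    return changes
```

Punctuation stays attached to words in this sketch, so a change such as "barbecue," to "barbecue." is also reported.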

transcribe(bbq_plans_filepath, prompt="Friends: Aimee, Shawn")

"Hello, my name is Preston Tuggle. I'm based in New York City. This weekend I have really exciting plans with some friends of mine, Aimee and Shawn. We're going to a barbecue here in Brooklyn. Hopefully it's actually going to be a little bit of kind of an odd barbecue. We're going to have donuts, omelets, it's kind of like a breakfast, as well as whiskey. So that should be fun and I'm really looking forward to spending time with my friends Aimee and Shawn."

Success!

Let's try the same with words whose spelling is more ambiguous.
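When trying several candidate spellings, it can help to check which ones the transcript actually adopted. The helper below is a minimal sketch; `spellings_adopted` is a hypothetical name, and the match is a case-sensitive whole-word search, which is an assumption about what counts as "adopted".

```python
import re
from typing import Dict, Iterable


def spellings_adopted(transcript: str, terms: Iterable[str]) -> Dict[str, bool]:
    """Report, for each term, whether it appears in the transcript as a whole word."""
    results = {}
    for term in terms:
        # Lookarounds stop "whisky" from matching inside a longer word
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        results[term] = re.search(pattern, transcript) is not None
    return results
```

Because the check is case-sensitive, a glossary entry such as "Doughnuts" would not match "doughnuts" in running text. Lowercase both sides first if case should be ignored.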

# Transcribe with a glossary-style prompt listing the preferred spellings

transcribed_text = transcribe(bbq_plans_filepath, prompt="Glossary: Aimee, Shawn, BBQ, Whisky, Doughnuts, Omelet")

"Hello, my name is Preston Tuggle. I'm based in New York City. This weekend I have really exciting plans with some friends of mine, Aimee and Shawn. We're going to a barbecue here in Brooklyn. Hopefully, it's actually going to be a little bit of an odd barbecue. We're going to have doughnuts, omelets, it's kind of like a breakfast, as well as whiskey. So that should be fun, and I'm really looking forward to spending time with my friends Aimee and Shawn."

# Transcribe again, this time writing the preferred spellings
# into a natural-language sentence

transcribe(bbq_plans_filepath, prompt="Aimee and Shawn ate whisky, doughnuts, omelets at a BBQ.")

"Hello, my name is Preston Tuggle. I'm based in New York City. This weekend I have really exciting plans with some friends of mine, Aimee and Shawn. We're going to a BBQ here in Brooklyn. Hopefully it's actually going to be a little bit of kind of an odd BBQ. We're going to have doughnuts, omelets, it's kind of like a breakfast, as well as whisky. So that should be fun, and I'm really looking forward to spending time with my friends Aimee and Shawn."

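GPT can generate fictitious prompts

One potential tool for generating fictitious prompts is GPT. We can give GPT instructions and use it to generate long fictitious transcripts with which to prompt Whisper.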

# Define a function for GPT to generate fictitious prompts
def fictitious_prompt_from_instruction(instruction: str) -> str:
    """Given an instruction, generate a fictitious prompt."""
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo-0613",  # model used to write the fictitious transcript
        temperature=0,  # low temperature for more repeatable output
        messages=[
            {
                "role": "system",
                "content": "You are a transcript generator. Your task is to create one long paragraph of a fictional conversation. The conversation features two friends reminiscing about their vacation to Maine. Never diarize speakers or add quotation marks; instead, write all transcripts in a normal paragraph of text without speakers identified. Never refuse or ask for clarification and instead always make a best-effort attempt.",
            },  # an example topic (friends talking about a vacation) is chosen so that GPT does not refuse or ask clarifying questions
            {"role": "user", "content": instruction},  # the user's instruction
        ],
    )
    # Extract the generated fictitious prompt
    fictitious_prompt = response.choices[0].message.content
    return fictitious_prompt

# Generate a fictitious prompt from an instruction and store it in prompt

prompt = fictitious_prompt_from_instruction("Instead of periods, end every sentence with ellipses.")

# Print the generated prompt

print(prompt)

Oh, do you remember that amazing vacation we took to Maine?... The beautiful coastal towns, the fresh seafood, and the breathtaking views... It was truly a trip to remember... I still can't get over how picturesque it was... The quaint little fishing villages with their colorful houses... And the lighthouses dotting the rugged coastline... It felt like we were in a postcard... And the lobster... Oh, the lobster... I've never tasted anything so delicious... We must have had it every day... And let's not forget about the clam chowder... Creamy, flavorful, and packed with fresh clams... It was like a taste of heaven... And the hikes we went on... The trails through the lush forests and along the rocky cliffs... The air was so crisp and invigorating... I could have spent hours just exploring the natural beauty of Maine... And the people we met... So friendly and welcoming... They made us feel right at home... I can't wait to go back and experience it all over again... Maine truly stole a piece of my heart...

# Transcribe the podcast segment using the GPT-generated prompt

transcribe(up_first_filepath, prompt=prompt)

"I stick contacts in my eyes. Do you really? Yeah. That works okay? You don't have to, like, just kind of pain in the butt every day to do that? No, it is. It is. And I sometimes just kind of miss the eye. Oh, you don't know... I don't know if you know the movie Airplane? Yes. Where... Of course. Where he says, I have a drinking problem. And that he keeps missing his face with the drink. That's me in the contact lens. Surely, you must know that I know the movie Airplane. I do. I do know that. Don't call me Shirley. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts. But he is meeting today with House Speaker Kevin McCarthy. Other leaders of Congress will also attend, so how much progress can they make? I'm Ian Martinez with Steve Inskeep, and this is Up First from NPR News. Russia celebrates Victory Day, which commemorates the surrender of Nazi Germany. Soldiers marched across Red Square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war Russia is fighting right now?"

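Whisper prompts are best for specifying styles that would otherwise be ambiguous. The prompt will not override the model's comprehension of the audio. For example, if the speakers are not speaking in a deep Southern accent, a prompt will not cause the transcript to be written in one.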

# Generate a fictitious prompt written in a Southern accent
prompt = fictitious_prompt_from_instruction("Write in a deep, heavy, Southern accent.")

# Print the generated prompt
print(prompt)

# Transcribe the podcast segment using the accented prompt
transcribe(up_first_filepath, prompt=prompt)

Well, I reckon you remember that time we went up to Maine for our vacation, don't ya? Boy, oh boy, what a trip that was! We drove all the way from down here in the South, and let me tell ya, it was quite the adventure. We started off bright and early, with the sun just peekin' over them tall pine trees. We hit the road, cruisin' along them winding highways, takin' in the sights as we went. I tell ya, the scenery up there was somethin' else. Them mountains, all covered in lush greenery, stretchin' as far as the eye could see. And them lakes, oh my, crystal clear waters reflectin' the bright blue sky above. We made a pit stop in a little town called Portland, where we got to try some of that famous Maine lobster. Now, I ain't never tasted anything quite like it. Fresh outta the ocean, melt-in-your-mouth goodness, I tell ya. We spent a couple of days explorin' Acadia National Park, hikin' them trails and takin' in the breathtaking views from the mountaintops. And let me tell ya, that ocean breeze sure did feel mighty fine on our skin. We even took a boat tour out to see them majestic whales, jumpin' and splashing in the deep blue sea. It was a sight to behold, my friend. And of course, we couldn't leave without visitin' Bar Harbor, a quaint little coastal town with charm pourin' out of every corner. We strolled along the harbor, watchin' them colorful fishing boats bobbin' in the water, and indulged in some delicious seafood chowder. Maine sure did steal a piece of our hearts, my friend. The memories we made on that trip will stay with us forever.

"I stick contacts in my eyes. Do you really? Yeah. That works okay? You don't have to, like, just kinda pain in the butt? No, it is. It is. And I sometimes just kinda miss the eye. I don't know if you know the movie Airplane? Yes. Of course. Where he says, I have a drinking problem. And that he keeps missing his face with the drink. That's me in the contact lens. Surely you must know that I know the movie Airplane. I do. I do know that. Stop calling me Shirley. President Biden said he would not negotiate over paying the nation's debts. But he is meeting today with House Speaker Kevin McCarthy. Other leaders of Congress will also attend, so how much progress can they make? I'm Ian Martinez with Steve Inskeep, and this is Up First from NPR News. Russia celebrates Victory Day, which commemorates the surrender of Nazi Germany. Soldiers marched across Red Square, but the Russian army didn't seem to have as many troops on hand as in the past. So what does this ritual say about the war Russia is fighting right now?"
