Transcription Types for Real-time Settings and Recorded Media: Daniel Sommer

Article Published: October 26, 2022 | Last Updated: November 8, 2022 | Read Time: 65 minutes

Daniel Sommer talking about auditory experience vs. visual experience

As part of our commitment to giving back and sharing knowledge, we have partnered with the WordPress Foundation’s community team to run an official WordPress Meetup centered around building more accessible websites with WordPress. This post has a recap of our Meetup that took place on Thursday, October 6, 2022, and a video recording of the presentation.

About the Topic

In this presentation, Daniel Sommer, Chief Operating Officer at Empire Caption Solutions, looked at the different kinds of transcription services available, how they work, who they help, and the technology that supports them.

Thanks to Our Sponsors

Empire Caption Solutions strives to create inclusive experiences and engage individuals with different abilities and backgrounds by providing high-quality accessibility services for recorded media, such as closed captions, transcriptions, Audio Description, and ASL interpretation. By utilizing both the latest technology and human expertise, ECS is able to help its clients meet WCAG 2.1 success criteria and ADA compliance while offering options that fit almost any budget.

AccessiCart: Website accessibility can be complex for eCommerce sites, where many functionalities involve user interactions. AccessiCart offers customized assessments, consulting, and strategy. We evaluate the eCommerce store and its processes, prepare a detailed report about its accessibility issues (an audit), and offer a prioritized gameplan for making and keeping your website more fully accessible, focusing on the most impactful and important changes first.

About the Meetup

The WordPress Accessibility Meetup is a global group of WordPress developers, designers, and users interested in building more accessible websites. The meetup meets twice per month for presentations on a variety of topics related to making WordPress websites that can be used by people of all abilities. Meetups are held on the 1st Thursday of the month at 10 AM Central/8 AM Pacific and on the 3rd Monday of the month at 7 PM Central/5 PM Pacific.

Learn more about WordPress Accessibility Meetup

Watch the Recording

If you missed the meetup or would like a recap, watch the video below or read the transcript. If you have questions about what was covered in this meetup, please tweet us @EqualizeDigital on Twitter or join our Facebook group for WordPress Accessibility.

Links Mentioned in This Video

The following resources were discussed or shared in the chat at this Meetup:

  • WordPress Accessibility Facebook Group
  • Equalize Digital Web Accessibility Resources
  • Equalize Digital Focus State Newsletter
  • Equalize Digital Website
  • Equalize Digital on Twitter
  • WordPress Plugin Accessibility Checker
  • Empire Caption Solutions Website
  • Empire Caption Solutions on Twitter
  • AccessiCart Website
  • AccessiCart on Twitter
  • TypeWell Website
  • Why You Want to Caption Those Sounds YouTube Video
  • Kate Ervin on LinkedIn
  • Daniel Sommer on LinkedIn

Read the Transcript (Clean Verbatim vs. TypeWell)

Here we have both transcripts so you can compare the difference between a clean verbatim transcript and a TypeWell transcript.

Clean Verbatim Transcript

[00:00:01] AMBER HINDS: There are a few announcements before we officially get started. If you haven’t been before, we have a Facebook group. If you go on Facebook, and you search WordPress Accessibility Meetup, or sorry, not WordPress Accessibility Meetup, just WordPress Accessibility, then you will find the Facebook group. It’s also facebook.com/groups/wordpress.accessibility is the URL.

[00:00:30] That’s a great way to connect with people in between meetups, get answers to  questions, share ideas, or just generally, talk about accessibility. We occasionally get questions  that are also not WordPress. If you’re interested in talking and connecting between meetups,  that’s a great place to go and join the group. You can find upcoming events and recordings of  past meetups at equalizedigital.com/meetup. 

[00:00:58] This is a frequently asked question. This meetup is being recorded. The video will be  available in about two weeks. We want to have corrected captions and transcript first, and then  we will post that up. That is where you can find it. If for some reason, you can’t stay for the  whole event, or you want to catch a replay to review something that you learned, that video will  be available on our website in about two weeks.

[00:01:26] If you want to get notified when those are available or get other web accessibility news, we recommend joining our email list. You can sign up for that at equalizedigital.com/focus-state. You, also, if I have the meetup set properly, will probably have an opt-in form as a thank you page after you leave Zoom, but if not, that’s where you can go sign up. We send two emails a month, so it’s not a lot, but it has upcoming events. It has other news from around the web related to accessibility, and it has links to the recordings as they come available.

[00:02:07] That is a great way to stay up to date on everything that’s happening. We are seeking  sponsors for the Meetup. We do rely on sponsors to help us cover the cost of making it  accessible. Unfortunately, the WordPress Foundation doesn’t have funds to cover live  captioning or sign language interpretation if we want to be able to offer that. If anyone or their  company is interested in helping to sponsor Meetup, please reach out to us. We would very  much appreciate it. 

[00:02:35] There is also information about sponsoring on that same meetup webpage. If you  have any suggestions for the meetup, or you need anything, any accommodations that would  make it work better for you, please reach out to us at meetup@equalizedigital.com. That will go  to both myself and Paola, our other co-organizer, and is a great way to connect with us. We love  to hear from people, and we’ll do whatever we can to make it work well for you. 

[00:03:12] Who am I? I’ve been talking, and I haven’t introduced myself. If you’ve been before,  you probably know, but if you haven’t, I’m Amber Hinds. I’m the CEO of a company called  Equalize Digital. We’re a certified B corporation that specializes in WordPress accessibility.  That’s pretty much all we do. We also have a software product, a WordPress plugin called  Accessibility Checker, which helps find some accessibility problems, the ones that can be  identified automatically on your WordPress website, and it puts reports, like the SEO plugins do,  on the Post Edit Screen. 

[00:03:48] You can find out more about that on our website if you want. We’re on Twitter  @EqualizeDigital. We have two sponsors today, and actually, one of our sponsors is also our  speaker, which is fun. Empire Caption Solutions, which you’re going to get to hear more about  them I’m sure when our speaker Daniel introduces himself, they have very generously, for  almost a year now, I want to say, they have sponsored our transcript and SRT creation so that  we can have captions on the recordings of our videos that are accurate for everyone after the  fact. 

[00:04:30] We, very much, appreciate that because we were doing it ourselves, and when you are, you guys, this is the horrible secret that is not a secret if you ever see me type, is that I type like 40 words per minute or something incredibly slow. It was taking us, I don’t know, all day to caption one meetup video. We very, very much appreciate this. In addition to providing the captions, they also do audio description. They do sign language interpretation, and Daniel is just a great guy. We’re really excited to have him here. You’ll hear more from them later on.

[00:05:14] Their website is empirecaptions.com, and I know you guys, Daniel, you don’t use  Twitter too much, but I always tell people to give us a shout-out, a thank you to our sponsors on  Twitter, because I think that helps encourage them to want to keep sponsoring, so they are  @EmpireCaption on Twitter, and then our live captions are being covered by Bet Hannon’s  company AccessiCart. AccessiCart is relatively new, but it’s just the name that’s new. Bet’s been  doing accessibility in her team for a long time. 

[00:05:52] They have a lot of experience with WooCommerce, so they are really specializing in  eCommerce accessibility and WooCommerce accessibility. If you have a WooCommerce store,  definitely check them out. She is kindly covering the cost of our live captioner today. You can  learn more at accessicart.com. That’s A-C-C-E-S-S-I-C-A-R-T.com. They are @AccessiCart on  Twitter if you want to tweet a thank you to them as well. 

[00:06:34] We have three upcoming events that I just want to make sure everyone is aware of.  Our next meetup will be held by Colleen Gratzer, who I think I saw in the attendee list today.  She will be talking about using InDesign to create accessible PDFs, which web accessibility  guidelines don’t just apply to your website. They also apply to any digital content you put out on  the web, which would include your PDFs, some of the top mistakes, and how you can fix them  when you’re creating PDF documents, so that will be very exciting. 

[00:07:10] Then our normal meetup, which would be this time in November, we are not having,  but it’s because we’re going to be doing WordPress Accessibility Day. Registration is open. If  you go to the website, it is a full 24 hours of great talks on accessibility panel discussions.  There’s going to be some short lightning talks, all different levels, design, development, content,  all kinds of things. We highly encourage you to go. It’s totally free to register, and registration is  open now. 

[00:07:46] Then our next meetup will be after that, will be Monday, November 21st at 7:00 PM, and Alicia St. Rose will be talking about accessibility in your content. We have a lot of great things coming up. I am very excited to introduce our two speakers. I’ll add a spotlight for them in just a second, but Daniel Sommer and Kate Ervin. Daniel is the COO, I’m pretty sure I remember this right, at Empire Caption, and Kate Ervin is the Executive Director of TypeWell.

[00:08:23] They are going to be speaking with us today about transcription types and different  ways that you can approach captioning and transcriptions. I’m going to hide myself and stop  showing my screen and let them take over. If you do have any questions, I will be watching the  Q and A, so please put those in. Then we will get to those as we’re able. 

[00:08:50] DANIEL SOMMER: Great. Oh, I’m up. [chuckles] Hello. Amber, thank you so much for having us. We’ve been doing the captions and the transcript for about a year now, and it’s really wonderful to be able to talk a little bit more about the service. I’ve learned a whole lot about WordPress accessibility and web accessibility, just by being part of these meetups and reviewing all the transcripts and everything. It’s so wonderful to be a part of this and to have such a resource.

[00:09:28] One of the things you said in the introduction was, as people who are developing  WordPress sites and content for the web, it’s not just that the sites themselves are accessible,  but all the content that goes on there is also accessible. What I’ve found, there’s so many  different ways to go about transcribing something. There’s different methods. 

[00:09:51] Even in the beginning when we were first working together, we had to spend a little  bit of time to get on the same page about what exactly we were providing versus what you were  expecting and get on the same page with expectations. My hope is today that we’re able to go  through some of the more nuanced details of transcription. I’ve invited Kate here today to join  me to talk also about a lesser-known type of transcription called meaning-for-meaning. There’s  all sorts of really wonderful uses for that. 

[00:10:31] Kate and I have been working together for about 10 years or so. Kate is the  Executive Director of TypeWell and also one of the head trainers and experts on the subject. I’m  really glad that we’re able to have this talk today. We’ll be talking more about meaning-for meaning towards the end. We’re going to start by looking at transcription in general. I have a  little slideshow I’m going to bring up. 

[00:11:01] If any questions arise during the talk, you just put them in the chat, and Amber or Paola will get them to me, but we’ll, of course, have some time at the end. Want to thank the captioner who’s doing the captions normally as usual, and a little bit later, I’m going to be introducing a live TypeWell transcript or meaning-for-meaning transcript. We’re just going to give that a little more context before throwing that at you. Kate, is there anything you’d like to say before we dive in?

[00:11:35] KATE: No, you go ahead. I’ll be glad to join in later once you’ve got things up. Thanks  again. 

[00:11:41] DANIEL: Thanks, Kate. Let me just get this screen going. Excellent, this is great. I’m from New York. Before I got into transcribing and captions and accessibility, I was a professional classical singer in New York. In New York is one of the oldest bars, McSorley’s. When you go to McSorley’s, you walk in, and you ask for light or dark.

[00:12:31] They only have two options for their beer. You walk in, and you know what you’re  going to get. If they give you the wrong one, it’s really clear just by looking at it, even before you  taste it. If you make it so far as the taste, you’re going to know if it’s the right drink or not. Now  they’ve mastered their brew, and it’s a consistent experience every time. If you go down the  block to Blind Tiger, they offer around 30 or 40 different types of beer on tap, in addition to  bottles and everything else. 

[00:13:15] If you’re going into Blind Tiger and saying, “I’d like a dark beer,” you’re going to be bombarded with a whole bunch of questions, or, “What other beers do you like? What do you drink? How can I give you what you want?” If you walk into McSorley’s, it’s going to be pretty clear that you’re going to get what you’re looking for. Why am I bringing up beer and McSorley’s? Well, so often, almost every day, I have conversations with people, and we think we’re being clear when I say, “I want a transcript.”

[00:13:53] Well, what does that really mean? Where are you using it? Who’s using it? The  people who are using it, is it going to be effective and beneficial for them? When are they  getting it? Where are they getting it? How are they accessing it? All of these things are really  important questions, and we can’t always have a very simple conversation where you just say, “I  want the light one,” or, “I want the dark one.” All of these conversations are more nuanced.  That’s what I’d like to talk about today. 

[00:14:33] What do we mean by transcription? As a musician, do we mean musical transcription,  or is someone playing a melody, and are we transcribing it down or dictating it? We’ve got  biology and genetics, DNA sequences, that’s also a type of transcription. Mathematical or  numbers, phonetic. If someone’s in linguistics or NLP speech pathologist, how do we transcribe  sound? That’s a whole with the international phonetic alphabet.

[00:15:20] Transliteration and transcription often get conflated or used interchangeably.  Transliteration, taking from Arabic or Russian from a non-Roman alphabet and transliterating it  or translating it, transcribing it into the Roman alphabet. Then to get to what we’ll be talking  about today, lexical or verbatim transcription, which involves the spoken word or spoken sounds  and putting that into the written format. 

[00:15:56] In all of these, the specifics are slightly different. We’re taking something from one  medium, and we’re putting it into another. With the verbatim or the lexical, we’re taking spoken  language or sounds and words and putting that into the written format. This is where things can  start to get a little hairy, a little complex. 

[00:16:26] This is why, number one, having the conversation is so important to understand and  also, to understand that there are so many different words that can come up with this. At the end  of the day, we’re really only talking about a few things, but I’d like to look at some of the terms  that we see in the different settings. 

[00:16:49] Verbatim, full verbatim, complete verbatim, clean verbatim, intelligent verbatim, condensed verbatim, full transcript, stenography or stenographers, re-voicing, voice writing, dictation, CART, which is communication access real-time translation or transcription. There’s still some disagreement there. Meaning-for-meaning, summarization, note-taking, SRT transcript, lexical transcription, and monolingual translation.

[00:17:24] If you’re working in the education realm, certain words, vocabulary, and terms are  used. If you’re working with a localization firm, different words and terms are used. If you’re  working with a marketing firm, different terms. It really depends on the context that’s going to  determine which of these services you really need, and what’s going to be effective in those  settings. 

[00:17:53] I was going to talk about this a little later, but I wanted to bring this up because we  have some other people working behind the scenes today, and I want to introduce them sooner  rather than later. A question that often comes up, are transcripts and captions the same? Even  as I’m looking at the Zoom meeting right now, it says, “Live transcription/closed captions has  been enabled. Who can see this transcript?” 

[00:18:24] If you do turn on the closed captions on the Zoom, they’re going to be appearing instantly; if you enable the live Zoom transcript, that’s going to be somewhat slightly delayed, but the format of those and the layout of those and the timing of those is really what makes one of the biggest distinctions. When we’re talking about transcripts and captions, they both contain the spoken text. They both contain the words that are being heard in the audio or the video.

[00:18:59] Now, transcripts contain the entire text, so usually, in paragraph form. We’re going to  go over what good transcript practices are. Essentially, the person reading the transcript will  have access to the transcript from the time it starts until the time it ends. They can go back and  look at what was said before. They’ll see what’s happening as it comes. 

[00:19:25] Captions, on the other hand, they’re displayed on the screen with the video for a set  period of time, and then they disappear. Even though the text content is the same, the way that  we’re digesting that, and the way that we’re able to experience is different. One is very  immediate, where we have to stay locked in, we have to look at it, and if we miss it, it’s gone. In  that sense, it’s more like the audio experience of if you miss something, it’s gone. You can’t go  back. You can’t rewind. You have to go forward. 
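
To make that distinction concrete for readers who work with these files, here is a minimal sketch in TypeScript (the cue text below is a made-up example, not a quote from the talk) showing the same words stored as timed SRT caption cues and then flattened into transcript-style prose:

```typescript
// Minimal sketch: the same words as timed SRT caption cues vs. transcript prose.
// The cue text is a hypothetical example, not taken from the presentation.
const srt = `1
00:00:01,000 --> 00:00:04,500
I'd like to welcome everyone

2
00:00:04,500 --> 00:00:08,000
to the plenary session.`;

// Captions tie each phrase to a display window; a transcript keeps only the
// running text. Flattening cues into prose drops cue numbers and timestamps.
function srtToTranscript(input: string): string {
  return input
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "")            // drop blank cue separators
    .filter((line) => !/^\d+$/.test(line))    // drop cue numbers
    .filter((line) => !line.includes("-->"))  // drop timestamp lines
    .join(" ");
}

console.log(srtToTranscript(srt));
// "I'd like to welcome everyone to the plenary session."
```

The timing lines are exactly what make captions appear and disappear with the video, while the flattened text can be read at any pace and scrolled back through, which is the difference Daniel describes here.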

[00:20:04] Transcripts record the whole event, and you can go back and view them, even during the presentation. I think I mentioned there’s a few different types of transcriptions. Typically, we’re used to what’s called verbatim. That means all the words are captured and displayed. That’s what you’re seeing with the closed captions today when you click the closed caption button. If you turn those on, one or two lines appear at a time, the words show up, it’s great. I guess a while ago, Meryl was– blanking on her name. She’s brilliant.

[00:20:52] AMBER: Meryl Evans? 

[00:20:55] DANIEL: Thank you. Meryl Evans gave a presentation on closed captions. She really  very insightfully said, not everyone likes closed captions. Some people prefer the transcript. The  way that they process information is better suited towards the transcript and reading it through  the transcript versus seeing it just for a set period of time and then moving on. 

[00:21:22] What we’ve done today is we’re giving access to two types of transcription. One is  the verbatim. It’s going to be the same text that you’re seeing in the captions, just in paragraph  form. I think Paola has the link for that if you’d like to view it. Then also, I’m joined, behind the  scenes, by my business partner, Diana Lerner, and she’s doing what’s called TypeWell or  meaning-for-meaning transcription. 

[00:21:53] We’re going to talk about it, but it’s much more concise, and it’s geared a lot for educational settings. I’m going to talk a lot more about it with Kate later in the talk, but I wanted you to have access to both of those at the beginning, so you can bring them up. You can read through them. You can see the differences. We’re going to go into those differences now a little more in-depth. I just want to thank the captioner and Diana both for doing that. Feel free to load both of those, put them side-by-side to see the differences.

[00:22:32] Going on, is transcription a form of translation? It’s a really good question. If we look  at the OED, the Oxford University Press Dictionary, the translation is defined as the conversion  of something from one form or medium to another. In that sense, we’re taking an auditory  experience, a heard experience, and translating it into a visual experience. There’s different  things we do in those mediums or when we’re experiencing something auditorily and when  we’re exploring or experiencing something visually. 

[00:23:27] Very often, when we’re engaged in conversation, or we’re listening to a presentation, especially if there’s more natural speech, there’s a lot of, I’m going to call, filler words. Um, huh, you know, like, and very often, we’re able to filter out that noise, and we’re able to hear the core concepts that are being shared in the main message. If we weren’t to hear the ums, huhs, it really wouldn’t affect our understanding so much.

[00:24:12] We don’t want all of that auditory noise that we often just filter out to be present  visually. Do we want to recreate that visual noise, or are we trying to create a transcript that is  more suited for reading and more suited for the page? If we are translating from one experience  to the other or one medium to the other, what are the important things that we need to consider,  and how can we make sure that we’re providing equal access to the people who need it? 

[00:24:53] I’d like to go over now, just regardless of the setting, whether we haven’t even talked  about it in real-time, post-production, recorded media, broadcast, wherever the transcript is  going to be, what are the core parts that are going to make it usable and accessible and give  people access to the information they need? 

[00:25:19] Number one, who is speaking? We need to know who is speaking. Very often, if  there’s captions we see the text, and we see the person, so we know, but if we’re only  experiencing the transcript without the audio, or we’re relying on the transcript for the  information, we really need to know who is speaking. That’s number one. Number two,  [chuckles] what are they saying? What’s being spoken? What’s being heard in that audio? 

[00:25:55] Then a continuation of that, any relevant environmental sounds or sounds that are not words or verbal sounds but sounds that occur that are important to understanding what’s happening? Then to go a little bit more into the nuance of who is speaking, it’s important to include a name or a title, if it’s known, and if the name or person or the title isn’t known, we want to avoid things like gender or appearance identifiers unless we know for certain.

[00:26:41] Sometimes it’s difficult to assign someone’s gender or also, just identify them based  solely on appearance. Unless you know the person in the recording, and you’ve spoken with  them and agreed upon that, typically, avoiding gender or appearance is preferred. Then  indicating when speakers change. Making sure that when the speaker stops and the new one  begins, that that’s clearly indicated. 

[00:27:15] Then what are they saying? How do we accurately capture a speaker’s message?  We’re going to talk more about accuracy in a bit, but it needs to be an accurate representation  of what was said in the written format. Needs to use proper writing conventions so that people  reading it can follow along. Then where it’s slightly different from captions is that we have more  control over the layout of the page and how easily we set it up for people to read and to digest  the information. 

[00:28:01] Are we making clear paragraph breaks? Are we using white space appropriately? Is  the page laid out in a way that allows us to consume the information in an easy way? Then the  last part, and this is very important, but it’s more subjective than the other parts, what is a  relevant environmental sound? Noises, alarms, phones, animals, something that is going to add  important information to the reader. 

[00:28:45] Just as a quick aside, early on, I was typing for a live class, and my student did not hear that there was a fire alarm. There were no visual lights or any cues to indicate that there was a fire alarm. No one had come over and tapped this person on the shoulder and said, “Hey, there’s a fire alarm.” Sometimes you don’t think of these things, but if people are looking at a transcript for access to real-time, all of these sounds are super important.

[00:29:27] They do have real-life consequences, in addition to sharing the message and making sure everything’s represented clearly. Then laughing, applause, crying, chuckling, sounds that we make that are not words but that are part of the communication. It’s equally important to include if something is indiscernible or inaudible because what we’re trying to do is create an accurate representation of what’s in the audio or video.

[00:30:02] If something can’t be heard, or it’s inaudible, or you can’t tell what it is, even after listening to it a bunch of times, it doesn’t make sense to try to fill in the gaps if it’s not accurately representing what’s in the audio. That’s important. Then this is also something that’s very subjective, changes in speaker’s tone or manner of speaking. If someone changes the inflection, or [speaks in falsetto] someone speaks in falsetto, with a very high voice but typically, speaks like this, that is important to indicate.

[00:30:49] It’s very difficult, in this aspect, to not interpret what someone means by the change  in tone or in manner, but rather, how do we accurately describe what is happening in a way that  is not judging or assuming what’s happening. This is the more nuanced parts of it. What I’d like  to do now is look at three different transcripts. I’m going to play the audio at the end, but I’m  going to put them on the screen one at a time. We’ll just start with this first one. 

[00:31:34] This is of an opening to a meeting. I’m going to be quiet for a moment and just leave  this up for 30 seconds or so. Just take your time and look through it, skim it, go word by word,  however you’d like. 

[00:31:50] AMBER: I actually wonder, I think that we should read it. 

[00:31:55] DANIEL: Okay. 

[00:31:56] AMBER: Only because we may have someone on this call who is not able to see the  slides. 

[00:32:02] DANIEL: Is it cool if we wait a minute, and then I read it? 

[00:32:07] AMBER: Yes, we can pause for a minute, and then if you don’t necessarily think it’s  important to read it all, at least the section that you want to highlight, we want to make sure we  read. 

[00:32:18] DANIEL: Absolutely, thank you. 

[00:32:35] [pause]

[00:32:38] Three accurate transcripts. Number one: 

[00:32:42] Teri: I’d like to welcome everyone, um, to the plenary session, um, roundtable discussion Working 9:00 to 5:00 in your Pajamas. Life in the Times of COVID-19. Um, presenting today or speaking on panel will be John Fudrow from the University of Pittsburgh Libraries, Aura Young University of North Carolina Charlotte Graduate School, Elyse Fox, and Dana Dickman from Sacramento State University Library. The moderators today are myself Teri Green from the University of Toledo, and Stacy Wallace. I don’t know if Stacy is on here yet.

[00:33:24] I don’t hear her. Um, so, um, I’ll just continue to go along. 

[00:33:28] I’m going to stop there. Is that okay? Okay. Cool. Go to this next one, and we’ll do the same. I’ll just leave it up for about 30 seconds, and then I’ll read it.

[00:34:09] [pause]

[00:34:10] Three accurate transcripts. Number two: 

[00:34:13] Teri: I’d like to welcome everyone to the plenary session, roundtable discussion  working 9:00 to 5:00 in your pajamas, life in the times of COVID-19. Presenting today or  speaking on the panel will be John Fudrow from the University of Pittsburgh Libraries, Aura  Young, University of North Carolina Charlotte Graduate School, Elyse Fox and Dana Dickman  from Sacramento State University Library. The moderators today are myself, Teri Green from  University of Toledo and Stacy Wallace. I don’t know if Stacy is on here yet. I don’t hear her. I  will just continue to go along. 

[00:34:58] There’s one more paragraph, but I’m skipping. Then onto the last one. I’m going to  leave it up and then read. 

[00:35:27] [pause]

[00:35:30] Three accurate transcripts. Number three: 

[00:35:33] Teri: I’d like to welcome everyone to the plenary session – Working 9 to 5 (In Your  Pajamas). Life in the Time of COVID-19. Presenting today or speaking on the panel are John  from Univ. of Pittsburgh Library, Aura from UNC-Charlotte, Elyse and Dana from Sacramento  State Univ. Library. Today’s moderators are me, Teri Green from University of Toledo, and Stacy  Wallace. 

[00:35:58] I don’t know if Stacy is on here yet. I don’t hear her. I’ll continue to go along.

[00:36:09] All three of those were transcripts of the same audio, the same event. I’m putting them up here now side by side. I tried to make the text as readable as possible, but it’s a lot. These are the three transcripts that came from this one audio. I would say that they’re all accurate. I would say that they all convey the same information. After reading each of them, you walk away with the same understanding of what’s going to be happening during this presentation, but what’s different?

[00:36:56] I used to teach, so I was waiting for hands to go up. I’ll answer my own questions. We  noticed, in the first one, all of the filler speech is included, the ums, the false starts, and the  extra speech, so that was said. That was said, but when we’re looking at it in this verbatim  transcript, for me, it’s clear, but it also disrupts the flow of the reading in many ways. I know what  was said now, but if I were to look at this as a transcript, apart from the audio, it’s a little bit more  difficult. 

[00:37:49] If I want to experience this in a written format, when we look at this one in the middle,  all of that filler speech has been taken out, all the ums, the huhs. The sentence flow is much  closer to a written format or if someone were to have scripted this. In many ways, this sets up  the reader to digest the information more easily and without transcribing that auditory noise into  the written and making that visual noise. 

[00:38:31] The translation of the auditory noise is not included in this, but it increases readability  and how fast someone can go through and get this information. Now in this last one, there’s a  lot less or a lot fewer words on the screen. I think if we look at it, there’s no less information  shared. We don’t have the people’s last names, but we know who we’re talking to. It’s more  concise. If someone is just reading through this, they also have equal access to the information  and the experience that’s happening. 

[00:39:23] In this last one, this one is where I’m going to bring in Kate in a little bit. This last one  is what we call meaning-for-meaning transcription. The one in the middle is clean verbatim,  where everything is spoken, and everything said is captured, but that filler noise is cleared out.  Then the first one, that’s full verbatim, where everything is captured no matter what and put into  written format as easily as possible. 

[00:40:01] AMBER: Can I jump in for a sec? We had a few things in the chat that I thought  might be interesting. 

[00:40:07] DANIEL: Please.

[00:40:10] AMBER: One person did comment when you were having us read that the ums were  making me tired when they saw the first one. Then there was a question about that third  transcript. I don’t know if you want to go back to it for just a second, but it has a lot of  abbreviations, and they were wondering if this interferes with accessibility. 

[00:40:32] DANIEL: Oh, in terms of screen reading and things like that, or– 

[00:40:40] AMBER: Generally. I have thoughts, but I don’t know if you have thoughts on it.  [chuckles] 

[00:40:44] DANIEL: Good question and good observation. I think a lot of that is going to be  cleared up when I bring Kate on, and we talk specifically about meaning-for-meaning and the  contexts that it’s used in. That’s a great point. Can we come back to that after we talk a little bit  more about– We might cover it, and if we don’t, if we can make sure we do, that would be great. 

[00:41:16] AMBER: We can definitely circle back to that. I’ll write it down. Another question that  someone asked, “Is it necessary to include the comment about Stacy not being there because  that was kept in all of these?” 

[00:41:30] DANIEL: Yes, she said Stacy wasn’t there yet. It’s an important piece of information.  You bring up a great point because, in many ways, we’re saying this needs to be a verbatim,  accurate transcription of what was said, but at some point, there is interpretation that goes into  it. What is important? What do we need to include? What should we exclude? All of those things  are really good questions, but when it comes to “I don’t know if Stacy is here yet,” that seems to  be a main point. 

[00:42:14] Stacy is going to be presenting. She’s not here. We don’t see her. We can expect her  coming later. In some ways, it is the job of the transcriber to decide what is most important, but  ideally, we want to capture as many of the main points, and we want to capture as much of the  information that’s shared as possible. We see Stacy isn’t here yet. They all say, “I don’t hear  her.” 

[00:42:49] “I don’t hear her” could maybe have been left out, but that Stacy isn’t there seems important. This is also, and I think, once we talk more about meaning-for-meaning and bring Kate into the conversation, a lot of those questions are going to be clear because TypeWell, meaning-for-meaning, I think, are the type of transcription service people are least familiar with. It is geared for specific context and uses. It is different, but I think it’s important we look at all of those questions and considerations.

[00:43:28] Did that answer that for now? 

[00:43:30] AMBER: I think so. I think that was all of the questions for now, and then I’ll pop back  in if I see more. 

[00:43:37] DANIEL: Great. Excellent. Oh, I think that that was actually really great questions  because it brought up what is really meant by accuracy. What do we need to include? What can  we exclude? How do we make sure that someone is getting clear access to information, equal  access to information? I think we addressed quite a lot of that. 

[00:44:13] When we’re talking about accuracy, there’s, of course, are the words spelled  correctly, but also, is what was captured representing what was said accurately? Is punctuation  accurate? There’s all sorts of ways of measuring accuracy, and a lot of it really depends, again,  on the context. Where is this being used? How are we measuring the information that’s being  shared, and how are we measuring accuracy? 

[00:44:48] It’s really important to understand: are you looking for full verbatim? Are you looking for clean verbatim? Are you looking for access, in which case, meaning-for-meaning is going to be great? You can’t just say this is 100% accurate or 99% accurate. You really need to know what you’re measuring. If you’re doing this yourself for recorded videos and such or recorded audio, you need to know what level are you comfortable, how much time do you want to spend.

[00:45:22] As Amber said in the beginning, you can spend a whole day doing a simple one-hour  video depending on how detailed it needs to be. How much do you want in there? What are you  considering to be accurate for the people that you’re providing the service for? Where is this  transcript going? What are they going to be considering accurate and what’s going to be useful  for them? 

[00:45:50] That’s to say, I don’t have a clear answer as to what is meant by accurate other than  you have to ask the question for yourself and understand what services you’re requesting and  make sure that those services match your expectations and what you paid for, or if you did it  yourself, that you have someone looking it over with the same set of eyes and the same set of  criteria for what accuracy means.

[00:46:17] Oh, great. Just before we leave this, I wanted to play the audio. It’s about a minute  long. Just wanted to play this. You can pick which one you’d like to look at, but they’ll all be here.  Here we go. 


[00:46:36] TERI: I’d like to welcome everyone to the plenary session, roundtable discussion  working 9:00 to 5:00 in your pajamas, life in the times of COVID-19. Presenting today or  speaking on the panel will be John Fudrow from the University of Pittsburgh Libraries, Aura  Young, University of North Carolina Charlotte Graduate School, Elyse Fox, and Dana Dickman  from Sacramento State University Library. The moderators today are myself, Teri Green from  University of Toledo, and Stacy Wallace. I don’t know if Stacy is on here yet. 

[00:47:20] I don’t hear her, so I will just continue to go along. Before we begin, during the  presentation portion, just keep your microphone or phone muted unless you’re a speaker.  Please feel free to use the chat feature to post questions. Those are going to be addressed  during the Q&A portion. I’d like to welcome everyone, and I’m going to go ahead and start  asking each presenter a question that they all received already. Let me do that. 

[00:48:04] DANIEL: Great. That was that, just so you could have access and hear the audio that  these transcripts were based on. Now that we have a basic understanding of what transcripts  and transcription is, I’d like to look at the specific settings. I’d like to go through this, somewhat  quickly, because I’d like to bring Kate in, so we have enough time to chat and then get to your  question. 

[00:48:36] The most common, it’s obviously, pre-recorded media, audio, and video files, and we  talked about this. The type or manner of transcription is typically full verbatim or clean verbatim.  You might see other terms like complete verbatim for full verbatim, and some people call clean  verbatim intelligent verbatim. These two terms are the ones you’re going to see the most often,  and these are probably the two types of transcription that you’ll be considering for any audio or  video project. 

[00:49:15] In terms of delivery, you expect some text document, a text file, Word DOC, PDF, or  an HTML file. It can be static, so just that document with the text, but we are seeing more and  more interactive transcripts, and that’s where– Just to show a quick example of that. See if I can  get a new share here. Just as a quick example.


[00:50:00] TERI: I’d like to welcome everyone to the plenary session, round table discussion  Working 9:00 to 5:00 in your Pajamas: Life in the Times– 

[00:50:10] DANIEL: An interactive transcript where the words are highlighted one by one as  they’re spoken. In some ways, this is time-coded and does require another step to add that time  coding, but this is becoming more and more popular and people can click the transcript and go  through it and be brought to the space in the audio or the video. Let me just switch back. Thank  you. 
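
As a rough illustration of how a time-coded, interactive transcript like the one in Daniel’s demo can be wired up on a web page, here is a minimal sketch (the element IDs, markup, and class name are assumptions for the example, not taken from the demo):

```typescript
// Minimal sketch of an interactive transcript. Assumes markup like:
//   <audio id="talk-audio" src="talk.mp3"></audio>
//   <div id="transcript"><span data-start="0.4">I'd</span> <span data-start="0.7">like</span> ...</div>
const audio = document.querySelector<HTMLAudioElement>("#talk-audio")!;
const words = Array.from(
  document.querySelectorAll<HTMLSpanElement>("#transcript span[data-start]")
);

// Highlight the most recently spoken word as the audio plays.
audio.addEventListener("timeupdate", () => {
  let current: HTMLSpanElement | null = null;
  for (const word of words) {
    word.classList.remove("is-current");
    if (Number(word.dataset.start) <= audio.currentTime) {
      current = word;
    }
  }
  current?.classList.add("is-current");
});

// Clicking a word jumps the audio to that point in the recording.
for (const word of words) {
  word.addEventListener("click", () => {
    audio.currentTime = Number(word.dataset.start);
  });
}
```

The same time codes that drive the word-by-word highlighting are what let readers click anywhere in the transcript and be taken to that spot in the audio or video.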

[00:50:51] Then how do we go about this? Of course, there’s automatic speech recognition. There’s tons of tools that do that. Obviously, if you use automated speech-to-text tools, automated speech recognition, you’re going to want to check the quality of the accuracy of what’s being spoken. It is getting really quite good.

[00:51:18] It is especially if the quality of the audio is clear and the microphones used are high  quality. A lot of people now are doing ASR plus human editing. There’s more and more tools for  that. Good old QWERTY keyboard just type it out. Stenography, stenographers typing in with  their steno pads and their steno keyboards. Then what’s also becoming more popular now is  voice writing. 

[00:51:52] This is different than just dictation or using the Google speech-to-text tool in a Word  document or in a Google Doc. When they’re trained, it can be at the same level as stenography  and the same speed. Then we come to real-time or live which is what we’re doing now. 

[00:52:26] Even though the end product is the same for the user you’re seeing the text, the way  that we go about it is very different. Also because it’s in real time, the skills of the person  providing that service need to be very high. They need to be able to capture the speech in real time with 1 to 3 seconds delay. 

[00:52:55] The skills and the tools that are used are very different. We have verbatim and most  of the time when we’re talking about live transcription, or live captioning, it’s usually clean  verbatim. It’s very rare to see full verbatim in live settings. It’s not impossible, but it’s less  common. Again, CART Communication Access Real-Time Transcription or Translation. 

[00:53:32] Then we also have meaning-for-meaning. Kate, thank you. You’re amazing. I’m going to bring you on in a moment to talk more about meaning-for-meaning. Unlike recorded media, where you can just give a text document and, if they want, they can read the document with or without the audio, when we’re doing live transcription there’s a few players that are used to share the text in real time.

[00:54:08] StreamText, which is the tool that we’re using today. If you clicked on those links that  Paola provided earlier, that’s through StreamText, and almost every live transcriber, whether  they’re CART or verbatim, meaning-for-meaning, voice writer, stenographer, most people  providing real-time service use StreamText, or 1CapApp for their text delivery. 

[00:54:39] The writer can hook up to those tools and then it’s broadcast almost instantly.  StreamText, I believe it’s supported by screen readers and I’m not certain about 1CapApp. Then  of course the video conferencing platform. Today we’re using Zoom. Zoom does have a  transcript option, I believe it’s slightly delayed, and that’s because of Zoom, not because of  anything that the providers are doing. 

[00:55:14] There’s just a delay in the Zoom transcript, but the captions tend to appear in real time so have that. Then methods are very similar. ASR. Many people use ASR. It’s better than  nothing, but a lot of times there’s inaccuracies that are embarrassing, or just give the very wrong  message. 

[00:55:45] Stenography is the classic and voice writing becoming more popular. I think one of  the interesting things, and one of the interesting possibilities of voice writing is the potential for  live transcription in other languages, which is very difficult at this point. There’s very few people  who do live transcription in French. 

[00:56:10] There’s more who do it in Spanish, but beyond those three, it’s very difficult to find people to do real-time. Then TypeWell, which does meaning-for-meaning, and then C-Print, which also does meaning-for-meaning, but in a slightly different way. With all of that, I’d like to now bring on Kate, to help me talk about TypeWell.

[00:56:38] Kate and I have worked together for almost 10 years now. I started out as a TypeWell  provider, a TypeWell transcriptionist, captioner and that’s how I got into this industry. Kate  became the executive director a few years after I got involved. She’s been just so great at  keeping the company focused and delivering excellent training to the providers and making sure  that really TypeWell stays a very high-quality method of transcription, but it is very different from  verbatim that many people are used to.

[00:57:25] I wanted to talk about those differences with Kate today. She could say all this a lot better than I can. Kate, thanks so much for being here and joining me and for all the great work you do at TypeWell. I’m excited to share more about TypeWell with this audience.

[00:57:48] KATE: Thanks, Daniel. I should have said this in the beginning, I just want to acknowledge that I’m from Tucson, which is the ancestral and current land of the Tohono O’odham, Pascua Yaqui, [inaudible] people and because I did not follow best practice and provide the captionist with the spellings of those names, I pasted that in the chat just now.

[00:58:15] As a visual description, I’m a White, cis woman in my 40s with brown hair, blue eyes, I’m wearing glasses and a blue shirt. TypeWell is a live or real-time transcription service provided by a human transcriber, as Dan has explained. We train our transcribers to translate spoken English into clear, concise written English, often using fewer words.

[00:58:43] I will talk about why and how because that’s really important. That term meaning-for-meaning is meant to distinguish it from word-for-word. I’ve also heard it described as text interpreting or hybrid captioning. It’s hard to come up with a good way to describe it. You kind of only know it when you see it, but transcribers who do TypeWell intentionally edit in real-time so that the typed transcript is concise, clearly worded, grammatically correct.

[00:59:19] Usually, the sentences are relatively short. The goal is quick understanding of the speaker’s intended message. I will explain why. Spoken English, as you know, is really messy. A lot of false starts, meaningless utterances, repetitions, and sometimes those repetitions are necessary for understanding or for emphasis, but many times they’re not.

[00:59:46] There are a lot of sentence fragments, run-on sentences that run on and on and on.  That’s just the way we speak. As hearing people, we adjust to that. We’re used to that, but that’s  not typically the way we read. At least not if we want to read quickly. TypeWell was designed  specifically with deaf and hard-of-hearing students in mind. Particularly, high school, college  students who are in the classroom reading a transcript in real-time. They’re in classes all day so  their eyes get fatigued. They’re having to consume all of this visual information, not just their  transcript, but all of the other things that they have to access visually throughout the day. Taking  into account that their English reading level might be slower than that of their hearing peers and  taking into account screen time, eye fatigue, and also their desire to participate in the class and  not have their eyes glued to a screen the entire time. To that end, TypeWell is usually viewed in  a full transcript mode either through StreamText or we have our own web player.

[01:01:05] Our transcribers also make intentional use of white space, paragraphing, and making  sure that it’s not just a huge block of text. That’s not just to reduce eye fatigue, but also it’s for  topic organization because we expect that students are going to be scrolling back and looking at  what they missed or trying to review something. Students need to be able to look away and rest  their eyes or rest their head on their desks just like every other student in the classroom does  and so they do need to be able to scroll back and see what they missed. Obviously, if they were  just seeing captions that appeared on the screen for a couple of seconds and then disappeared,  they wouldn’t have that same access. TypeWell transcribers also provide a copy of that full  transcript after class for students to study. 

[01:01:59] That’s a secondary use of TypeWell, but the real-time access is the primary use.  Over the course of an hour, you can imagine that a TypeWell transcriber is making thousands of  decisions about which words to include, which words to exclude, and even sometimes, which  words to add, for example, words that are implied but that weren’t actually spoken. How do you  rephrase or rearrange the words that you hear to make them clear in written format? Where do  you insert white space? When do you include nonverbal cues, like a door closing in the  background or a joke, or a cell phone ringing? TypeWell transcribers are human, so they have  limits. Making all of these decisions in real-time is very exhausting. There’s mental fatigue that  sets in. If a speaker is speaking 400 words a minute, the transcriber has to condense the  message. 

[01:02:58] There’s just no question. We are always making these trade-offs but again, we’re  making those trade-offs with the goal of quick understanding. I think humans are able to do that  meaning-for-meaning real-time editing far better than a computer can at this point, but those  transcripts also contain mistakes just like computer-generated transcripts. You’ll see a variety of  skill levels. There’s a range of experience. There’s a range of quality just like every other service  you might implement. There are just a lot of factors at play, including the speed of the message,  the quality of the audio, the speaker’s accent, background noise, crosstalk, which is when  people are talking over each other. 

[01:03:51] TypeWell was originally developed to be used in the classroom by service providers who could be trained quickly, who could use a regular laptop computer, and who could type on a QWERTY keyboard. It was designed to be a service that you could implement relatively quickly and affordably if you were a school that suddenly had a deaf student who enrolled and you’re like, oh, we need to get this accommodation up and running. The people who originally developed TypeWell wanted to make it accessible to those transcribers by keeping their training relatively short and straightforward, and they wanted it to be accessible to the schools by keeping the training and the software licenses relatively affordable, but as it turns out, what we’re doing is exceptionally useful, not just for deaf and hard of hearing students. Disability communication access is not a monolith, there’s not a one-size-fits-all solution or accommodation, and sometimes you’re implementing accommodation for a single individual, whereas other times you’re trying to use universal design and to achieve universal accessibility if such a thing even exists.

[01:05:05] I think when you’re choosing between these different captioning or transcription methods you do need to consider the audience and whether you’re focused on accommodating a specific person to be compliant with the law or achieve accessibility that’s affordable, that everyone might use. We’re taking into account people’s hearing ability, their reading level, their sightedness, their processing speeds, their attention, cognitive abilities, and all of that. That’s my intro to TypeWell and what meaning-for-meaning is about and where it evolved out of. I’m excited to think about and learn about other places where it might be useful and applicable.

[01:05:54] DANIEL: Awesome. Thank you, Kate. That was great. One of the things I wanted to  bring up that I think is really special about TypeWell is the math mode and the ability to not just  do a transcription of what is said, but also to be able to do very advanced math and physics and  equations and things. My training is as a musician, so I can count to four. I never got to that part  of TypeWell, but that seems to also be a very powerful component of TypeWell specifically for  meaning-for-meaning in the educational space. 

[01:06:43] KATE: That relates to this question that I see in the chat, which is, what backgrounds do TypeWell transcribers come from? They come from all different backgrounds. A lot of colleges hire college students, student workers, to become TypeWell transcribers. They send them through our training and they work in the classroom providing services for their peers at the college, in addition to taking their own classes, so they could be studying anything and working as a transcriber in any other classes.

[01:07:16] Folks who come from the community have backgrounds in– really to become a  TypeWell transcriber to go through a training, you just need to be a fast typist. You need to have  strong English skills and listening abilities. We have some screening tests, but other than that,  we don’t have any specific requirements about your background. 

[01:07:41] We’re not keeping anyone out of the field necessarily. If you have a background in math and science, we will absolutely encourage you to learn to use the math and science dictionary that’s built into our software, and that allows you, again, to use the QWERTY keyboard to type complex formulas and scientific notations and embed those into your transcript and to do so relatively quickly. That takes a lot of practice to learn that skill because it’s tricky, but we have developed some hands-on training to help transcribers do that.

[01:08:21] As you can imagine, it helps to have a background, if you’re going into a highly technical class to be a service provider, it helps to have background knowledge in that area. If you’re somebody who’s coordinating those services, you want to look at not just do they have a TypeWell qualification, but do they also have this specialized background? Sign language interpreters do the same thing. I think I would assume CART writers, same thing.

[01:08:47] If you’re going into a legal program, do you have legal background? If you’re going  into a medical program, do you have medical background? Medical transcriptionists often  transition into TypeWell or CART and so that valuable access to vocabulary can really be useful  for those niche settings. 

[01:09:10] DANIEL: After you go through that TypeWell training, you’re pretty well suited to transcribe high school classes and then undergrad courses, maybe up to the senior year when things get a little more specialized, but like you said, if you do have a specialty, it is helpful, but most transcribers can go into general places and do just fine. You would want to inform someone beforehand if there was any special content.

[01:09:44] KATE: Exactly. If you’re planning an event like this one today and you know that  people are going to be giving land acknowledgement or you’re going to be reading something  verbatim, it’s always nice to give your provider access to those verbatim scripts or those special  names that have unique spelling in advance because you can’t expect a human provider or a  computer to know all of those words and pronunciation and the mispronunciation. Yes, there’s a  lot to consider when you’re implementing a live accommodation, just like there are a million  things to consider when you’re doing post-production work. 

[01:10:26] DANIEL: Right. To jump back to that question Amber brought up earlier about abbreviations, I think it’s important to look at the TypeWell transcript as really geared towards deaf and hard of hearing students and not necessarily for a screen reader to go into. It seems like that could be something that could be considered or done, just that’s a small difference in output that the writer could very easily do if that’s going to open up a lot more possibilities. I wouldn’t say that abbreviating university or things like that is a standard for TypeWell. That’s a choice that that specific writer made based off of their understanding of what was going to be useful to the person using it at the time. It’s sometimes, I think, very difficult because we’re talking about specific settings and then also trying to come up with ways that this could be more generally used and to have that understanding be shared, so that if someone is looking at a meaning-for-meaning transcript, they understand that that is going to be different than a clean verbatim or a full verbatim and to understand its value and what it is that they’ll be getting from it.

[01:11:55] KATE: Yes. Is there time for me to share a quick anecdote from my experience as a transcriber? Okay. It won’t be long. I was transcribing an MBA program in accounting for a deaf student from Korea who had a cohort of other Korean students, all hearing, in his classes. I noticed after the first couple of days of transcribing that all of his friends were sitting behind us and looking over my shoulder at the screen. They were all English language learners, and they benefited from this service as well, because if they missed something they could look at the screen. If there was a new term that the professor introduced that they couldn’t understand or that wasn’t written on the board, they could look at my screen and verify and learn it.

[01:12:55] I also know that it was helpful for them to see how I was translating spoken English  into written English. I think in terms of their language learning, it was helpful to see like, oh,  that’s a different way of phrasing this or that was a really long run-on sentence, but now I’m  seeing it in concise grammatically correct English sentences and their eyes were glued to the  screen. I was certain that a lot of them were using it not just to verify vocabulary but also just to  improve their reading ability, their English grammar skills. I found that really fascinating. There’s  not a lot of research or any research that I know of, except at maybe one university about the  use of TypeWell for English language learners. You’re hearing it verbatim, but then seeing it in a  slightly different grammatical form I think can be really valuable for some people. I’ve also heard  neurodiverse students say that they use it to anchor their attention and their concentration  through the class. It’s pretty exciting. 

[01:14:09] DANIEL: Yes, I think there are a lot of places where TypeWell can be used outside of just the education setting. It’s definitely great where it’s at, and there are just so many uses and potentials for it. I’m really glad we had the chance to talk about it today. I think we’re getting near the end of the time, and I do want to make sure we have time for some questions. Just to wrap up what we’ve been talking about: transcription, as we saw, happens in so many different places and contexts, and there’s a lot of different vocabulary around it. The goal here today wasn’t to standardize any of that but to open us all up to the different terms and the backgrounds of the people who use these services, so that when we are deciding how to make our digital content, or our video and audio content, accessible, we know how to ask the right questions and get the service that’s needed for whoever is going to be using it, whether it’s a general audience or one specific person, and so that we have the vocabulary and the tools to make sure we’re doing it as best we can. Kate, thank you so much for joining me today. I’m so excited to have talked about meaning-for-meaning, especially with this group and this audience; it’s been really great to be a part of this for the past year or so. How should we do questions?

[01:15:53] AMBER: Yes, I haven’t seen any officially come into the question thing but I wrote a  few down. Maybe you guys can answer my questions. I always like the meetups because it’s  like I get my personal consultation too. I was curious when I was watching the two transcripts, I  noticed there was use of italics in the TypeWell transcript which was cool. I’m wondering, do you  have a style guide somewhere or how would a captioner know when to use bold or italic or  something like that? 

[01:16:30] DANIEL: Kate? 

[01:16:31] KATE: Yes, we don’t have an official style guide. We have an unofficial one that we were actually working on with one university. I do know that many transcribers have started customarily adding emphasis to vocabulary words or references to homework, tests, or quizzes, anything where they’re thinking ahead to when the student might be referencing the transcript later and using it as a study tool. They’re trying to draw attention to key points and key transitions throughout the transcript. I’m not sure where the italics were used here; maybe they were just used to demonstrate emphasis. There’s not an official style guide, and different people make different decisions about those. Also, when you’re working with an individual student, I think it’s a good practice to ask them. I know a lot of students have said, ‘I just have one of my transcribers who does the bolding and I really like it, but none of the other ones do,’ and so we say, well, ask them to. Show them what you like, because it is a human, and humans can customize their output and humans can improve their skills. That’s just something that was invented by the transcribers and not really taught by us. It has become a pretty widespread practice.

[01:17:58] AMBER: I was thinking about it as you were talking about that and then the fact that  they get the transcript to reference later and I was like they don’t have to take notes. I remember  the binders from college, like binders. That’s how old I am. I don’t know if they use binders  anymore. I was just thinking, oh, that’d be so handy to have someone sit there and take all my  notes for me. You obviously have to go back through and highlight the important parts because  you probably don’t want it all, right?

[01:18:26] DANIEL: Some TypeWell transcribers do bold. They make good use of those style features, and like Kate was saying, if you have that conversation with your student or the people using it, you know what they’re looking for and can add that in as needed. I just want to give a shout-out to Diana, my business partner, who’s been doing the TypeWell transcript today. She’s worked in almost every kind of setting and she’s just top-notch. Yes, it’s so interesting what is possible with the output, and that’s also why I think TypeWell is a really great program: it just allows for that level of output and that level of customization, so it can be really helpful.

[01:19:20] AMBER: Another question I had, and I know you touched on this a little bit with providing information in advance to captioners, but are there other things that speakers can do to make themselves easier for captioners to caption?

[01:19:36] KATE: Yes, for sure. I think I was speaking quickly because I was sensitive about time and I was also reading from notes. When you’re reading from notes or when you’re nervous when you’re presenting, you tend to speak really quickly. One thing that I try to remind panelists at special events is that if you modulate your rate of speech and include pauses the way a teacher does, the way a teacher uses wait time, you’re not just making it more accessible for the captionist to keep up with you, you’re making it easier for hearing people to process what you’re saying. You’re also, I think, relaxing yourself a little bit. Just packing in words can be really hard, not just on the people who are having to read it, but also on the people who are listening. That’s one thing I can think of: pausing and speaking at a normal conversational rate.

[01:20:43] DANIEL: Yes, I would agree. I was very nervous today, so I’m sure my rate was a little bit faster. The other thing, aside from pacing and speaking clearly, is the technology: making sure that you have a good microphone and a good internet connection, because a lot of these events are remote. You spend so much time, money, and energy setting everything up, and then if you have a bad connection, the provider can’t hear everything.

[01:21:19] If the provider can’t hear, it really doesn’t matter whether it’s speech-to-text or a human provider. That setup needs to be really good if you want the output to be good, so make sure that the tech is in place and working properly. It seems simple enough, but a lot of times that’s overlooked in educational settings, where bandwidth is at a premium.

[01:21:46] KATE: I appreciate getting access to the slides in advance if speakers have slides, even if they’re not totally finished. People change their slides up to the very last minute. That’s fine. Mostly, just to get a sense of what the flow is going to be like and to put any tricky words into your own dictionary so that you don’t have to trip over them when you’re typing. Just common sense things like that. In classes, often transcribers will have access to Blackboard or whatever the learning management system is so that they can get those slides. They can see the text.

[01:22:27] DANIEL: I think also, if there are going to be any pre-recorded videos in a presentation, those should be captioned and transcribed beforehand so that the captionist or the transcriber isn’t rushing to do that, because very often, videos are faster paced. You never know; there could be advertisements. There’s a whole lot of things. If all of that can be done ahead of time, that also increases the quality of the experience for everybody.

[01:23:03] KATE: Yes, having the closed captions that are on the video screen itself is really  equal access. If you’re asking a student to look back and forth between a movie and a computer  screen, that’s not really equal access. 

[01:23:19] AMBER: This has been a really, really great presentation. I really appreciate both of  you coming. I had no idea about all the meaning-for-meaning, and I appreciate you took the  initiative to suggest we get both up in StreamText, so we could look at them side by side. I took  some screenshots. I think it’s been very handy, and I will definitely put those into docs that we  put with the recording so that people can access those after the fact if they want to compare  them. 

[01:23:51] This is our last thing: how can people get a hold of you if they have additional follow-up questions? Where’s the best place to reach out?

[01:23:59] DANIEL: Email, LinkedIn. I’m on LinkedIn a lot. I’m not on Twitter or Facebook. 

[01:24:06] AMBER: I do think Paola put your LinkedIn in the chat, but if not, we can throw it in  there. 

[01:24:11] DANIEL: I think she did. Yes, I think so. 

[01:24:14] AMBER: How about you, Kate? 

[01:24:15] KATE: Yes, I just put my email in the chat, and our website is typewell.com, where there’s a contact form. I am also on LinkedIn, and we’re on Facebook.

[01:24:25] AMBER: Great. Well, thank you so much. I really appreciate it.

[01:24:29] [END OF AUDIO]

TypeWell Transcript

Amber: We are live. Hopefully, we’ll see people jumping in. I  see people coming into the room right now. If you’ve just joined  us, feel free to introduce yourselves into the chat. Say hello,  where you are from, and anything interesting you’d like to  share. That will be helpful for our panelists.  

We’ll get started in five minutes.  

Feel free to introduce yourself in the chat.  

Note: You need to toggle in the chat to Everyone. That way, you  can message everyone.  

I’m seeing some familiar faces. I see Coleen, Gerson, Denise,  two Jeans. Melissa has been before.  

Bruce said hello from Portland, Oregon.  

Christina always gives us a weather report from Ontario. [In  chat.]  

Brandy is saying hello from Bangor, Maine. I’ll bet there are  fall leaves there. 

[Reading intros from chat.]  

Jean, I thought you were getting after effects of hurricanes.  [Reading intros in chat, continued.]  

Welcome, everyone. We’re excited to see you. We’ll start in two  minutes. We are giving time for everyone to come in.  

Feel free to introduce yourself in the chat. Say hello, where you are from, how you use WordPress, your accessibility background, or anything else that is interesting that brought you here today.

Last time, we had kittens on the screen, because our speaker had  just gotten two new cats. For an icebreaker, people suggested  names for the kittens. That was kind of fun!  

We’ll start in another minute. Feel free to say hello in the  chat and introduce yourself. There is a dropdown for Everyone in  the chat, which you can toggle.  

Jean asked what the winning kitten names were. I don’t know! We  could probably tweet Glenn and ask him.  

[Reading chat from Adrienne.] I definitely agree, that and  toddlers crying when you are trying to do something nice for  them! I’m in mom mode.  

[Reading chat from Melissa.]  

That’s awesome. Welcome, Melissa. 

[Reading chat from Kwesi.]  

Kwesi shared links.  

Paola will tweet Glenn about the cats’ names. Maybe we’ll have them by the end of the meetup.

There are a few announcements. We have a Facebook group. [On screen.] [Reading URL.] That’s a great way to connect with people between meetups, share questions, get ideas, or talk about accessibility. It’s a great place to chat with people about things other than accessibility.

You can find upcoming events and past recordings. [On screen.]  

This meeting is being recorded. The video will be posted in ~2 weeks. We want to have corrected captions and a transcript first. If you can’t stay the entire time or want to catch a replay, the video will be available on our website in ~2 weeks.

If you want to get notified or get other web accessibility news,  you can sign up for that. [On screen.]  

If I have the meetup set properly, there will probably be an  opt-in form. We send two emails/month with upcoming events,  other news, and links to the recording. It’s a great way to stay  up to date.  

We are seeking additional sponsors. This helps make our meetups  more accessible. If anyone or their company wants to help  sponsor, please reach out to us. We’d very much appreciate it.  

If you have any suggestions for the meetups or need any accommodations that would make it work better for you, please reach out to us. [On screen.] That will go to me and Paola (co-organizer). We’ll do whatever we can to make it work well for you.

Who am I? If you’ve been before, you probably know. I’m Amber  Hinds, CEO of Equalize Digital. [Reading text on screen.]  

We help find accessibility problems. It puts reports on the Post  Edit screen. There is more information on our website. We are  on Twitter.  

We have two sponsors. One is also our speaker, which is fun. Empire Caption Solutions has generously sponsored our transcript and SRT creation so that we can have accurate captions on our recorded videos. We very much appreciate that because we were doing it ourselves.

This isn’t a secret, but I only type 40 words per minute. It was  taking us all day to caption one meetup video. We really  appreciate it.  

They also do audio description, sign language interpretation.  [On screen.] Daniel is a great guy. You’ll hear more from them  later.  

Daniel, I always tell people to give us a shoutout/thank you to  our sponsors on Twitter to encourage them to continue. They are  on @empirecaptions on Twitter.  

Our live captions are being covered by Beth’s company, AccessiCart. Only the name is new, but Beth and her team have a lot of experience. [Company description on screen.] She is kindly covering the costs of our live captioner today.

You can learn more on their website. [On screen.] They are also on Twitter if you want to tweet a thank you to them.

We have three upcoming events that I want to make you aware of.  Our next meetup will be held by Colleen Gratzer, who will be  talking about using InDesign to create accessible PDFs. Web  accessibility includes anything you put on the web, including  PDFs.  

We aren’t having our normal November meetup. That’s because we  are doing WordPress Accessibility Day. There will be 24 hours of  talks for all different levels re: design, development, content,  etc. We highly encourage you to go. Registration is free and  open now.  

Our next meetup will be on Monday, November 21. Alicia St. Rose  will be speaking about accessibility in your content.  

I’m very excited to introduce our two speakers, Daniel Sommer  and Kate Ervin.  

Daniel is the COO at Empire Captions. Kate Ervin is the  executive director of TypeWell. They’ll be talking about  different transcription types and approaches.  

I’ll be watching the Q&A for any questions. Please put them in.  We’ll get to those as we are able.  

Dan: Thank you, Amber, for having us. We’ve been doing the  captions and transcripts for about a year. It’s wonderful to be  able to talk about the service. I’ve learned a lot about  WordPress accessibility and other types of accessibility by  being part of this. It’s great to have this as a resource.  

As you said, when people are developing WordPress sites and web content, all of the content must be accessible. I have found that there are many ways and methods for transcribing something.

When we first started working together, we had to spend time  discussing what we were providing and what you were expecting.  We had to get on the same page.  

Today, I hope to go over some of the more nuanced aspects of transcription. I’ve invited Kate to talk about meaning-for-meaning. There are many wonderful uses for that.

Kate and I have been working together for ~10 years. She is the  executive director for TypeWell and a head trainer and expert on  the subject. We’ll be talking more about meaning-for-meaning  towards the end.  

We’ll start by looking at transcription in general. I’ll bring  up a slideshow. Chat any questions you have. Amber or Paola will  get them to me.  

Thank you to the captioner who is doing them normally. Later,  I’ll be introducing a live meaning-for-meaning transcript. We’ll  give that more context first.  

Kate, would you like to say anything first?  

Kate: No. You dive in. I’ll join you later.  

Dan: Great. Let’s get this screen going. Excellent.  

I’m from New York. Before I got into transcription, captions,  and accessibility, I was a professional classical singer in New  York. McSorley’s is one of the oldest bars in New York. There,  you go in and ask for light or dark. 

They only have two options for beer. You know what you are going  to get. It’s clear from the beginning if it’s the right drink or  not.  

They’ve mastered their brew. It’s a consistent experience every  time.  

If you go down the block to Blind Tiger, they offer 30-40 types  of beer on tap in addition to bottles and more. If you ask them  for a dark beer, you’ll be bombarded with a lot of questions.  

What do you drink? How can I give you what you want?  At McSorley’s, you’ll get what you want.  

Why do I bring up beer? Often, I have conversations with people,  and we think we are being clear.  

I want a transcript. What does that mean? Who is using it? Will  it be effective for the users? When are they getting it? Where  are they getting it? How are they accessing it?  

These are important questions. We can’t always have a simple  conversation and ask for the light or dark one. These  conversations are more nuanced. That’s what I want to talk about  today.  

What do we mean by transcription? Is this musical transcription?  Is someone dictating to us? Is it referring to something  biochemical, genetic, numeric, phonetic as with speech  pathologists? How do we transcribe sound?  

Transcription and transliteration are often used interchangeably. Transliteration is taking language that is non-Roman and putting it into the Roman alphabet.

Today, we’ll talk about lexical or verbatim and putting spoken  words into a transcribed format. We are taking something from  one medium and putting it into another. With lexical/verbatim,  we are taking spoken words or sounds and putting them into a  written format.  

This is where things can get a little complex. This is why having the conversation is so important, and also why it’s important to understand that so many different terms can come up. At the end of the day, we are only talking about a few things, but I’d like to look at some of the terms we see.

[Reading words on screen.]  

There is still some disagreement about CART.  

[Reading list, continued.]  

If you are working in the education realm, certain terms and  vocabulary words are used. Different words and terms are used  with localization firms, marketing firms, etc. It really depends  on the context which will determine which of these services you  really need and which will be effective in those settings.  

I was going to talk about this a little later, but I wanted to  bring this up, because we have other people working behind the  scenes who I want to introduce.  

Are transcripts and captions the same? 

If you turn on the CC on the Zoom, the captions will appear instantly. If you enable the live Zoom transcript, it will be slightly delayed. The format, layout, and timing make the biggest distinction.

When talking about transcripts and captions, both contain the  spoken text and words being heard in the audio or video.  Transcripts contain the entire text, usually in paragraph form.  

We’ll go over good practices. The reader will have access from  the start to end time and can see what happened before.  

Captions are displayed on the screen with the video for a set period of time and then disappear. The content is the same, but our experience and way of digesting it is different.

One is immediate. We have to stay locked in. If we miss it, it’s  gone. It’s like the audio experience that you can’t rewind.  

Transcripts record the whole event. You can go back and view  them, even during the presentation.  

There are a few different types of transcriptions. Typically, we  are used to verbatim, which means all of the words are captured  or displayed. You are seeing that today when you click the CC  button. The words show up. It’s great.  

A while ago, Meryl Evans gave a presentation on closed captions.  She insightfully said that not everyone likes CC. Some people  prefer the transcript versus seeing it for a set period of time  and then moving on.  

Today, we are giving access to two types of transcription:  

1. Verbatim. That’s like what you are seeing in the captions in  paragraph form. Paola has the links. 

2. TypeWell or meaning-for-meaning. I’m joined by my business  partner, Diana Lerner, who is doing that. It’s much more concise  and geared for educational settings.  

I wanted you to have access to both so you can bring them up and  see the differences. We’ll go into those differences now.  

I want to thank the captioner, Diana, for doing that. Feel free  to put them side-by-side to see the difference.  

Is transcription a form of translation?  

That’s a really good question. If we look at the Oxford English Dictionary (OED), translation is defined as the conversion of one form or medium to another. In that sense, we are taking an auditory (heard) experience and translating it into a visual experience. There are different things we do when we are experiencing something auditorily vs. visually.

Often, when we are engaged in conversation or listening to a presentation, especially if there is more natural speech, there are more filler words such as: um, huh, you know, like, etc. Often, we can filter that out and hear the core concepts being shared and the main message. If we didn’t hear the fillers, it wouldn’t affect our understanding so much.

We don’t want that auditory noise that we often filter out to be  present visually. Are we trying to create a transcript more  suited for reading and the page? If we are translating from one  experience or medium to the other, what are the important things  we need to consider, and how can we make sure we are providing  equal access to the people who need it?  

Wherever the transcript is going to be, what are the core parts  that will make it usable and accessible and give people access  to the information they need? 

1. Who is speaking? We need to know this. Often, if there are  captions, we see the text and person, and we know. If we are  only experiencing the transcript without the audio, or we are  relying on the transcript for the information, we need to know  who is speaking.  

2. What are they saying? What’s being heard in that audio?  

3. Any relevant environmental sounds. These are sounds that  occur that are important to understanding what’s happening.  

To go more into the nuance of speaking, it’s important to include a name or title if it’s known. We want to avoid gender or appearance identifiers. Unless we know for certain, it’s difficult to assign someone’s gender or identify them based solely on appearance. Unless you’ve agreed upon that, avoiding gender or appearance identifiers is preferred.

Indicate when speaker changes. This should be clearly indicated.  

What are they saying? How do we accurately capture a speaker’s  message? We’ll talk more about accuracy in a bit.  

It needs to be an accurate representation of what was said in  the written format. It needs to use proper writing conventions  so readers can follow along.  

Where it’s slightly different from captions is that we have more  control over the layout of the page and how easily we set it up  for people to read and digest the information. Are the paragraph  breaks clear? Is the page laid out such that we can easily  consume the information?  

The last part, and this is important and more subjective than  the other parts, is What is a relevant environmental sound? What  will add important information to the reader? 

As a quick aside, early on, I was typing for a live class. My student didn’t hear that there was a fire alarm. There were no lights or visual cues to indicate there was a fire alarm. No one tapped this person on the shoulder to tell them there was a fire alarm.

If people are looking at a transcript for access to real-time,  these sounds are super-important. They have real-life  consequences in addition to sharing the message. There are  sounds we make that are not words but are part of the  communication such as laughing, applause, etc. [On screen.]  

It’s equally important to note if something is indiscernible or inaudible. If it can’t be heard, it doesn’t make sense to try to fill in the gaps. That doesn’t accurately represent what’s in the audio.

Changes in a speaker’s tone or manner of speaking are subjective. If someone changes their tone or manner, such as speaking in falsetto with a high voice, that’s important to indicate. It’s important to not interpret this, but to accurately describe what is happening in a way that is not judging or assuming what’s happening. This is more nuanced.

Now, I’d like to look at three different transcripts. I’m going  to play the audio at the end. I’m going to put them on the  screen one-at-a-time.  

We’ll start with this one. It’s the opening to a meeting. I’ll  leave this up for 30 seconds or so. Take time to read it.  

Amber: I think that we should read it, because we have someone  on this call who can’t see the slides? 

Dan: Can I read it in a minute?  

Amber: Yes. We can pause. We want to make sure we read the  section you want to highlight.  

Dan: Absolutely. Thank you.  

[Pause for participants to read screen.]  

Three accurate transcripts – #1.  

[Dan reading: TERI: I’d like to welcome . . . I will just  continue to go along.]  

Dan: I’ll stop there.  

Go to the next one. We’ll do the same. I’ll leave it up for 30  seconds. Then, I’ll read it.  

[Pause for participants to read screen.]  

Dan: Three accurate transcripts – #2.  

[Dan reading: TERI: I’d like to welcome . . . I will just  continue to go along.]  

There is one more paragraph that I’m skipping.  

Onto the last one. I’ll leave it up and then read. 

[Pause for participants to read screen.]  

[Dan reading: TERI: I’d like to welcome . . . I’ll continue to  go along.]  

Dan: All three of those were transcripts of the same audio and  event. I’m putting them up here side by side. I’ve tried to make  the text as readable as possible.  

These transcripts came from this one audio. I’d say they are all  accurate and convey the same information. After reading each of  them, you walk away with the same understanding of what will be  happening during the presentation.  

What’s different? I used to teach and was waiting for hands to  go up. I’ll answer my own questions! [Laughing]  

In #1, all of the filler speech and extra speech is included  which are the um’s. For me, it’s clear in this verbatim  transcript, but it disrupts the flow of the reading in many  ways. If I were to look at this transcript apart from the audio,  it’s a little more difficult.  

The filler speech has been removed from #2. The sentence flow is much closer to a written format or script. In many ways, this sets up the reader to digest the information more easily and without transcribing that auditory noise into the written form and making that visual noise. The translation of the auditory noise is not included in this, but it increases readability and how fast someone can go through and get this information.

In #3, there are a lot fewer words on the screen but no less  information is being shared. We don’t have last names, but we  know who we are talking to. It’s more concise. A reader will  have equal access to the information and experience that is  happening. 

This is where I’ll bring in Kate, but this last one is meaning-for-meaning transcription.

The one in the middle is clean verbatim. Everything said is  captured, but the filler noise is omitted.  

The first one is full verbatim with everything.  

Amber: We had a few things in the chat that I thought might be  interesting. One person commented that the um‘s were making them  tired in #1.  

There was a question about #3. It has a lot of abbreviations. They were wondering if this interferes with accessibility.

Dan: In terms of screen reading?  

Amber: I have thoughts. I don’t know if you have thoughts.  

Dan: Good question and observation. I think a lot of that will be cleared up when I bring Kate on and we talk about meaning-for-meaning and the context it’s used in. That’s a great point. Can we come back to that? We might cover it. If not, we can come back to it.

Amber: Someone asked if it’s necessary to include the comment  about Stacy not being there.  

Dan: Yes. She said that Stacy isn’t there yet. It’s an important  piece of information.  

You bring up a great point. In many ways, we are saying this needs to be a verbatim, accurate transcript for what is being said. At some point, we need to make decisions about what to include and exclude. Those are good questions.

I don’t know if Stacy is here yet. That seems to be a main  point. Stacy will be presenting. I don’t see her here. We can  expect to see her later.  

It’s the job of the transcriber to decide what’s important. Ideally, we want to capture as many of the main points and as much of the shared information as possible.

“I don’t hear her” could possibly have been left out.

Once I talk more about meaning-for-meaning and bring Kate into  the conversation, a lot of those questions will be cleared up.  People are least familiar with meaning-for-meaning and TypeWell,  which is geared for specific uses and context. I think it’s  important to look at all of those questions and considerations.  

Did that answer that?  

Amber: I think so. That’s all of the questions for now. I’ll pop  back in if I see others.  

Dan: Those are great questions.  

What is meant by “accuracy”? How can we make sure that people  are getting equal access to information? I think we addressed  quite a lot of that.  

When we are talking about accuracy, are the words spelled  correctly? Also, is what is captured representing what is said  accurately? Is punctuation accurate? 

There are many ways of measuring accuracy. A lot of it depends  on the context. Where is this being used? How is this being  shared?  

It’s important to know if you are looking for clean or full  verbatim or meaning-for-meaning. You need to know what you are  measuring. If you are doing this yourself for recorded videos or  audios, you need to know how much time you want to spend. As  Amber said in the beginning, you can spend a whole day depending  on how detailed you want it to be and the accuracy level you  want to provide for your readers.  

That’s to say that I don’t have a clear answer for what is meant  by accurate. You have to understand what services you are  requesting and that they match what you are expecting and paying  for. If you do it yourself, you need to understand the criteria  for accuracy.  

Before we leave this, I want to play the minute-long audio. You  can pick which one you want to look at. They’ll all be here.  

[Audio playing of text on screen. Note: This audio is  transcribed as full verbatim in #1]  

Dan: That was so you could have access and hear the audio that  these transcripts were based on.  

Now that we have a basic understanding of what transcripts and transcription are, I’d like to go through the specific settings. I’ll do this fairly quickly so we can bring Kate in and then get to your questions.

For pre-recorded media, the type of transcription is typically  full or clean verbatim. These are the terms you’ll see most  often, although other terms are used. You’ll be considering this  for any audio or video project. 

In terms of delivery, you expect some type of text document. [On  screen.]  

It can be static, but we are seeing more and more interactive  transcripts. I’ll show a quick example of that. I’ll see if I  can get a new share here. This is a quick example.  

[Demonstrating interactive transcript.] [On screen.]  

The words are highlighted one-by-one as they are spoken. In some ways, this is timecoded and requires another step. This is becoming more and more popular. People can click the transcript and be brought to that spot in the audio or video.

I’ll switch back.  

How do we go about this? There is automatic speech recognition (ASR), and there are tons of tools that do that. If you use ASR tools, you’ll want to check the quality and accuracy of what’s being captured. It’s getting quite good, especially if the quality of the audio is clear and the mics used are high quality. Many people are now doing ASR plus human editing. There are tools for that.

You can use a QWERTY keyboard or stenography.  

Voice writing is becoming more popular. This is different from dictation or using the Google speech-to-text tool in a Word or Google doc. When they are trained well, it can be at the same level as stenography and at the same speed.

Then, we have real-time, which we are doing now. The user sees the text. It’s in real time. The providers need to have very high skills and be able to capture the speech in real time with a one- to three-second delay. The skills and tools used are very different.

We have verbatim. Usually, when we are talking about live  transcription, clean verbatim is used. Full verbatim is less  common.  

We also have CART and meaning-for-meaning. Kate will talk more  about meaning-for-meaning in a moment.  

Unlike recorded media, where you can just give a text document  that people can read with or without the audio, with live  transcription, a few players are used to share the text in real  time. We are using StreamText today. Paola provided the links  for that.  

Almost all transcribers who provide real-time services use  StreamText or 1CapApp. These are broadcast almost instantly. I  believe StreamText is supported by screen readers. I’m not sure  about 1CapApp.  

Zoom has an option. It’s slightly delayed but not because of  what the providers are doing. The captions tend to appear in  real time.  

Methods are similar. Many people use ASR. It’s better than  nothing, but often, there are inaccuracies that are embarrassing  or give the very wrong message.  

Stenography is the classic.  

Voice writing is becoming more popular. One of the interesting  possibilities of voice writing is the potential for live  transcription in other languages, which is currently very  difficult. There are very few people who do it in French. More  do it in Spanish. Beyond that, it’s difficult. 

TypeWell and C-Print do meaning-for-meaning.  

Now, Kate will talk about TypeWell. We’ve worked together for  almost 10 years. I started out as a TypeWell provider, and  that’s how I got into this industry.  

Kate became the executive director a few years after I got  involved. She’s been great at keeping the company focused and  delivering excellent training to the providers and ensuring that  TypeWell remains high quality.  

It’s different from verbatim that people are used to. Kate will  talk about this. Thank you, Kate, for being here and joining me  and for all the great work you do at TypeWell. I’m excited to  share more about TypeWell with this audience.  

Kate: Thank you. I want to acknowledge that I’m from Tucson, the ancestral land of several Native tribes. I’ve pasted those names in the chat.

I’m a white cis woman. I’m wearing glasses and a blue shirt.  

TypeWell is provided by a human transcriber, but we train our  transcribers to transcribe spoken English into clear, concise  English using fewer words.  

Meaning-for-meaning is a term used to distinguish it from word-for-word. It’s hard to describe until you see it.

Transcribers intentionally edit in real time. Usually, the sentences are relatively short. The goal is understanding the speaker’s intended message.

Spoken English is messy with false starts, fillers, etc. There are many fragments, run-on sentences, etc. That’s how we speak. As hearing people, we adjust to that, but that’s not how we read.

TypeWell was designed with DHH students in mind, especially high  school and college students in the classroom reading a  transcript in real time.  

They are in classes all day. Their eyes get fatigued. They have  to access many things throughout the day.  

Their reading levels may be lower than that of their peers.  Also, account for everything else they are taking in and their  desire to participate in class.  

Our transcribers make intentional use of white space and make  sure it’s not a huge block of text. This is also for topic  organization. We expect students will scroll back or review  something.  

Students will rest like all other students and will need to  scroll back. They wouldn’t have that access if the text  disappears.  

TypeWell transcribers also provide a copy of the transcript for  students to use. That’s a secondary aspect.  

You can imagine that a TypeWell transcriber is making thousands  of decisions about which words to include, exclude, and add such  as words that were implied. How do you rearrange the words,  insert white space, include nonverbal cues, etc.?  

TypeWell providers are human and have limits. Making these  decisions in real time can be exhausting. The transcriber must  condense a message if a speaker is speaking 400 WPM.  

We are making trade-offs. I think humans can do meaning-for-meaning, real-time editing much better than computers, but there can be mistakes. You’ll see a variety of skill levels, a range of experience, and a range of quality, like with other services.

There are factors at play such as speed of audio, background  noise, crosstalk, etc.  

TypeWell was originally developed to be used by providers who  could be trained quickly and be able to use their laptop and  QWERTY keyboard. The original developers wanted to make it  accessible to those transcribers by keeping their training  relatively short and straightforward and accessible to the  schools by keeping the software licenses relatively affordable.  

What we are doing is very useful, not only for DHH students.  There is no monolith or one-size-fits-all solution. Sometimes,  you are accommodating an individual. Other times, you are using  universal design.  

When you are choosing between these different captioning or transcription methods, you need to consider the audience and accommodation requirements and affordability. We are accounting for people’s hearing ability, reading level, sight, processing speed, attention, cognitive abilities, and all of that.

That’s my intro to TypeWell and meaning-for-meaning and where it  evolved out of. I’m excited to think and learn about other  places where it might be useful and applicable.  

Dan: Thank you, Kate. One thing that I think is very special about TypeWell is the math mode: not just transcribing what is said, but being able to do advanced math, physics, and equations. I’m trained as a musician and can count to four! [Joke/joking.]

That seems to be an important component of TypeWell in the educational space.

Kate: Yes. That relates to the question in the chat about the  backgrounds that TypeWell providers come from. Some colleges  train students to be providers for their peers in college in  addition to taking their own classes.  

To go through our training, you just need to be a fast typist  and have strong English skills. We are not keeping anyone out of  the field necessarily.  

If you have a math/science background, we’ll encourage you to  use our math mode by which you can embed equations and  scientific notations. That takes skill and is tricky, but we’ve  developed hands-on training.  

If you are going into a technical class, it helps to have  background knowledge in that area. If you are a coordinator,  check if they have a TypeWell qualification and background in  that area. This would apply to subjects such as legal, medical,  etc. Often, medical transcriptionists transition into TypeWell.  That background can be valuable for these niche settings.  

Dan: After you’ve done the TypeWell training, you are suited to transcribe HS and undergrad courses. It’s helpful if you have a specialty, but most transcribers can go into general places and do fine. You’d want to inform someone of any special content.

Kate: Yes. If you are doing something like land acknowledgments  or will be reading something verbatim, it’s helpful to provide  that information. You can’t expect a human provider to know all  of those words and pronunciations or mispronunciations. There is  a lot to consider just as with post-production work.  

Dan: Like Amber brought up with abbreviations, the TypeWell transcript is geared to DHH students and not necessarily for a screen reader to go into, but that’s something that can be done if it will open up more possibilities. Specific writers make specific choices re: abbreviations and consider how useful it will be to the reader.

Sometimes, we are talking about specific settings and also how  this can be generally used and have that understanding be shared  so that someone looking at a meaning-for-meaning transcript will  understand that it’s different from a clean or full verbatim  transcript and understand its value and what they’ll be getting  from it.  

Kate: Do I have time to share a quick anecdote?  

Dan: Yes.  

Kate: I was transcribing an MBA program for a Deaf student who  was from Korea who had hearing students in his classes. I  noticed that his ELL cohort was looking at the screen. They  benefited from this service as well, because they could look at  the screen if they missed something. They could verify new terms  and concepts by looking at my screen.  

I also know that it was helpful for them to see how I was transcribing spoken into written English. They could see concise, grammatically-correct English sentences. Their eyes were glued to the screen. Not only could they verify vocabulary, but they could use it to improve their English skills.

You are hearing it verbatim but seeing it in a slightly  different grammatical form. That can be useful.  

I’ve heard that neurodiverse students use it to anchor their  attention in the class.  

Dan: There are other places that TypeWell can be used beyond the  educational setting. There are many uses and potential for it.  I’m glad we could talk about it today. 

We are getting near the end of the time. I want to make sure we  have time for questions.  

To wrap up, as we saw, there are many places and contexts and vocabulary around transcription. Today’s goal wasn’t to standardize. It was to discuss terms and help you make decisions for making content accessible and knowing what questions to ask for different audiences. I wanted to make sure we have the vocabulary and tools to do this as best as we can.

Thank you, Kate, for joining us today. I’m so excited to talk  about meaning-for-meaning, especially with this group and  audience. It’s been great to have been part of this for the last  year or so.  

Any questions?  

Amber: I don’t see any, but I wrote a few down. I like meetups  because I get my personal consultation too.  

[Laughing]  

Amber: I noticed the use of italics in the transcripts, which is cool. Do you have a style guide? How would a captioner know to use bold or italics?

Kate: We don’t have a style guide, but I’ve noticed that  transcribers have been using italics for terms and bold for  things like homework. They are trying to draw attention to key  points and transitions. I don’t know where the italics were used  here. Maybe they were for emphasis.  

There is not an official style guide. I know it’s good practice for users to ask their transcribers. That was invented by the transcribers and not taught by us. It’s become a pretty widespread practice.

Amber: I was thinking about that and thinking, “They don’t have  to take notes!” [Laughing] “That would be so handy to have  someone sit there and take all of my notes for me.” You  probably have to go back and highlight what you want.  

Dan: Some transcribers make good use of style features. If you  speak with your student, you’ll know what they are looking for.  

I want to shoutout Diana, my business partner, for doing the  TypeWell today. She’s worked in almost every kind of setting and  is top-notch.  

It’s interesting what is possible with the output. That’s  another reason I think TypeWell is so great. It allows for that  level of output and customization.  

Amber: You talked about providing information in advance to  captioners. Are there other things that speakers can do?  

Kate: For sure! I was speaking quickly because I was sensitive  about time. I was also reading from notes. When you are reading  from notes or are nervous, you tend to speak quickly.  

I tend to remind panelists that if they modulate their rate of speech and use pauses, they are making it easier for hearing people to process what they are saying. You are also relaxing yourself a little bit. Packing in words can be hard for both the readers and the listeners.

Dan: I agree. I was very nervous today. I’m sure my rate was a  little bit faster.  

The other thing, aside from pacing and speaking clearly, is the technology and making sure you have a good mic and internet connection. You spend time and energy setting everything up. If you have a bad connection, the provider can’t hear.

The set-up needs to be good if you want the output to be good, so make sure the tech is in place and working properly. That seems simple enough but can be overlooked in educational settings, where bandwidth may be at a premium.

Kate: I appreciate getting access to the slides in advance even  if they are not totally finished. This gives a sense of the flow  and allows you to put tricky words into your TypeWell  dictionary.  

In classes, transcribers often have access to Blackboard so they  can get the slides.  

Dan: It’s important that pre-recorded videos be captioned or  transcribed beforehand. Often, videos are faster-paced. If that  can be done in advance, that increases the quality of the  experience for everyone.  

Kate: Yes. Having closed captions on the video screen is equal  access. It’s not equal access if they have to look at the video  and a computer screen.  

Amber: This has been a great presentation. I appreciate your  coming. I didn’t know about meaning-for-meaning. I took  screenshots and will include them in the recording.  

Where is the best place for people to reach out? 

Dan: Email and LinkedIn. [In chat.]  

Kate: I just put our email and website in chat. I’m also on  LinkedIn. We are also on Facebook. 

Amber: Thank you. We’ll sit quietly before I close the meeting.  If I close it too soon, we lose the end of the transcript. We’ll  sit quietly for a second. Then, I’ll actually close. Thank you.  

[End of meetup.] 


Filed Under: Assistive Technology, WordPress Accessibility Meetup


