Doki Timing Guide

Note: There are many different ways of timing. This is a step-by-step guide explaining how I time. In my opinion, my method is the most efficient one.

Contents:

  1. What is timing?
  2. Shortcuts
  3. Pass file generation
  4. Loading the subs, video and audio in Aegisub
  5. Timing
  6. Scene bleeds
  7. TPP
  8. Random but important Misc advice

1. What is timing?

Many times I have asked a new recruit, “What is timing?” And their response: “Matching the subtitles to the audio.” While this is not an incorrect response, if that were the only thing you did, you would fail as a timer.
Timing is about matching the subtitles to the audio, but at the same time, making sure that the viewer can read it properly. If the viewer has to pause and rewind to read the subtitles, then the timer has failed.
Timing is easy to learn, in my opinion. It is a highly mechanical and repetitive task. You either get it right or wrong. There is no grey area, unlike other parts of fansubbing, like editing.
Before we continue, we have to define some terms that I will be using throughout this tutorial.

  • Lead in – The time from the start of the subtitle to the start of the audio.
  • Lead out – The time from the end of the audio to the end of the subtitle.
  • Keyframe – A keyframe denotes a scene change, most of the time.
  • Adjacent lines – 2 subtitle lines which are directly next to each other.
  • ms – milliseconds (1/1000 of a second).

Diagram illustrating what I just said.

In Doki, I have set down certain numbers that all timers must use, for consistency reasons.

  • Lead ins of ~125 ms
  • Lead outs of 500ms
  • If there is a keyframe/scene change within 500ms of the end of a line in either direction, change the end of the line to the keyframe.
  • If there is a keyframe/scene change within 250ms before the start of a line, start the line at the keyframe.
  • If the start of the adjacent line is within 500ms of the end of the previous line, end the previous line at the start of the next line.

This means that:

  • The maximum theoretical lead-in is ~375ms (250ms + 125ms).
  • The maximum theoretical lead-out is 1000ms (500ms + 500ms).
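To make the numbers concrete, here is a rough sketch in Python of how the rules above combine for a single line. It is only an illustration, not a tool used at Doki: the function name `time_line`, the millisecond inputs, and the choice to snap to the nearest keyframe when several qualify are assumptions made for the example. It also ignores the case where a keyframe falls inside the lead-in itself, which the rules above do not cover.

```python
LEAD_IN_MS = 125      # standard lead in
LEAD_OUT_MS = 500     # standard lead out
START_SNAP_MS = 250   # keyframe this close before the start: start on it
END_SNAP_MS = 500     # keyframe this close to the end (either side): end on it
JOIN_MS = 500         # next line this close after the end: close the gap


def time_line(audio_start, audio_end, keyframes, next_audio_start=None):
    """Return (start, end) in ms for one line. All inputs are in ms."""
    start = audio_start - LEAD_IN_MS
    before = [k for k in keyframes if 0 < start - k <= START_SNAP_MS]
    if before:
        start = max(before)  # latest keyframe before the start

    end = audio_end + LEAD_OUT_MS
    near = [k for k in keyframes if abs(k - end) <= END_SNAP_MS]
    if near:
        # Keyframes override adjacent lines, so they are checked first.
        end = min(near, key=lambda k: abs(k - end))
    elif next_audio_start is not None:
        next_start = next_audio_start - LEAD_IN_MS
        if next_start - end <= JOIN_MS:
            end = next_start  # end where the next line starts
    return start, end


# Worst case for both rules: 375 ms of lead in and 1000 ms of lead out.
print(time_line(1000, 2000, [625, 3000]))
```

With audio from 1000 ms to 2000 ms and keyframes at 625 ms and 3000 ms, the line runs from 625 to 3000, which is exactly the theoretical maximum lead-in and lead-out listed above.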

Here are my golden rules of timing.

  • MAKE SURE YOU ALWAYS TIME TO A .MKV VIDEO, NEVER A .MP4. I cannot stress this point enough. The frames are incorrect for .mp4 video, so when you mux, all your lines are 1 frame behind. I learnt this the hard way. This is the reason why every episode of Ladies versus Butlers has a v2. X3OY was not impressed with me. He had to make 12+ patches lol.
  • MAKE SURE YOU GENERATE A PASS FILE BEFORE YOU TIME. Keyframes are inaccurate. Keyframes are supposed to mark a scene change. However, some scene changes are not marked with a keyframe, and sometimes there are keyframes where there is no scene change. The pass file has 99% “correct” keyframes. You still have to watch the video with your timed subs to catch that 1% error.
  • “WHEN IN DOUBT, MORE IS BETTER THAN LESS.”
    There might be a lot of numbers to memorize. However, if you follow that rule, you’ll be fine. For example, if you are pondering whether to give a line more lead-out or not, just give it.
  • USE KEYBOARD SHORTCUTS. They save you time, and time is money.

Although I said that the maximum theoretical lead-in is ~375ms, following the “when in doubt, more is better than less” rule, you could have anything up to 500ms of lead-in and it would be fine. The same applies to lead-out. You could go over the theoretical maximum of 1000ms. Anything up to 1200ms is fine.

2. Shortcuts

Learn these.

  • Left mouse click = Start line.
  • Right mouse click = End line.
  • Q = Play 500ms of the audio before the start of the line.
  • W = Play 500ms of the audio after the end of the line.
  • D = Play the last 500 ms of the line.
  • S = Play the current line from start to end.
  • G = Commit all changes.
  • C = Add lead in. (125 ms)
  • V = Add lead out. (500 ms)
  • Shift + Right Mouse Click = End the line and snap to nearby keyframe.

Aegisub’s default lead in and lead out values are different, so you have to set them to 125ms/500ms yourself. Click on View, Options, Audio.

You can actually change the shortcuts in Hotkeys. I have changed my “Q” into an “A” because that is more comfortable.
You should adopt a “home position” with your left hand before you start timing. (Your right hand should be holding the mouse.)

  • Fifth finger should be on the Shift key. This is for Shift+right clicking to snap the end of a line to a keyframe.
  • Fourth finger should be on the “A” key. This is for playing 500ms of the audio before the start of the line.
  • Middle finger should be on the “S” key. This is for playing the current line from start to end.
  • Middle finger can also press the “W” key when necessary. This plays 500ms of the audio after the end of the line.
  • Index finger should be on the “G” key. This is for committing your changes and going onto the next line.
  • Index finger can also press the “V” key when necessary. This is for adding 500ms lead out.
  • Your thumb can be on the spacebar. This has the same function as “S”; it plays your current selection.

3. Pass file generation

This is related to my Golden Rule #2: MAKE SURE YOU GENERATE A PASS FILE BEFORE YOU TIME. Keyframes are inaccurate. Keyframes are supposed to mark a scene change. However, some scene changes are not marked with a keyframe, and sometimes there are keyframes where there is no scene change. The pass file has 99% “correct” keyframes. You still have to watch the video with your timed subs to catch that 1% error.
Before you can create a pass file, you have to install the following programs.

  • Avisynth – This allows .mkv files to be loaded into VirtualDub.
  • VirtualDub – This baby generates your perfect keyframes.
  • xvid_encraw – This generates the pass files (it’s what VirtualDub uses)

Update: Here’s a new, more convenient solution where you don’t need VirtualDub at all. You need SCXvid-standalone and FFmpeg.

Setup: Press Win+R, type “shell:sendto” and save this script there. Now, either edit the script and add the path to your scxvid.exe and ffmpeg.exe (line 11) or add those to your path.

Usage: Right click your .mkv file, go to “Send to” and select “Create Keyframes.bat”, wait till it finishes >_>
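If you would rather see what such a script does, here is a rough Python sketch of the same idea: FFmpeg decodes the video to a small raw stream and pipes it into SCXvid, which writes the keyframes file. This is not the linked .bat file. The FFmpeg options shown are the ones commonly quoted for SCXvid-standalone and should be treated as an assumption, as should the executable names, the `_keyframes.txt` suffix, and the function names.

```python
import subprocess
from pathlib import Path


def keyframe_commands(video, ffmpeg="ffmpeg", scxvid="scxvid"):
    """Build the two commands; nothing is executed here."""
    video = Path(video)
    out = video.with_name(video.stem + "_keyframes.txt")
    # Assumed pipeline: downscaled raw video in yuv4mpeg format on stdout.
    ffmpeg_cmd = [ffmpeg, "-i", str(video), "-f", "yuv4mpegpipe",
                  "-vf", "scale=640:360", "-pix_fmt", "yuv420p",
                  "-vsync", "drop", "-"]
    # Assumed usage: SCXvid reads the stream on stdin and writes to `out`.
    scxvid_cmd = [scxvid, str(out)]
    return ffmpeg_cmd, scxvid_cmd, out


def create_keyframes(video, **tools):
    """Run ffmpeg piped into scxvid and return the keyframes file path."""
    ffmpeg_cmd, scxvid_cmd, out = keyframe_commands(video, **tools)
    decoder = subprocess.Popen(ffmpeg_cmd, stdout=subprocess.PIPE)
    subprocess.run(scxvid_cmd, stdin=decoder.stdout, check=True)
    decoder.stdout.close()
    decoder.wait()
    return out
```

Calling `create_keyframes(r"E:\workraw.mkv")` would then leave a keyframes file next to the video, provided both programs are on your path. Check the options against the README of the SCXvid build you downloaded before relying on them.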

If you want to use VirtualDub instead, follow the steps below.

You must follow the steps below, otherwise I can’t guarantee that it will work.

Step 1: Get ready

Have your workraw ready. I will assume it is in E:\workraw.mkv.

Step 2: Making the .avs file

Open Notepad or another basic text editor. I use Notepad++.
Write this in:
DirectShowSource("E:\workraw.mkv")
Save this as E:\workraw.avs

Step 3: Making the pass file in Vdub

Open VirtualDub.exe

File -> Open video file… Select your workraw.avs (E:\workraw.avs). If you have done this correctly, your workraw should be loaded into Vdub.

Video -> Compression… -> Select the XviD codec (Last item on the list) -> Configure
Note: If you don’t see the XviD option, you will once you install K-Lite Codec Pack.

Encoding type: Twopass – 1st pass -> More

Check Full quality first pass -> OK -> OK -> OK

File -> Run video analysis pass

A progress box will appear. When it is complete, you should have E:\video.pass, which is the pass file with all the correct keyframes in it.

4. Loading the subs, video and audio in Aegisub

Before we start timing, we have to load everything into Aegisub. You can download Aegisub here. Make sure you’re using the latest version.

Step 1: Get ready

You need to have your workraw, your untimed subs, and your pass file ready. I will assume everything is in E:\ once again.

Step 2: Open subtitles in Aegisub

Your untimed subs probably came in a .txt file. Aegisub can read .txt files. If your subs came in a .ass file, that’s fine too.

Step 2a: Resample the script’s resolution.

How to resample the resolution

Here at Doki we work on all our scripts at 1280×720, so the first step when you start a new script is to change the resolution. To do this, select the button with the two overlapping squares from the top toolbar as demonstrated in the picture above, enter 1280×720 and press OK to save the change. This change doesn’t affect your timing, but it’s good to get it done at the start, before the script reaches a stage where the resolution matters.

Step 3: Open video in Aegisub

Video -> Open video… -> Select your workraw (E:\workraw.mkv)

Your video will load.

Step 4: Open Audio in Aegisub

Your workraw should have audio muxed in it. You can also open separate audio streams, should the need arise.

Audio -> Open Audio from Video

Your audio will load.

Step 5: Open Keyframes in Aegisub

Video -> Open Keyframes… -> Choose your pass file which you made earlier.
Now your perfect keyframes are loaded.
You are now ready to time.

Troubleshooting: If you have trouble opening video in Aegisub, remuxing the file using mkvmerge.exe (Download mkvtoolnix) usually solves the problem.
Most types of audio will open. All MP3 audio will open with no problems. Some FLAC audio won’t open. Re-encode to mp3 if that is the case (See encoding tutorials)

5. Timing

Remember your shortcuts? You’ll be needing them. I do not add lead ins with shortcut “C”, but I do add lead outs with shortcut “V”.

In this tutorial I will be using:

  • Seikon No Qwaser S05 (Bluray) (Untimed).ass (Right click, save as)
  • workraw.mkv

It might be easier if you download these and follow what I say as I go along.

Starting the line
When you start the line, there will be 2 scenarios.

1. There isn’t a keyframe near the start of the audio.

If this is the case, just look at where the audio starts, and left click ~125ms before that. As you can see, I choose not to use the shortcut “C” here. I prefer to use my eyes to judge, so it does not have to be exactly 125ms.

The audio starts at about 00:03.49. Taking the lead-in into account, I started the line at 00:03.37 (by left clicking). Remember, I am only estimating 125ms, by using those helpful 250ms markers. 125ms is half of those.
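The arithmetic in that example is easy to check. Here is a small Python sketch, where the helper names `to_ms` and `lead_in_ms` are made up for illustration and the timestamps are written the way they appear in this guide (minutes:seconds.centiseconds).

```python
def to_ms(stamp):
    """Convert 'M:SS.cc' or 'H:MM:SS.cc' into whole milliseconds."""
    fields = stamp.split(":")
    secs, _, frac = fields[-1].partition(".")
    whole = 0
    for field in fields[:-1] + [secs]:
        whole = whole * 60 + int(field)  # hours -> minutes -> seconds
    # Pad or cut the fraction to 3 digits so '.49' means 490 ms.
    return whole * 1000 + int(frac.ljust(3, "0")[:3])


def lead_in_ms(line_start, audio_start):
    """How long the subtitle is on screen before the audio begins."""
    return to_ms(audio_start) - to_ms(line_start)


print(lead_in_ms("00:03.37", "00:03.49"))  # 120
```

The line in the example has 120ms of lead-in, which is close enough to the ~125ms target when you are judging by eye.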

2. There is a keyframe near the start of the audio.

The purple line is a keyframe, and it is within 250ms of the start of the line after you do your normal 125ms of lead in. In this case, you start your line AT the keyframe (0:35.42), like this.

Ending the line

When you end the line, there will be 3 scenarios.

1. There is nothing close to the end of the line. (Keyframes or Adjacent line)

If there is nothing (Keyframes or Adjacent line) then just right-click on where the audio ends, and hit “V” to add on your 500ms lead out.

Right click…

Hit “V”…

Then hit “G” to commit and go onto the next line. The next line will start off where you ended the previous line.

2. There is a Keyframe near the end of the line.

If you can see that there is a Keyframe within 500ms of the end of the line after you apply lead out, then you shift+right click near the keyframe to snap the end of the line to the keyframe.

Right click (0:09.76)…

Shift+Right click near the purple line (Keyframe) (0:10.82)…

Actually, there is 1020ms of lead out here, but remember what I said about “When in doubt, more is better than less”? I wasn’t sure, so I went for it and snapped to the keyframe, which resulted in 1020ms of lead out.

Then hit “G” to commit and go onto the next line. The next line will start off where you ended the previous line.

3. There is an Adjacent line near the end of the line.

If you can see that after applying 500ms lead out, the next line is within 500ms of the current line, then you right-click at ~125ms before the audio of the next line starts.

Right click (0:53.06)…

Right click at ~125ms before the audio of the next line starts (0:54.25)… (The audio of the next line starts at 0:54.36)

Then hit “G” to commit and go onto the next line. The next line will start off where you ended the previous line. Did you notice that the next line already has lead in (~125ms), because you anticipated this when you were ending the previous line?

Here are a few more points to bear in mind. Some may seem obvious, but I’ll mention them anyway.

  • Use “S” to listen to your selection. This is how you judge where the audio starts and ends.
  • A line must NEVER start in the middle of speech.
  • A line should not end in the middle of speech, except in rare circumstances, where the line is already quite long, and there happens to be a keyframe right before the speech ends. The last syllable should have already been spoken. In a case like this, you can end the line at the Keyframe, cutting off a tiny bit of speech.
  • Scene changes/Keyframes override all. If there is an adjacent line, and a keyframe, the keyframe takes priority, and you end the line at the keyframe.
  • Timing is highly tedious, like I said earlier. Your first attempt at timing an episode may take you 4+ hours. Don’t get discouraged, because you will improve the more you do it. The average timer takes about 1 hour to time a standard 25 minute episode. The fastest timers will take 25 minutes (the length of the show, which makes sense). I take about 45 minutes to time a normal episode, down from 2 hours when I first started.
  • If a line is too long, break it up. A subtitle must never take up more than 2 lines on the screen. Three-liners are fail.

After you have finished timing, you should watch the whole thing once, in Aegisub, or your video player. Even though you have generated a pass file, which has the correct keyframes, it is only 99% accurate.
You have to watch the whole thing once just in case there is 1% chance that a keyframe was incorrectly placed. You are looking for scene bleeds. More on that in the next section.

6. Scene bleeds

Scene bleeds are what will happen if a timer does not snap to keyframes correctly.

If a line ends, followed very quickly by a keyframe/scene change (within a few frames), you’ll have the subs disappearing, followed by a change in the scene, which produces a “flashing subs” effect. Similarly, if the line ends, and it goes over the keyframe by a few frames, you’ll have this “flashing subs” effect again.

This “flashing subs” effect is known as a “scene bleed” and is very uncomfortable on the eye. While a few scene bleeds might be inevitable, these will be caught by the QC. But if there are frequent scene bleeds throughout an episode, then the timer has failed. If you follow my instructions above, then you won’t have any scene bleeds. Promise.
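Watching the episode is still the real check, but a quick script can point you at suspicious spots first. Here is a rough Python sketch that flags line ends which land close to a keyframe without being exactly on it. The function name, the (start, end) tuples in milliseconds, and the 125ms window standing in for “a few frames” are all assumptions for the example.

```python
def find_scene_bleeds(lines, keyframes, window_ms=125):
    """Return (line index, offset in ms) for every suspicious line end.

    A negative offset means the line ends just before the keyframe
    (subs vanish, then the scene changes). A positive offset means the
    line runs past the keyframe into the next scene.
    """
    bleeds = []
    for index, (_start, end) in enumerate(lines):
        for keyframe in keyframes:
            offset = end - keyframe
            # offset == 0 means the line is snapped, which is what we want.
            if 0 < abs(offset) <= window_ms:
                bleeds.append((index, offset))
    return bleeds


lines = [(0, 1000), (2000, 3040), (4000, 5000)]
keyframes = [1000, 3000, 5080]
print(find_scene_bleeds(lines, keyframes))  # [(1, 40), (2, -80)]
```

The first line ends exactly on a keyframe, so it is fine. The second runs 40ms past a scene change and the third stops 80ms short of one, so both are worth a look.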

7. TPP

TPP stands for Timing Post Processor and is very useful in the hands of an experienced timer.

This is a tool which will snap your lines to keyframes for you, add lead in and lead out, and join up adjacent lines.

While this may sound useful, in reality, if you use it, you will need to do another pass to check whether it has screwed up or not. This is why I do not suggest using TPP when you time.

8. Random but important Misc advice

GAPS ARE BAD
CONTINUITY IS GOOD

  • If you ever see a gap under 250ms, it should not be there.
  • If you ever see a gap between 250-500ms, it should only be there if the gap starts with a scene change. Otherwise, it shouldn’t be there.
  • If you understood this timing tutorial and did exactly as I stated, you will never make mistakes related to continuity/gaps.
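The two gap rules above are mechanical enough to check with a script. Here is a rough Python sketch, assuming the lines are (start, end) pairs in milliseconds sorted by start time and that a line “ends on a scene change” when its end time equals a keyframe time exactly. The function name is made up for the example.

```python
def find_bad_gaps(lines, keyframes):
    """Return (index of the earlier line, gap in ms, reason) per bad gap."""
    keyframe_set = set(keyframes)
    bad = []
    for index, (current, following) in enumerate(zip(lines, lines[1:])):
        end = current[1]
        gap = following[0] - end
        if gap <= 0:
            continue  # continuous (or overlapping) lines have no gap
        if gap < 250:
            bad.append((index, gap, "gap under 250 ms"))
        elif gap <= 500 and end not in keyframe_set:
            bad.append((index, gap, "250-500 ms gap without a scene change"))
    return bad


lines = [(0, 1000), (1100, 2000), (2300, 3000), (3400, 4000), (5000, 6000)]
for index, gap, reason in find_bad_gaps(lines, keyframes=[2000]):
    print(index, gap, reason)
```

Here the 100ms gap after the first line and the 400ms gap after the third line are flagged. The 300ms gap after the second line is allowed because that line ends on the keyframe at 2000ms, and the 1000ms gap is long enough to be left alone.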

  • A line must stay on screen for at least 500ms. The viewer needs time to read the line, and if it disappears too fast, that’s bad.
  • You can add extra lead in/lead out to make up the 500ms.
  • Or you can join up the current short line with the next line.
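As a rough sketch of the first option, this Python snippet pads a short line up to 500ms. It adds lead out first, stops at the start of the next line if there is one, and takes whatever is still missing from extra lead in. The function name and the order of preference are assumptions for the example, and it does not check whether the extra lead in would run into the previous line.

```python
MIN_DURATION_MS = 500


def pad_short_line(start, end, next_start=None):
    """Stretch a line to at least 500 ms. All times are in ms."""
    shortfall = MIN_DURATION_MS - (end - start)
    if shortfall <= 0:
        return start, end  # already long enough
    if next_start is None:
        room = shortfall
    else:
        # Never extend past the start of the next line.
        room = max(0, min(shortfall, next_start - end))
    end += room
    start = max(0, start - (shortfall - room))
    return start, end


print(pad_short_line(1000, 1300))                   # (1000, 1500)
print(pad_short_line(1000, 1300, next_start=1400))  # (900, 1400)
```

In the second call there is only 100ms of room before the next line, so the line ends where the next one starts and the other 100ms comes from extra lead in.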

 

  • Use the left and right arrow keys to step through the video frame by frame.
  • This is especially helpful to check for real/fake keyframes.
  • A scene change is defined as a “sudden change in scene”.
  • You will find that panning scenes, fading scenes will generate a lot of fake keyframes.
  • Typically, the first keyframe in a row of keyframes will be the real one. Not always though, so do check.
  • If you see many keyframes close together, I can guarantee that a lot of them are fake.
  • An average timer will typically take no more than 3-4 hours to time a standard 24 minute script.
  • A good timer does this in 1.5-2 hours.
  • Crazy timers will do this in 30 minutes, but those are rare.
  • This does not include the second pass, when you would rewatch the episode to check for errors, so add another 30 minutes for that.
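Going back to the tip about fake keyframes: since the first keyframe in a row is typically the real one, a script can at least thin out the obvious clusters before you check them by eye. This is a rough Python sketch, assuming the keyframes are frame numbers and that keyframes no more than 3 frames apart belong to the same row. Both the function name and that threshold are made up for the example.

```python
def first_of_each_run(keyframes, max_gap=3):
    """Keep only the first keyframe of every run of close keyframes."""
    kept = []
    previous = None
    for frame in sorted(keyframes):
        # A frame starts a new run if it is far from the one before it.
        if previous is None or frame - previous > max_gap:
            kept.append(frame)
        previous = frame
    return kept


print(first_of_each_run([0, 120, 121, 123, 124, 300, 302, 500]))
# [0, 120, 300, 500]
```

Treat the result as a list of candidates, not as the truth. The first keyframe is not always the real one, so you still need to step through those frames and check.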