Extracting Hardsubs

Introduction

For many aspiring re-releasers, one of the biggest obstacles is working with older (mainly pre-200x) anime whose original fansubs were 100% hardsubbed. Let's face it, any newer anime likely has softsubs easily available, and is going to be covered by half a dozen Blu-Ray encoding/raw-remuxing groups. Older hardsubbed series represent the greatest percentage of potential re-release projects, in addition to being the series most in need of updated versions with better audio/video quality. And if one isn't doing a full line-by-line retiming, edit, and translation check, obtaining scripts from hardsubbed videos is the most difficult and time-consuming part of the process.

Since most of the projects I work on involve series whose original fansubs were 100% hardsubbed, I thought I'd share a few tips and tricks on the optimal ways of obtaining these scripts, and compare some advantages and disadvantages of each.

These guides assume you can obtain Aegisub and learn its basic functions. Also, the Aegisub numerical values assume that you're putting these subs on a 480p DVD encode. Adjust them accordingly for higher-resolution sources.

Method Zero: Obtaining Scripts Directly
If the original fansub group is still active, contact them through standard channels (website, forum, e-mail, IRC), explain who you are and what you're doing, and ask nicely for the scripts for the show in question. If the group has disbanded, check the staff credits in the video (most older fansubs will have them), and track down the individuals via IRC, AnimeSuki, or MAL.

This can be the easiest method, but it also has a low probability of success. While styled softsubs [...] external .ssa releases (as seen with some R1 DVD-rips by AnimeH?), many groups were protective of their scripts and used hardsubbing as a way to prevent people from... well, doing exactly what we're doing here. So some fansubbers/groups may not be willing to share scripts. And even if they are, the scripts may have been lost to FTP or HDD crashes, and thus no longer exist anywhere outside of the hardsubbed videos.

Advantages:
* If original fansub scripts can be obtained, it effectively turns your hardsub -> softsub project into a softsub -> softsub project, eliminating the need to obtain scripts manually.
* Ensures that no errors will be introduced, unless you do further editing to the scripts.

Disadvantages:
* Reduced chances of success, since it relies on others having the scripts and being willing to share them.
* Since many groups used After Effects (AFX) for their typesetting and karaoke, the scripts you receive might only have the dialogue, forcing you to redo signs and songs yourself.
Method One: Optical Character Recognition (OCR):
OCR is a method that scans images, [...]

1) Before starting in with SubRip, open one of the TV-fansubs in Aegisub. Use Aegisub's Color Picker to determine the color hex values of the subtitles. Now when you're in SubRip, you can manually enter those values if the automatic color detection doesn't get the right colors.

As the guide suggests, try to [...] Enter to mark them as blank spaces, even if they appear in a space between two letters. Chances are, you'll later see words broken by spaces if SubRip sees a similar mark between two letters within a word. Automatic spell-checkers like Aegisub's have a much easier time correcting missing spaces than additional ones.

I recommend pausing SubRip and skipping over OP/ED songs, since karaoke subs are often placed differently from dialogue subs, use weird fonts, and have funky effects -- all of these will confuse SubRip and slow you down. Just go back later with Aegisub to manually retime and type those yourself.

4) While SubRip's automatic correction can [...] Margin Override. Run spell-check, and add all common names and series-specific [...] Replace can also be useful in correcting OCR errors not noticed by spellcheck. Go through all the subs line-by-line to check punctuation and consistency with the original subs, as SubRip often omits or adds periods or other punctuation marks. Of course, if the original subs had spelling errors or other typos, [...] paste the song lyrics in, and timeshift them if necessary.

Advantages:
* For subs with high OCR-ability, SubRip can run fairly quickly and automatically, once text and color settings are optimized and a character matrix is established.
* Only a small percentage of the hardsubbed text actually needs to be entered by you, thus allowing you to multitask. (I prefer to throw on an English-dubbed rewatch anime on my adjacent TV.)
* OCR is the best way to automatically get timings for shows where no timed scripts (in any language, see below) are available.

Disadvantages:
* Requires a LOT of work just to learn and set up.
* Does not work on all varieties of hardsubs.
* Inevitably introduces many errors and omissions, which must be manually corrected.
* Does not detect specialty text outside a given area, so even if it works perfectly, some manual retyping and retiming is still necessary.
* Depending on difficulty and PC speed, will likely take 2x an episode's run time to scan a single episode. May take longer.
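For the color step in 1) above, keep in mind that Aegisub scripts store colors in blue-green-red order (for example &H00FFFF& is yellow), which is the reverse of the usual red-green-blue hex notation. The following is a minimal Python sketch of that conversion. It assumes you copied the value in ASS notation and that your OCR tool wants plain red/green/blue numbers; the function names are made up for illustration and are not part of Aegisub or SubRip.

```python
def ass_color_to_rgb(ass_color: str) -> tuple[int, int, int]:
    """Convert an ASS color such as '&H00FFFF&' (blue-green-red order) to (R, G, B)."""
    # Drop the '&' wrappers and the 'H' prefix, leaving only hex digits.
    digits = ass_color.strip().strip("&").lstrip("Hh")
    # Keep the last six digits so an optional leading alpha byte is ignored.
    digits = digits[-6:].rjust(6, "0")
    blue, green, red = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


def rgb_to_hex(rgb: tuple[int, int, int]) -> str:
    """Format an (R, G, B) triple as a conventional #RRGGBB string."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)
```

For example, rgb_to_hex(ass_color_to_rgb("&H00FFFF&")) gives "#FFFF00", the typical DVD-yellow subtitle color.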
Method Two: Transcription
First off, some may be under the misconception that transcription involves watching hardsubbed videos, pausing every few seconds to type things into a text [...]
Chances are, the show in question probably doesn't have English softsubs available. (And if it does, just use those and edit/[de-]localize them to your preferences.) However, there's a good chance that subbers over in Russia have subbed the show and based their scripts on the most-respected English hardsubs. So head over to Subs.RU, search for the show you want, and grab the RU subtitle archive. Now, it's time to get things set up.

1) If there are any Readme or Comments text [...] Replace to replace \N with a blank space. Then, copy roughly half the lines, and paste them into Google Translate. Copy the resulting translation, and use Paste Over (Shift+Ctrl+V) in Aegisub to paste the autotranslated English text over the Russian. Repeat with the second half of the script. (Trying to do the whole script at once will run over Google-TL's length limits, and cut off numerous lines. \N's will also cause \N[some RU word] to appear in the translated text.)

-- Also, this can work with scripts in other non-English languages, mainly European ones that also translate to their native languages from the English fansubs. Subs in other Asian languages like Chinese can work. However, I don't recommend them as they'll be original JP->CN translations, and their timings likely won't mesh well with the English subs.

2) Once you have a script of semi-comprehensible English text, select all lines and do the following:

* In the Shift Times dialogue box (CTRL+I), shift all lines forward 0.30 seconds. This is to ensure that the subtitles in the script always appear slightly after the subtitles in the hardsubbed video.
* Change the vertical margin to ~400, so that the subs appear extremely high on the image, but not at the very top of the screen.
* With all lines selected, right-click and press Duplicate. Select all the duplicate lines, and change the vertical margin to a much lower value (under 100). Clear all text from the duplicate lines by deleting the text from one line while all duplicate lines are selected.

3) After all that, you should have a script with ~350 auto-translated, broken-English lines appearing near the top of the screen, and an equal number of blank lines with the same start/end times. Load the hardsubbed video into Aegisub. Go to the blank lines, and begin typing in the text from the hardsubs, making whatever changes and edits you deem necessary. Be wary of right-spelling, wrong-word typos such as you're/your, out/our, not/now, is/if, and the like, as these will be harder to spot later. Aegisub's spellcheck can handle most outright errors, so if you see you've made an obvious mistake (red wavy underlined word), just move on to the next line and [...]
[...] within the lines as the RU subbers timed them. This avoids accidental skipping, combining, or other mis-entries of content while transcribing. If the RU lines are *shorter* than the English lines, i.e. two or three short RU lines cover the same dialogue as one long ENG line, just enter everything in the ENG line into the [...]

Of course, you can feel free to join or split lines later on, if you feel the original English or RU times are too fragmented (lots of short lines with almost no time on screen to read them) or too drawn-out (lengthy lines that stay up too long and give away too much information too soon). I try to strike a middle ground and keep lines between 2-5 seconds in length, where possible.

5) Styling: I usually use Default as something simple to make the RU subs easily readable (often a DVD-yellow bold Arial), and set up another style like Main for the actual English dialogue. Chances are, you'll want to use a few more styles, such as an Alternate for overlapping dialogue, an Italic style for thoughts, and/or a top-aligned style for background-type lines. If you're more masochistic, you may also want to do different colors for different characters, or to differentiate between onscreen/offscreen dialogue.

You can use Aegisub's features to combine the styling process with transcription. I do this by adding special symbols to lines where I want a different style: one symbol for Alternate/Overlap, another for Thought, another for Top, and so forth. Make sure they're symbols that are unlikely to appear in the actual dialogue. Once the transcription is complete, use Aegisub's Subtitles > Select Lines... option to select all lines with a given symbol. Set the style for them, then use Find and Replace to change all those symbols to nothing. Repeat for all the symbols you've used.

For more complex styling, like color-coding by character, or the numerous styles used with Tonagura!, you'll likely have to go through line-by-line to set the styles. Setting up hotkeys like F11 and F12 for Global Prev Line and Global Next Line helps, as does Aegisub's Styling Assistant.

To do onscreen/offscreen styling correctly, you will need to view the beginning and end of each line in Aegisub or a real-time watch, and shift between them if the character moves on- or off-camera during the line. This can be done with \t tags for gradual shifts, or by duplicating the line and adjusting start and end times for gradual shifts. Refer to my Tonagura! scripts for examples of this.

The elements that constitute *good* styling could easily [...]
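The script setup in step 2) and the symbol-based styling in step 5) are done through Aegisub's dialogs, but the same operations can be sketched in Python against the raw Dialogue lines of an .ass file, which may help when batch-processing many episodes. This is only an illustrative sketch: the function names, the marker symbols, the style names in the mapping, and the blank-line margin of 90 are all assumptions rather than anything Aegisub provides. The 0.30-second shift and the ~400 margin come from the steps above.

```python
import re

TIME_RE = re.compile(r"(\d+):(\d\d):(\d\d)\.(\d\d)")
PREFIX = "Dialogue: "


def shift_time(stamp: str, centiseconds: int) -> str:
    """Shift an ASS timestamp (H:MM:SS.cc) by a number of centiseconds."""
    h, m, s, cs = (int(x) for x in TIME_RE.fullmatch(stamp).groups())
    total = max(0, ((h * 60 + m) * 60 + s) * 100 + cs + centiseconds)
    s_total, cs = divmod(total, 100)
    m_total, s = divmod(s_total, 60)
    h, m = divmod(m_total, 60)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def split_fields(line: str) -> list[str]:
    """Split a Dialogue line into its 10 fields; the text field may contain commas."""
    return line[len(PREFIX):].split(",", 9)


def prepare(lines, shift_cs=30, ref_margin=400, blank_margin=90):
    """Return (reference lines shifted and raised, blank twins to type into).

    blank_margin=90 is an assumed placeholder; pick whatever low value suits you.
    """
    reference, blanks = [], []
    for line in lines:
        f = split_fields(line)
        f[1], f[2] = shift_time(f[1], shift_cs), shift_time(f[2], shift_cs)
        f[7] = str(ref_margin)            # field 7 is the vertical margin
        reference.append(PREFIX + ",".join(f))
        twin = f[:]
        twin[7], twin[9] = str(blank_margin), ""   # same timing, empty text
        blanks.append(PREFIX + ",".join(twin))
    return reference, blanks


def apply_marker_styles(lines, marker_styles):
    """Set the style of any line containing a marker, then strip the marker."""
    styled = []
    for line in lines:
        f = split_fields(line)
        for marker, style in marker_styles.items():
            if marker in f[9]:
                f[3] = style              # field 3 is the style name
                f[9] = f[9].replace(marker, "")
        styled.append(PREFIX + ",".join(f))
    return styled
```

A call such as apply_marker_styles(lines, {"@": "Alternate", "#": "Thought", "^": "Top"}) would mirror the Select Lines plus Find and Replace routine; the three symbols here are hypothetical, so substitute whichever ones you actually typed.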
Alternate Method: Brute-Force Transcription
This method is a last resort, to be used when you cannot [...]
[...] Audio+Subs View. It's best to err on the side of caution, and favor shorter line times. It's easier to enter in a longer line and join subsequent lines into it, than it is to start and stop the video while transcribing to enter several short lines from the hardsubs into one longer-timed line. Of course, if you have a feel for how the original fansub group did their timing, you can adjust your pre-timing to [...]
Advantages:

* Can work on any hardsubs, regardless of their nature, or whether non-English scripts are available or not.
* Once the setup process is learned, it is as easy as typing on a word processor.
* Fast with proper setup, only limited by one's typing speed.
* With proper care, can avoid introducing errors and even [...]
Disadvantages:

* Labor-intensive, can be tedious.
* Cannot multitask, aside from maybe listening to music.
* Can introduce typos and other errors that can be hard to detect.
* Not everything may be covered in non-English scripts -- some manual addition and retiming may still be necessary.
* Timing in non-English scripts might not line up with English fansub timing, thus adding extra work.

Comparisons and Conclusions:
Obviously, getting TV-rip scripts directly from original fansub staffers or softsubbed .mkvs is the easiest route to go, doubly so if signs and/or karaoke were softsubbed. Between OCR and transcription, I have to choose transcription. While the semi-automatic nature of OCR is nice (and has burned through many backlogged rewatches), the errors it introduces are aggravating. Transcribing an episode takes more keystrokes, but no more time than an average OCR job -- no more than 40-50 minutes per ep, while OCR can run longer due to color issues and bad luck. Transcribing allows me to [...]
Source URL: http://redonesubs.blogspot.in/p/extracting-hardsubs.html?m=1