Regular expressions

Marek Pawelec

14 & 15 July 2022, 10:00-16:00 CET

Course content

Regular expressions (regexes for short) are special rules designed to find text and/or numbers meeting certain criteria and to do something useful with them. Examples of “something useful” include finding all occurrences of dates in a given format (e.g. 03/24/2013) and converting them to a different format (e.g. 24.03.2013), converting number formats, removing multiple spaces from a whole document in one go, or joining incorrectly split paragraphs.
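As a minimal sketch of the date and spacing examples mentioned above, the following Python snippet uses the standard `re` module. The function names and the sample sentence are illustrative assumptions and are not taken from the course material; the same patterns would need small syntax adjustments in other tools.

```python
import re

# Hypothetical helper: rewrite MM/DD/YYYY dates as DD.MM.YYYY.
DATE_US = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")

def convert_dates(text: str) -> str:
    # Group 1 is the month, group 2 the day, group 3 the year.
    return DATE_US.sub(r"\2.\1.\3", text)

def collapse_spaces(text: str) -> str:
    # Replace every run of two or more spaces with a single space.
    return re.sub(r" {2,}", " ", text)

sample = "Delivered  on 03/24/2013,   invoiced on 04/02/2013."
print(collapse_spaces(convert_dates(sample)))
# prints: Delivered on 24.03.2013, invoiced on 02.04.2013.
```

The capture groups let the replacement reorder the parts of the date without retyping them, which is the general idea behind most find-and-replace regexes.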

Regular expressions are extremely useful for processing text and numbers when preparing a text for translation, for example when cleaning up text extracted from PDF files, and in the translation process itself: regexes can be used in SDL Studio and memoQ to perform a variety of actions. While it is usually relatively simple to create a regex that matches a particular text, the trick is often to write one that matches only that text and nothing more.
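The difference between matching a text and matching only that text can be illustrated with a short Python sketch. The sample sentence and the variable names are assumptions made for illustration; lookaround syntax such as `(?<!...)` is supported by Python's `re` module but may not be available in every tool.

```python
import re

text = "Ref. 12/345/67890 was issued on 03/24/2013."

# Loose pattern: finds the date, but also the reference number.
loose = re.findall(r"\d+/\d+/\d+", text)

# Stricter pattern: exactly 2/2/4 digits, and the match must not be
# preceded or followed by another digit or slash.
strict = re.findall(r"(?<![\d/])\d{2}/\d{2}/\d{4}(?![\d/])", text)

print(loose)   # ['12/345/67890', '03/24/2013']
print(strict)  # ['03/24/2013']
```

Fixing the number of digits and guarding both ends of the match is what keeps the reference number out of the results.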

The problem most people have with regexes is that they look somewhat scary and mysterious. In reality, once you know the meaning of the symbols used and some basic rules, most regexes are quite simple and logical. The workshop is designed to introduce regular expressions to anyone without prior knowledge and to provide help and inspiration for people with basic to intermediate knowledge. We will progress from very basic to relatively complex rules, with an emphasis on translation-related applications, based on real-life problems and files. After the workshop you should be able to use regexes for efficient text processing and to create or modify rules that match complex text strings.

Topics covered

  • Text editing in MS Word.
  • Editing a range of text formats in Notepad++.
  • Converting text into tags.
  • Using auto-translatable elements.
  • Creating and editing segmentation rules.
  • Using regular expressions for filtering in CAT tools and Find and Replace.
  • Defining custom QA rules in CAT tools and QA software.
  • Defining filters for importing non-standard files into memoQ, SDL Studio, Wordfast and open-source tools.

Participants will receive handouts with regular expression vocabulary and a detailed description of all rules created and used during the training.

Who should attend?

  • Translators who want to learn about regular expressions, either from scratch or to extend their existing knowledge with additional regex options.

Event details

Date: 14 and 15 July 2022
Location: Singelkerk, Wittevrouwensingel 28, Utrecht
Time: From 10:00 to 16:00 CET
Early-bird price: €299.00 (excluding VAT). Valid until 31 May 2022.
Regular price: €349.00 (excluding VAT, includes light lunch, tea/coffee)
Student discount: 20%
Registration: Please register by filling in the form below or sending us an e-mail stating the workshop you would like to attend. We will send an invoice shortly after.
Max. number of attendees: The workshop is for a maximum of 12 attendees.
Remarks: If we cannot hold the workshop live, we will reschedule it to a date approximately three months later. As both Marek and Ellen will have had both vaccines, this is unlikely.