Differences
Next revision | Previous revision | ||
regex [2021/03/18 20:06] – created niklas | regex [2024/02/14 12:20] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | remove extra attributes to | + | ====== Useful regular expressions ====== |
+ | |||
+ | Regular expressions offer a relatively advanced way to semi-automate some processes. They are essentially a more powerful form of “find and replace”. Used properly, they can clean up code and format things like footnotes in record time, or convert square bracket< | ||
+ | |||
+ | The easiest way to understand the concept is to work through a simple example. A fairly typical export from InDesign marks headers with a paragraph tag, which isn't much use for HTML/ | ||
+ | |||
+ | Rather than | ||
+ | |||
+ | ''< | ||
+ | |||
+ | we would like | ||
+ | |||
+ | ''< | ||
+ | |||
+ | Using regex, for the initial sequence (the " | ||
+ | |||
+ | ''< | ||
+ | |||
+ | '' | ||
+ | |||
+ | To convert this to a proper <h1> tag we would replace this with: | ||
+ | |||
+ | ''< | ||
+ | |||
+ | (with '' | ||
+ | |||
+ | https:// | ||
+ | |||
+ | MV's plain text file {{ ::regex-mv.pdf |}} used for producing ebooks may be useful to get a better idea. | ||
+ | |||
+ | Below are some further examples. | ||
+ | |||
+ | |||
+ | =====Clean paragraph tags===== | ||
+ | Replace | ||
< | < | ||
- | remove span tags: | + | with |
+ | < | ||
+ | =====Remove span tags:===== | ||
+ | Replace | ||
< | < | ||
- | **To h4:** | + | with nothing |
+ | |||
+ | =====Remove empty paragraph tags:===== | ||
+ | Replace | ||
+ | < | ||
+ | with nothing | ||
+ | |||
+ | Replace | ||
+ | < | ||
+ | with nothing | ||
+ | |||
+ | =====Bold paragraphs to h4:===== | ||
+ | Replace | ||
< | < | ||
< | < | ||
< | < | ||
- | Replace with: | + | With: |
< | < | ||
< | < | ||
</ | </ | ||
- | **Remove empty p tags:** | + | =====Handling footnotes:===== |
- | < | + | |
- | **Handling footnotes:** | + | |
< | < | ||
<a (href="# | <a (href="# | ||
Line 34: | Line 81: | ||
\1< | \1< | ||
</ | </ | ||
- | **Finding and replacing double quotation:** | + | =====Finding and replacing double quotation marks:===== | ||
< | < | ||
(?< | (?< | ||
Line 44: | Line 91: | ||
< | < | ||
+ | |||
+ | =====Removing line breaks in code:===== | ||
+ | This is useful if the patterns above fail to match because of line breaks. | ||
+ | Replace | ||
+ | < | ||
+ | with | ||
+ | < | ||
+ | Replace | ||
+ | < | ||
+ | with | ||
+ | < |