MAH Web Page Content Export
Paste URLs, one per line:
Content replacements:
# Syntax:
# REPLACE 'search pattern' WITH 'replace pattern'
#
# Case insensitive global replacement.
# eval in replacement string is not supported ( \1, $1, \n ).
# Use {PLUS} instead of + sign.

# Change Tridion image folder to other folder:
REPLACE '
]*>[\s\n]*<\/div>' WITH ''
REPLACE '
]*>[\s\n]*<\/div>' WITH ''

# Remove inline script and style:
REPLACE '<(no)?(script|style)(.|\n){PLUS}?<\/(no)?(script|style)>' WITH ''

# Remove Microsoft html attributes (MS Word)
# Try http://www.msd-tiergesundheit.de/products/bravecto_loesung_hunde/bravecto_loesung_hunde.aspx
REPLACE '\s{PLUS}style="[^"]*mso\-[^"]{PLUS}"' WITH ''
REPLACE '\s{PLUS}class="MsoNormal[^"]*"' WITH ''

# more examples...

# Remove inline style attributes:
# REPLACE '\s*style="[^"]*"' WITH ''

# Remove span elements:
# REPLACE '
]*>' WITH ''
# REPLACE '
' WITH ''
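A minimal Python sketch of how rules written in the syntax above might be parsed and applied. The export tool's own implementation is not shown on this page, so everything here is an assumption: the function names parse_rules and apply_rules are hypothetical, each rule is assumed to fit on one line, and "eval in replacement string is not supported" is read as "the replacement is inserted as literal text".

```python
import re

# One rule per line: REPLACE 'search pattern' WITH 'replace pattern'
RULE_RE = re.compile(r"^REPLACE\s+'(.*)'\s+WITH\s+'(.*)'\s*$")


def parse_rules(text):
    """Return a list of (compiled_pattern, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        match = RULE_RE.match(line)
        if match is None:
            continue  # not a well-formed rule; this sketch skips it
        search, replacement = match.groups()
        # The rule syntax writes {PLUS} where a regex would use +.
        search = search.replace("{PLUS}", "+")
        # Case insensitive, as stated in the syntax notes.
        rules.append((re.compile(search, re.IGNORECASE), replacement))
    return rules


def apply_rules(html, rules):
    """Apply every rule globally, in the order the rules were listed."""
    for pattern, replacement in rules:
        # A function replacement is inserted literally, so sequences
        # such as \1 or \n in the rule are not expanded.
        html = pattern.sub(lambda _m, text=replacement: text, html)
    return html


if __name__ == "__main__":
    example_rules = r"""
    # Remove inline script and style:
    REPLACE '<(no)?(script|style)(.|\n){PLUS}?<\/(no)?(script|style)>' WITH ''
    REPLACE '\s{PLUS}class="MsoNormal[^"]*"' WITH ''
    """
    page = '<p class="MsoNormal">Hi</p><SCRIPT>alert(1)</SCRIPT>'
    print(apply_rules(page, parse_rules(example_rules)))
```

Rules run in listing order, so a later rule sees the output of the earlier ones. Lines that do not match the rule shape are skipped silently in this sketch; a stricter version could report them instead.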
recursive (crawl entire website)
v. 2019-06-04
walter.soldierer@merck.com