Efficient way to do a large number of search/replaces in Python?
<p>I'm fairly new to Python, and am writing a series of scripts to convert between some proprietary markup formats. I'm iterating line by line over files and then doing a large number (100-200) of substitutions that fall into 4 categories:</p>
<pre><code>line = line.replace("-", "&lt;EMDASH&gt;")   # replace single character with tag
line = line.replace("&lt;\\@&gt;", "@")       # replace tag with single character
line = line.replace("&lt;\\n&gt;", "")        # remove tag
line = line.replace("\xe1", "&amp;bull;")   # replace non-ASCII character with entity
</code></pre>
<p>The str.replace() method seems to be pretty efficient (fairly low in the numbers when I examine profiling output), but is there a better way to do this? I've seen the re.sub() method with a function as an argument, but am unsure whether it would be better. I guess it depends on what kind of optimizations Python does internally. I thought I would ask for some advice before creating a large dict that might not be very helpful!</p>
<p>Additionally, I do some parsing of tags (that look somewhat like HTML, but are not HTML). I identify tags like this:</p>
<pre><code>import re

m = re.findall(r'(&lt;[^&gt;]+&gt;)', line)
</code></pre>
<p>And then I do ~100 search/replaces (mostly removing matches) within the matched tags as well, e.g.:</p>
<pre><code>m = re.findall(r'(&lt;[^&gt;]+&gt;)', line)
for tag in m:
    tag_new = re.sub(r"\*t\([^\)]*\)", "", tag)
    tag_new = re.sub(r"\*p\([^\)]*\)", "", tag_new)
    # do many more searches...
    if tag != tag_new:
        line = line.replace(tag, tag_new, 1)  # potentially problematic
</code></pre>
<p>Any thoughts on efficiency here?</p>
<p>Thanks!</p>
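For reference, the "dict plus re.sub() with a function" idea mentioned above could look roughly like the sketch below. The table `REPLACEMENTS` and the helpers `replace_all` and `clean_tags` are hypothetical names chosen for illustration, the table holds only the four sample substitutions from the question, and whether this beats chained str.replace() calls would have to be confirmed by profiling on real data.

```python
import re

# Hypothetical substitution table; entries mirror the four sample
# categories from the question and are illustrative only.
REPLACEMENTS = {
    "-": "<EMDASH>",      # single character -> tag
    "<\\@>": "@",         # tag -> single character
    "<\\n>": "",          # remove tag
    "\xe1": "&bull;",     # non-ASCII character -> entity
}

# One alternation of all escaped keys. Longer keys come first so that a
# longer literal wins over a shorter one that shares its prefix.
PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(REPLACEMENTS, key=len, reverse=True))
)


def replace_all(line):
    """Apply every literal substitution in one pass over the line."""
    return PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], line)


# Tag handling: match each tag once and clean it inside the callback,
# instead of locating the tag again with line.replace().
TAG = re.compile(r"<[^>]+>")
TAG_JUNK = re.compile(r"\*[tp]\([^)]*\)")  # covers the *t(...) and *p(...) cases


def clean_tags(line):
    """Strip *t(...) and *p(...) runs, but only inside <...> tags."""
    return TAG.sub(lambda match: TAG_JUNK.sub("", match.group(0)), line)
```

Note that a single pass is not strictly equivalent to a chain of str.replace() calls: text produced by one substitution is not scanned again by the others, so this only fits when the substitutions are independent of each other. In `clean_tags`, each tag is rewritten at the position where it was matched, which avoids the case where line.replace(tag, tag_new, 1) hits an earlier identical occurrence.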