
    <p><strong>This is the process of building a phone number scraping regex.</strong> The snippets below use Python's <code>re</code> module (<code>import re</code>) and raw strings, so that the backslashes reach the regex engine unchanged.</p> <p>First, we need to match an area code (3 digits), a trunk (3 digits), and an extension (4 digits):</p> <pre><code>reg = re.compile(r"\d{3}\d{3}\d{4}") </code></pre> <p>Now, we want to capture the matched phone number, so we add parentheses around the part that we're interested in capturing (all of it):</p> <pre><code>reg = re.compile(r"(\d{3}\d{3}\d{4})") </code></pre> <p>The area code, trunk, and extension might be separated by up to 3 characters that are not digits (for example, when spaces are used along with a hyphen or dot delimiter):</p> <pre><code>reg = re.compile(r"(\d{3}\D{0,3}\d{3}\D{0,3}\d{4})") </code></pre> <p>Now, the phone number might actually start with a <code>(</code> character (if the area code is enclosed in parentheses):</p> <pre><code>reg = re.compile(r"(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4})") </code></pre> <p>Now, that whole phone number is likely embedded in a bunch of other text, so we allow arbitrary text (matched non-greedily) on either side:</p> <pre><code>reg = re.compile(r".*?(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4}).*?") </code></pre> <p>Now, that other text might include newlines, which <code>.</code> does not match by default, so we pass <code>re.S</code>:</p> <pre><code>reg = re.compile(r".*?(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4}).*?", re.S) </code></pre> <p><strong>Enjoy!</strong></p> <p>I personally stop here, but if you really want to be sure that only spaces, hyphens, and dots are used as delimiters, then you could try the following (untested):</p> <pre><code>reg = re.compile(r".*?(\(?\d{3}\)? ?[\.-]? ?\d{3} ?[\.-]? ?\d{4}).*?", re.S) </code></pre>
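    To see how the patterns from the walkthrough behave, here is a minimal sketch that runs them against a short sample string. The names `FULL_RE`, `LOOSE_RE`, `STRICT_RE`, `find_phone_numbers`, and the sample text are illustrative assumptions, not part of the original answer. `FULL_RE` is the walkthrough's final pattern used with `match`; the other two drop the surrounding `.*?` because `findall` already scans the whole text.

```python
import re

# The walkthrough's final pattern, intended for use with match():
# the leading ".*?" skips ahead to the first phone number, and re.S
# lets "." cross newlines.
FULL_RE = re.compile(r".*?(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4}).*?", re.S)

# Same capture group without the ".*?" wrappers, for use with findall().
LOOSE_RE = re.compile(r"(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4})")

# Stricter variant: only spaces, hyphens, and dots may separate the parts.
STRICT_RE = re.compile(r"(\(?\d{3}\)? ?[.-]? ?\d{3} ?[.-]? ?\d{4})")


def find_phone_numbers(text, pattern=LOOSE_RE):
    """Return every substring of text that the given pattern captures."""
    return pattern.findall(text)


sample = "Call (555) 123-4567 today,\nor fax 555.987.6543 after 5pm."

first = FULL_RE.match(sample).group(1)
print(first)                                   # (555) 123-4567
print(find_phone_numbers(sample))              # ['(555) 123-4567', '555.987.6543']
print(find_phone_numbers("555/123/4567"))      # ['555/123/4567']
print(find_phone_numbers("555/123/4567", STRICT_RE))  # []
```

    Note that the loose pattern accepts any non-digit characters as separators (a slash in the last example), and neither pattern checks what surrounds the match, so a 10-digit run inside a longer number would also be captured.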
 
