Get address out of a paragraph with regex
<p>Alright, this one's a bit of a pain. I'm doing some scraping with Python, trying to get an address out of a few lines of poorly tagged HTML. Here's a sample of the format:</p>
<pre><code>256-555-5555&lt;br/&gt; 1234 Fake Ave S&lt;br/&gt; Gotham (Lower Ward)&lt;br/&gt; </code></pre>
<p>I'd like to retrieve only <code>1234 Fake Ave S, Gotham</code>. Any ideas? I've been writing regexes all night and now my brain is mush...</p>
<p>Edit: More detail about the possible scenarios for how the data will arrive. Sometimes the first line will be there, sometimes not. All of the addresses I have seen have Ave, Way, or St in them, although I would prefer not to use that as a factor in the selection, as I am not certain they will always be that way. The second and third lines are always present.</p>
<p>What I had in mind was something that:</p>
<ol> <li>Selects everything on the second-to-last line (the second line if there are three lines, or the first line if there are only two because there is no phone number).</li> <li>Selects everything on the last line that isn't in parentheses.</li> <li>Combines the second-to-last line and the last line, adding a ", " between the two.</li> </ol>
<p>I'm using Scrapy to acquire the HTML code. The address is all in the same div; I want to use regex to further break the data up into appropriate sections. How to do that is what I'm unable to figure out.</p>
<p>Edit2:</p>
<p>As per Ofir's comment, I should mention that I have already written expressions to isolate the phone number and the parenthesized section.</p>
<p>Phone (or possible email or website): </p>
<pre><code>((1[-. ])?[0-9]{3}[-. ])?\(?([0-9]{3}[-. ][A?([0-9]{4})|([\w\.-]+@[\w\.-]+)|(www.+)|([\w\.-]*(?:com|net|org|us)) </code></pre>
<p>Parentheses: </p>
<pre><code>\((.*?)\) </code></pre>
<p>I'm not sure how to use those to construct an everything-but-these statement.</p>
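<p>For illustration, the three steps listed above could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: the function name <code>extract_address</code> is hypothetical, the fragment is assumed to separate lines with some form of <code>&lt;br&gt;</code> tag, and the last two non-empty lines are assumed to be the street and the city. It splits the fragment instead of building an everything-but-these pattern.</p>

```python
import re


def extract_address(fragment):
    """Return 'street, city' from a <br>-separated fragment, or None.

    Hypothetical helper: assumes the last two non-empty lines are the
    street line and the city line, as described in the question.
    """
    # Split on <br>, <br/>, or <br /> (case-insensitive) and trim each piece.
    parts = re.split(r"<br\s*/?>", fragment, flags=re.IGNORECASE)
    lines = [part.strip() for part in parts if part.strip()]

    # At least a street line and a city line are needed.
    if len(lines) < 2:
        return None

    # Step 1: everything on the second-to-last line.
    street = lines[-2]

    # Step 2: everything on the last line that is not in parentheses.
    city = re.sub(r"\s*\([^)]*\)", "", lines[-1]).strip()

    # Step 3: join the two with ", ".
    return f"{street}, {city}"


sample = "256-555-5555<br/> 1234 Fake Ave S<br/> Gotham (Lower Ward)<br/> "
print(extract_address(sample))  # 1234 Fake Ave S, Gotham
```

<p>Because the lines are counted from the end, the same call works whether or not the phone line is present, so the phone/email/website pattern is not needed for this particular extraction.</p>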