Natural Language parser for parsing sports play-by-play data
<p>I'm trying to come up with a parser for football plays. I use the term "natural language" here very loosely, so please bear with me as I know little to nothing about this field.</p>
<p>Here are some examples of what I'm working with (Format: TIME|DOWN&amp;DIST|OFF_TEAM|DESCRIPTION):</p>
<pre><code>04:39|4th and 20@NYJ46|Dal|Mat McBriar punts for 32 yards to NYJ14. Jeremy Kerley - no return. FUMBLE, recovered by NYJ.|
04:31|1st and 10@NYJ16|NYJ|Shonn Greene rush up the middle for 5 yards to the NYJ21. Tackled by Keith Brooking.|
03:53|2nd and 5@NYJ21|NYJ|Mark Sanchez rush to the right for 3 yards to the NYJ24. Tackled by Anthony Spencer. FUMBLE, recovered by NYJ (Matthew Mulligan).|
03:20|1st and 10@NYJ33|NYJ|Shonn Greene rush to the left for 4 yards to the NYJ37. Tackled by Jason Hatcher.|
02:43|2nd and 6@NYJ37|NYJ|Mark Sanchez pass to the left to Shonn Greene for 7 yards to the NYJ44. Tackled by Mike Jenkins.|
02:02|1st and 10@NYJ44|NYJ|Shonn Greene rush to the right for 1 yard to the NYJ45. Tackled by Anthony Spencer.|
01:23|2nd and 9@NYJ45|NYJ|Mark Sanchez pass to the left to LaDainian Tomlinson for 5 yards to the 50. Tackled by Sean Lee.|
</code></pre>
<p>As of now, I've written a dumb parser that handles all the easy stuff (playID, quarter, time, down&amp;distance, offensive team), along with some scripts that fetch this data and sanitize it into the format seen above. A single line gets turned into a "Play" object to be stored in a database.</p>
<p>The tough part here (for me at least) is parsing the description of the play. Here is some information that I would like to extract from that string:</p>
<p>Example string:</p>
<pre><code>"Mark Sanchez pass to the left to Shonn Greene for 7 yards to the NYJ44. Tackled by Mike Jenkins."
</code></pre>
<p>Result:</p>
<pre><code>turnover = False
interception = False
fumble = False
to_on_downs = False
passing = True
rushing = False
direction = 'left'
loss = False
penalty = False
scored = False
TD = False
PA = False
FG = False
TPC = False
SFTY = False
punt = False
kickoff = False
ret_yardage = 0
yardage_diff = 7
playmakers = ['Mark Sanchez', 'Shonn Greene', 'Mike Jenkins']
</code></pre>
<p>The logic that I had for my initial parser went something like this:</p>
<pre><code># pass, rush or kick
# gain or loss of yards
# scoring play
# Who scored? off or def?
# TD, PA, FG, TPC, SFTY?
# first down gained
# punt?
# kick?
# return yards?
# penalty?
# def or off?
# turnover?
# INT, fumble, to on downs?
# off play makers
# def play makers
</code></pre>
<p>The descriptions can get pretty hairy (multiple fumbles &amp; recoveries with penalties, etc.), and I was wondering if I could take advantage of some of the NLP modules out there. Chances are I'm going to spend a few days on a dumb/static state-machine-like parser instead, but if anyone has suggestions on how to approach it using NLP techniques, I'd like to hear about them.</p>
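<p>For reference, here is a minimal sketch of the dumb/static approach, assuming every description follows the exact wording of the sample plays above. The names <code>parse_line</code>, <code>parse_description</code>, <code>PLAY_PATTERNS</code> and the regular expressions themselves are hypothetical and exist only for illustration. The sketch covers only pass and rush plays, a fumble flag, and tacklers; losses, penalties, punts, returns, scoring, and names with initials or suffixes are not handled.</p>

```python
import re

# Assumption: a player name is two or more capitalized words ("Mat McBriar").
NAME = r"[A-Z][A-Za-z'-]+(?: [A-Z][A-Za-z'-]+)+"
# Assumption: gains are always written as "for N yard(s)".
YARDS = r"(?P<yards>\d+) yards?"
DIRECTION = r"(?:to|up) the (?P<direction>left|right|middle)"

# Hypothetical patterns, written only against the sample plays.
PLAY_PATTERNS = {
    "passing": re.compile(
        rf"(?P<p1>{NAME}) pass {DIRECTION} to (?P<p2>{NAME}) for {YARDS}"
    ),
    "rushing": re.compile(rf"(?P<p1>{NAME}) rush {DIRECTION} for {YARDS}"),
}
TACKLE = re.compile(rf"Tackled by (?P<name>{NAME})")


def parse_description(description):
    """Extract a few of the desired fields from one play description."""
    play = {
        "passing": False,
        "rushing": False,
        "direction": None,
        "yardage_diff": 0,
        "fumble": "FUMBLE" in description,
        "playmakers": [],
    }
    for kind, pattern in PLAY_PATTERNS.items():
        match = pattern.match(description)
        if match:
            play[kind] = True
            play["direction"] = match.group("direction")
            play["yardage_diff"] = int(match.group("yards"))
            # p2 (the receiver) exists only in the passing pattern.
            names = (match.group("p1"), match.groupdict().get("p2"))
            play["playmakers"] += [name for name in names if name]
            break
    # Defensive playmakers: everyone credited with a tackle.
    play["playmakers"] += [m.group("name") for m in TACKLE.finditer(description)]
    return play


def parse_line(line):
    """Split TIME|DOWN&DIST|OFF_TEAM|DESCRIPTION| and parse the description."""
    time, down_dist, team, description = line.strip().rstrip("|").split("|", 3)
    return {
        "time": time,
        "down_dist": down_dist,
        "off_team": team,
        **parse_description(description),
    }
```

<p>Each new play type would need its own pattern in <code>PLAY_PATTERNS</code>, which is exactly where this approach gets hairy for descriptions with multiple fumbles, recoveries, and penalties.</p>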