<p>I'm not sure how easy this is, since it does make use of some more advanced concepts like generators, but it's at least robust and well-documented. The actual code is at the bottom and is fairly concise.</p>

<p>The basic idea is that the function <code>iter_delim_sets</code> returns an iterator over (aka a sequence of) tuples containing the line number, the set of indices in the "expected" string where the delimiter was found, and a similar set for the "actual" string. There's one such tuple generated for each pair of (expected, result) lines. Those tuples are succinctly formalized into a <code>collections.namedtuple</code> type called <code>DelimLocations</code>.</p>

<p>Then the function <code>analyze</code> just returns higher-level information based on such a data set, stored in a <code>DelimAnalysis</code> <code>namedtuple</code>. This is done using basic set algebra.</p>

<pre><code>"""Compare two sequences of strings.

Test data:

&gt;&gt;&gt; from pprint import pprint
&gt;&gt;&gt; delimiter = '||'
&gt;&gt;&gt; expected = (
...     delimiter.join(("one", "fish", "two", "fish")),
...     delimiter.join(("red", "fish", "blue", "fish")),
...     delimiter.join(("I do not like them", "Sam I am")),
...     delimiter.join(("I do not like green eggs and ham.",)))
&gt;&gt;&gt; actual = (
...     delimiter.join(("red", "fish", "blue", "fish")),
...     delimiter.join(("one", "fish", "two", "fish")),
...     delimiter.join(("I do not like spam", "Sam I am")),
...     delimiter.join(("I do not like", "green eggs and ham.")))

The results:

&gt;&gt;&gt; pprint([analyze(v) for v in iter_delim_sets(delimiter, expected, actual)])
[DelimAnalysis(index=0, correct=2, incorrect=1, count_diff=0),
 DelimAnalysis(index=1, correct=2, incorrect=1, count_diff=0),
 DelimAnalysis(index=2, correct=1, incorrect=0, count_diff=0),
 DelimAnalysis(index=3, correct=0, incorrect=1, count_diff=1)]

What they mean:

&gt;&gt;&gt; pprint(delim_analysis_doc)
(('index',
  ('The number of the lines from expected and actual',
   'used to perform this analysis.')),
 ('correct',
  ('The number of delimiter placements in ``actual``',
   'which were correctly placed.')),
 ('incorrect', ('The number of incorrect delimiters in ``actual``.',)),
 ('count_diff',
  ('The difference between the number of delimiters',
   'in ``expected`` and ``actual`` for this line.')))

And a trace of the processing stages:

&gt;&gt;&gt; def dump_it(it):
...     '''Wraps an iterator in code that dumps its values to stdout.'''
...     for v in it:
...         print v
...         yield v

&gt;&gt;&gt; for v in iter_delim_sets(delimiter,
...                          dump_it(expected), dump_it(actual)):
...     print v
...     print analyze(v)
...     print '======'
one||fish||two||fish
red||fish||blue||fish
DelimLocations(index=0, expected=set([9, 3, 14]), actual=set([9, 3, 15]))
DelimAnalysis(index=0, correct=2, incorrect=1, count_diff=0)
======
red||fish||blue||fish
one||fish||two||fish
DelimLocations(index=1, expected=set([9, 3, 15]), actual=set([9, 3, 14]))
DelimAnalysis(index=1, correct=2, incorrect=1, count_diff=0)
======
I do not like them||Sam I am
I do not like spam||Sam I am
DelimLocations(index=2, expected=set([18]), actual=set([18]))
DelimAnalysis(index=2, correct=1, incorrect=0, count_diff=0)
======
I do not like green eggs and ham.
I do not like||green eggs and ham.
DelimLocations(index=3, expected=set([]), actual=set([13]))
DelimAnalysis(index=3, correct=0, incorrect=1, count_diff=1)
======

"""
from collections import namedtuple


# Data types

## Here ``expected`` and ``actual`` are sets
DelimLocations = namedtuple('DelimLocations', 'index expected actual')

DelimAnalysis = namedtuple('DelimAnalysis',
                           'index correct incorrect count_diff')

## Explanation of the elements of DelimAnalysis.
## There's no real convenient way to add a docstring to a variable.
delim_analysis_doc = (
    ('index', ("The number of the lines from expected and actual",
               "used to perform this analysis.")),
    ('correct', ("The number of delimiter placements in ``actual``",
                 "which were correctly placed.")),
    ('incorrect', ("The number of incorrect delimiters in ``actual``.",)),
    ('count_diff', ("The difference between the number of delimiters",
                    "in ``expected`` and ``actual`` for this line.")))


# Actual functionality

def iter_delim_sets(delimiter, expected, actual):
    """Yields a DelimLocations tuple for each pair of strings.

    ``expected`` and ``actual`` are sequences of strings.
    """
    from re import escape, compile as compile_
    from itertools import count, izip
    index = count()

    re = compile_(escape(delimiter))

    def delimiter_locations(string):
        """Set of the locations of matches of ``re`` in ``string``."""
        return set(match.start() for match in re.finditer(string))

    string_pairs = izip(expected, actual)

    return (DelimLocations(index=index.next(),
                           expected=delimiter_locations(e),
                           actual=delimiter_locations(a))
            for e, a in string_pairs)


def analyze(locations):
    """Returns an analysis of a DelimLocations tuple.

    ``locations.expected`` and ``locations.actual`` are sets.
    """
    return DelimAnalysis(
        index=locations.index,
        correct=len(locations.expected &amp; locations.actual),
        incorrect=len(locations.actual - locations.expected),
        count_diff=(len(locations.actual) - len(locations.expected)))
</code></pre>
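<p>The listing uses Python 2 idioms (<code>print</code> statements, <code>itertools.izip</code>, <code>index.next()</code>). For anyone on Python 3, the same two functions could be sketched roughly as below. This is an assumed port, not part of the original answer: it swaps <code>izip</code> plus <code>count</code> for the built-in <code>zip</code> and <code>enumerate</code>, and the two-line sample input is a shortened version of the test data, chosen only for illustration.</p>

```python
import re
from collections import namedtuple

# Same record types as in the original listing.
DelimLocations = namedtuple('DelimLocations', 'index expected actual')
DelimAnalysis = namedtuple('DelimAnalysis',
                           'index correct incorrect count_diff')


def iter_delim_sets(delimiter, expected, actual):
    """Yield a DelimLocations tuple for each (expected, actual) pair."""
    pattern = re.compile(re.escape(delimiter))

    def delimiter_locations(string):
        # Start offsets of every delimiter match in the string.
        return {match.start() for match in pattern.finditer(string)}

    # enumerate + zip replace itertools.count + izip from Python 2.
    for index, (e, a) in enumerate(zip(expected, actual)):
        yield DelimLocations(index=index,
                             expected=delimiter_locations(e),
                             actual=delimiter_locations(a))


def analyze(locations):
    """Summarise one DelimLocations tuple using set algebra."""
    return DelimAnalysis(
        index=locations.index,
        correct=len(locations.expected & locations.actual),
        incorrect=len(locations.actual - locations.expected),
        count_diff=len(locations.actual) - len(locations.expected))


# Shortened sample input (illustrative only).
expected = ["one||fish||two||fish", "I do not like green eggs and ham."]
actual = ["red||fish||blue||fish", "I do not like||green eggs and ham."]

results = [analyze(v) for v in iter_delim_sets("||", expected, actual)]
for result in results:
    print(result)
```

<p>Writing <code>iter_delim_sets</code> as a generator function keeps the lazy, one-pair-at-a-time behaviour of the original generator expression, so it still works with inputs that are themselves iterators.</p>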