<p>A few years ago I wrote a simplistic GEDCOM to XML translator in Python as part of a <a href="http://hewgill.com/journal/entries/277-genealogy" rel="noreferrer">larger project</a>. I found that dealing with the GEDCOM data in an XML format was much easier (especially when the next step involved XSLT).</p>
<p>I don't have the code online at the moment, so I've pasted the module into this message. This works for me; no guarantees. Hope this helps though.</p>
<pre><code>import codecs, os, re, sys
from xml.sax.saxutils import escape

fn = sys.argv[1]

ged = codecs.open(fn, encoding="cp437")
xml = codecs.open(fn+".xml", "w", "utf8")
xml.write("""&lt;?xml version="1.0"?&gt;\n""")
xml.write("&lt;gedcom&gt;")
sub = []  # stack of currently open tags
for s in ged:
    s = s.strip()
    # level, optional @ID@, tag, optional data
    m = re.match(r"(\d+) (@(\w+)@ )?(\w+)( (.*))?", s)
    if m is None:
        print("Error: unmatched line: %s" % s)
        continue  # skip lines that cannot be parsed
    level = int(m.group(1))
    id = m.group(3)
    tag = m.group(4)
    data = m.group(6)
    # close any elements deeper than the current level
    while len(sub) &gt; level:
        xml.write("&lt;/%s&gt;\n" % (sub[-1]))
        sub.pop()
    if level != len(sub):
        print("Error: unexpected level: %s" % s)
    sub += [tag]
    if id is not None:
        xml.write("&lt;%s id=\"%s\"&gt;" % (tag, id))
    else:
        xml.write("&lt;%s&gt;" % (tag))
    if data is not None:
        # a value of the form @ID@ is a reference to another record
        m = re.match(r"@(\w+)@", data)
        if m:
            xml.write(m.group(1))
        elif tag == "NAME":
            m = re.match(r"(.*?)/(.*?)/$", data)
            if m:
                xml.write("&lt;forename&gt;%s&lt;/forename&gt;&lt;surname&gt;%s&lt;/surname&gt;" % (escape(m.group(1).strip()), escape(m.group(2))))
            else:
                xml.write(escape(data))
        elif tag == "DATE":
            m = re.match(r"(((\d+)?\s+)?(\w+)?\s+)?(\d{3,})", data)
            if m:
                if m.group(3) is not None:
                    xml.write("&lt;day&gt;%s&lt;/day&gt;&lt;month&gt;%s&lt;/month&gt;&lt;year&gt;%s&lt;/year&gt;" % (m.group(3), m.group(4), m.group(5)))
                elif m.group(4) is not None:
                    xml.write("&lt;month&gt;%s&lt;/month&gt;&lt;year&gt;%s&lt;/year&gt;" % (m.group(4), m.group(5)))
                else:
                    xml.write("&lt;year&gt;%s&lt;/year&gt;" % m.group(5))
            else:
                xml.write(escape(data))
        else:
            xml.write(escape(data))
# close whatever is still open at end of file
while len(sub) &gt; 0:
    xml.write("&lt;/%s&gt;" % sub[-1])
    sub.pop()
xml.write("&lt;/gedcom&gt;\n")
ged.close()
xml.close()
</code></pre>
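<p>To see what the line-matching regular expression in the translator pulls out of each GEDCOM line, here is a minimal sketch. The helper name <code>parse_gedcom_line</code> and the sample lines are my own illustration, not part of the original module; only the pattern itself is taken from the code above.</p>

```python
import re

# Same pattern as the translator: level, optional @ID@, tag, optional data.
LINE_RE = re.compile(r"(\d+) (@(\w+)@ )?(\w+)( (.*))?")


def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, id, tag, data).

    Hypothetical helper for illustration; returns None when the
    line does not match the pattern.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return int(m.group(1)), m.group(3), m.group(4), m.group(6)


# Made-up sample lines in the usual GEDCOM shape.
sample = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "2 DATE 12 JAN 1900",
]
for line in sample:
    print(parse_gedcom_line(line))
```

<p>The level number is what drives the nesting: the translator keeps a stack of open tags and closes elements until the stack depth equals the level of the line it is about to write.</p>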
 
