StackOverflow2013


plurals

Groovy Regex illegal Characters
primarykey
Id
3240356
data
AcceptedAnswerId
3242087
AnswerCount
4
ClosedDate
CommentCount
0
CommunityOwnedDate
CreationDate
2010-07-13T18:45:00.347
FavoriteCount
0
LastActivityDate
2017-01-31T11:47:11.597
LastEditDate
2017-01-31T11:47:11.597
LastEditorUserId
3288649
OwnerUserId
53133
ParentId
0
PostTypeId
1
Score
1
ViewCount
2125
LastEditorDisplayName
text
Body
I have a Groovy script that converts some very poorly formatted data into XML. This part works fine, but it's also happily passing some characters along that aren't legal in XML. So I'm adding some code to strip these out, and this is where the problem is coming from. The code that isn't compiling is this: <code>def illegalChars = ~/[\u0000-\u0008]|[\u000B-\u000C]|[\u000E-\u001F]|[\u007F-\u009F]/</code> What I'm wondering is, why? What am I doing wrong here? I tested this regex in <a href="http://regexpal.com/" rel="nofollow noreferrer">http://regexpal.com/</a> and it works as expected, but I'm getting an error compiling it in Groovy: <blockquote> [ERROR] BUILD ERROR [INFO] ------------------------------------------------------------------------ [INFO] line 23:26: unexpected char: 0x0 </blockquote> The line above is <code>line 23</code>. The surrounding lines are just variable declarations that I haven't changed while working on the regex. Thanks! Update: The code compiles, but it's not filtering as I'd expected it to. 
In regexpal I put the regex: <blockquote> [\u0000-\u0008\u000B-\u000C\u000E-\u001F\u007F-\u009F] </blockquote> and the test data: <pre><code>name='lang'>E</field><field name='title'>CHEMICAL IMMUNOLOGY AND ALLERGY</field></doc> <doc><field name='page'>72-88</field><field name='shm'>3146.757500</field><field name='pubc'>47</field><field name='cs'>1</field><field name='issue'>NUMBER</field> <field name='auth'>Dvorak, A.</field><field name='pub'>KARGER</field><field name='rr'>GBP013.51</field><field name='issn'>1660-2242</field><field name='class1'>TS</field><field name='freq'>S</field><field name='class2'>616.079</field><field name='text'>Subcellular Localization of the Cytokines, Basic Fibroblast Growth Factor and Tumor Necrosis Factor- in Mast Cells</field><field name='id'>RN170369808</field><field name='volume'>VOL 85</field> <field name='year'>2005</field><field name='lang'>E</field><field name='title'>CHEMICAL IMMUNOLOGY AND ALLERGY</field></doc><doc><field name='page'>89-97</field><field name='shm'>3146.757500</field><field name='pubc'>47</field><field name='cs'>1</field><field </code></pre> It's a grab from a file containing one of the illegal characters, so it's a little random. regexpal highlights only the illegal character, but Groovy replaces even the '<' and '>' characters with empty strings, so it basically annihilates the entire document. 
The code snippet: <pre><code>def List parseFile(File file){
    println "reading File name: ${file.name}"
    def lineCount = 0
    List data = new ArrayList()
    file.eachLine { String input ->
        lineCount ++
        String line = input
        if(input =~ illegalChars){
            line = input.replaceAll(illegalChars, " ")
        }
        Map document = new HashMap()
        elementNames.each(){ token ->
            def val = getValue(line, token)
            if(val != null){
                if(token.equals("ISSUE")){
                    List entries = val.split(";")
                    document.putAt("year",entries.getAt(0).trim())
                    if(entries.size() > 1){
                        document.putAt("volume", entries.getAt(1).trim())
                    }
                    if(entries.size() > 2){
                        document.putAt("issue", entries.getAt(2).trim())
                    }
                } else {
                    document.putAt(token, val)
                }
            }
        }
        data.add(document)
    }
    println "done"
    return data
}
</code></pre> I don't see any reason that the two should behave differently; am I missing something? Again, thanks!
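A possible explanation for the compile error, offered as a hedged sketch and not as a confirmed diagnosis: in the Groovy versions current at the time, `\uXXXX` sequences could be expanded in the source text before the regex was parsed, which would place a raw NUL character in the script and is consistent with `unexpected char: 0x0`. One way to sidestep that is to write the same ranges with the `\xhh` hex escapes that `java.util.regex.Pattern` understands, since every code point in the class is below 0x100. The helper name `stripIllegal` and the sample strings are illustrative assumptions, not taken from the question.

```groovy
import java.util.regex.Pattern

// Same code-point ranges as the question's character class, written with
// regex hex escapes (\xhh) so the source file contains no \uXXXX sequences.
Pattern illegalChars = ~/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F]/

// Hypothetical helper (not from the question): replace each illegal
// character with a single space, as the question's snippet does.
String stripIllegal(String input, Pattern pattern) {
    pattern.matcher(input).replaceAll(' ')
}

// Build a sample line with a BEL (0x07) control character in the middle.
String dirty = "<field name='text'>Tumor" + ((char) 0x07) + "Necrosis</field>"
String clean = stripIllegal(dirty, illegalChars)

// Only the control character is replaced; the markup is untouched.
assert clean == "<field name='text'>Tumor Necrosis</field>"

// Tab, line feed and carriage return fall outside the ranges and are kept.
assert stripIllegal("a\tb\r\n", illegalChars) == "a\tb\r\n"
```

If markup characters such as '<' and '>' still disappear with a pattern like this, the cause is probably elsewhere in the script (for example in how `illegalChars` is defined or reassigned), which this sketch does not address.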
Tags
<regex><groovy>
Title
Groovy Regex illegal Characters
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USRakesh Malik
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POGroovy Regex illegal Characters
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. This table or related slice is empty.
