<h2><em>unichars</em></h2>
<p>Does this answer your question:</p>
<pre><code>% unichars '\p{InCyrillic}' | wc -l
256
% unichars '\p{InEthiopic}' | wc -l
356
% unichars '\p{InLatin1}' | wc -l
128
% unichars '\p{InCombiningDiacriticalMarks}' | wc -l
112
</code></pre>
<p>To include the 16 astral planes, add <code>-a</code>:</p>
<pre><code>% unichars -a '\p{InAncientGreekNumbers}' | wc -l
75
</code></pre>
<p>If you want unassigned or Han or Hangul, you need <code>-u</code>:</p>
<pre><code>% unichars -u '\p{InEthiopic}' | wc -l
384
% unichars -u '\p{InCJKUnifiedIdeographsExtensionA}' | wc -l
6592
</code></pre>
<hr>
<p>You can get other information, too:</p>
<pre><code>% unichars '\P{IsGreek}' '\p{InGreek}'
 ʹ   884 0374 GREEK NUMERAL SIGN
 ;   894 037E GREEK QUESTION MARK
 ΅   901 0385 GREEK DIALYTIKA TONOS
 ·   903 0387 GREEK ANO TELEIA
 Ϣ   994 03E2 COPTIC CAPITAL LETTER SHEI
 ϣ   995 03E3 COPTIC SMALL LETTER SHEI
 Ϥ   996 03E4 COPTIC CAPITAL LETTER FEI
 ϥ   997 03E5 COPTIC SMALL LETTER FEI
 Ϧ   998 03E6 COPTIC CAPITAL LETTER KHEI
 ϧ   999 03E7 COPTIC SMALL LETTER KHEI
 Ϩ  1000 03E8 COPTIC CAPITAL LETTER HORI
 ϩ  1001 03E9 COPTIC SMALL LETTER HORI
 Ϫ  1002 03EA COPTIC CAPITAL LETTER GANGIA
 ϫ  1003 03EB COPTIC SMALL LETTER GANGIA
 Ϭ  1004 03EC COPTIC CAPITAL LETTER SHIMA
 ϭ  1005 03ED COPTIC SMALL LETTER SHIMA
 Ϯ  1006 03EE COPTIC CAPITAL LETTER DEI
 ϯ  1007 03EF COPTIC SMALL LETTER DEI
% unichars '\p{IsGreek}' '\P{InGreek}' | wc -l
250
% unichars '\P{IsGreek}' '\p{InGreek}' | wc -l
18
% unichars '\p{In=1.1}' | wc -l
6362
% unichars '\p{In=6.0}' | wc -l
15087
</code></pre>
<hr>
<h2><em>uniprops</em></h2>
<p>Here’s <em>uniprops</em>:</p>
<pre><code>% uniprops -l | grep -c 'Block='
84
% uniprops digamma 450 %
U+03DC ‹Ϝ› \N{ GREEK LETTER DIGAMMA }:
    \w \pL \p{LC} \p{L_} \p{L&amp;} \p{Lu}
    All Any Alnum Alpha Alphabetic Assigned Greek Is_Greek InGreek Cased Cased_Letter LC Changes_When_Casefolded CWCF Changes_When_Casemapped CWCM Changes_When_Lowercased CWL Changes_When_NFKC_Casefolded CWKCF Lu L Gr_Base Grapheme_Base Graph GrBase Grek Greek_And_Coptic ID_Continue IDC ID_Start IDS Letter L_ Uppercase_Letter Print Upper Uppercase Word XID_Continue XIDC XID_Start XIDS XPosixAlnum XPosixAlpha XPosixGraph XPosixPrint XPosixUpper XPosixWord
U+0450 ‹ѐ› \N{ CYRILLIC SMALL LETTER IE WITH GRAVE }:
    \w \pL \p{LC} \p{L_} \p{L&amp;} \p{Ll}
    All Any Alnum Alpha Alphabetic Assigned InCyrillic Cyrillic Is_Cyrillic Cased Cased_Letter LC Changes_When_Casemapped CWCM Changes_When_Titlecased CWT Changes_When_Uppercased CWU Cyrl Ll L Gr_Base Grapheme_Base Graph GrBase ID_Continue IDC ID_Start IDS Letter L_ Lowercase_Letter Lower Lowercase Print Word XID_Continue XIDC XID_Start XIDS XPosixAlnum XPosixAlpha XPosixGraph XPosixLower XPosixPrint XPosixWord
U+0025 ‹%› \N{ PERCENT SIGN }:
    \pP \p{Po}
    All Any ASCII Assigned Common Zyyy Po P Gr_Base Grapheme_Base Graph GrBase Other_Punctuation Punct Pat_Syn Pattern_Syntax PatSyn PosixGraph PosixPrint PosixPunct Print Punctuation XPosixGraph XPosixPrint XPosixPunct
</code></pre>
<hr>
<p>Or even all these:</p>
<pre><code>% uniprops -vag 777
U+0777 ‹ݷ› \N{ ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW }:
    \w \pL \p{L_} \p{Lo}
    \p{All} \p{Any} \p{Alnum} \p{Alpha} \p{Alphabetic} \p{Arab} \p{Arabic} \p{Assigned} \p{Is_Arabic} \p{InArabicSupplement} \p{L} \p{Lo} \p{Gr_Base} \p{Grapheme_Base} \p{Graph} \p{GrBase} \p{ID_Continue} \p{IDC} \p{ID_Start} \p{IDS} \p{Letter} \p{L_} \p{Other_Letter} \p{Print} \p{Word} \p{XID_Continue} \p{XIDC} \p{XID_Start} \p{XIDS} \p{XPosixAlnum} \p{XPosixAlpha} \p{XPosixGraph} \p{XPosixPrint} \p{XPosixWord}
    \p{Age:5.1} \p{Script=Arabic} \p{Bidi_Class:AL} \p{Bidi_Class=Arabic_Letter} \p{Bidi_Class:Arabic_Letter} \p{Bc=AL} \p{Block:Arabic_Supplement} \p{Canonical_Combining_Class:0} \p{Canonical_Combining_Class=Not_Reordered} \p{Canonical_Combining_Class:Not_Reordered} \p{Ccc=NR} \p{Canonical_Combining_Class:NR} \p{Decomposition_Type:None} \p{Dt=None} \p{East_Asian_Width=Neutral} \p{East_Asian_Width:Neutral} \p{General_Category:L} \p{General_Category=Letter} \p{General_Category:Letter} \p{Gc=L} \p{General_Category:Lo} \p{General_Category=Other_Letter} \p{General_Category:Other_Letter} \p{Gc=Lo} \p{Grapheme_Cluster_Break:Other} \p{GCB=XX} \p{Grapheme_Cluster_Break:XX} \p{Grapheme_Cluster_Break=Other} \p{Hangul_Syllable_Type:NA} \p{Hangul_Syllable_Type=Not_Applicable} \p{Hangul_Syllable_Type:Not_Applicable} \p{Hst=NA} \p{Joining_Group:Yeh} \p{Jg=Yeh} \p{Joining_Type:D} \p{Joining_Type=Dual_Joining} \p{Joining_Type:Dual_Joining} \p{Jt=D} \p{Line_Break:AL} \p{Line_Break=Alphabetic} \p{Line_Break:Alphabetic} \p{Lb=AL} \p{Numeric_Type:None} \p{Nt=None} \p{Numeric_Value:NaN} \p{Nv=NaN} \p{Present_In:5.1} \p{In=5.1} \p{Present_In:5.2} \p{In=5.2} \p{Present_In:6.0} \p{In=6.0} \p{Script:Arab} \p{Script:Arabic} \p{Sc=Arab} \p{Sentence_Break:LE} \p{Sentence_Break=OLetter} \p{Sentence_Break:OLetter} \p{SB=LE} \p{Word_Break:ALetter} \p{WB=LE} \p{Word_Break:LE} \p{Word_Break=ALetter}
</code></pre>
<p>My <a href="http://training.perl.com/scripts/uniprops" rel="nofollow"><em>uniprops</em></a> and <a href="http://training.perl.com/scripts/unichars" rel="nofollow"><em>unichars</em></a> should run anywhere running Perl version 5.10 or better. There’s also a <a href="http://training.perl.com/scripts/uninames" rel="nofollow"><em>uninames</em></a> script that goes with them.</p>
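<p>If you don’t have the scripts handy, the block counts can be roughly approximated by hand. The following is only a sketch in Python, not the Perl that <em>unichars</em> is written in, and it is not how <em>unichars</em> works internally. The helper name <code>count_named</code> is made up for illustration, and the block range U+0400..U+04FF for Cyrillic is an assumption taken from the Unicode block list. It treats “has a character name” as a stand-in for “assigned”, which may differ from what <em>unichars</em> does for some blocks.</p>

```python
import unicodedata


def count_named(first: int, last: int) -> int:
    """Count code points in [first, last] that carry a Unicode character name.

    Hypothetical helper: "has a name" is used here as a rough proxy for
    "assigned"; control characters, for instance, have no name.
    """
    return sum(
        1
        for cp in range(first, last + 1)
        if unicodedata.name(chr(cp), None) is not None
    )


if __name__ == "__main__":
    # Cyrillic block, assumed to span U+0400..U+04FF.
    print(count_named(0x0400, 0x04FF))
```

<p>On a Python whose Unicode data is recent enough to have the whole Cyrillic block assigned, this should agree with the 256 reported for <code>\p{InCyrillic}</code>. Counts for other blocks depend on the Unicode version your interpreter ships with.</p>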
 
