Tuesday, 15 July 2014

sublimetext3 - Translate accented to unaccented characters in a Sublime Text snippet using regex


I am writing an ST3 snippet that inserts a \subsection{} with a label. The label is generated by transforming the header text with a (rather long) regular expression:

  ${1/(?:([ \t_]+)?|\b)(?:([ÅÄÆÁÀÃ])?|\b)(?:([åäæâàáã])?|\b)(?:([ÉÈÊË])?|\b)(?:([éèêë])?|\b)(?:([ÌÍÎÏ])?|\b)(?:([ìíîï])?|\b)(?:([Ñ])?|\b)(?:([ñ])?|\b)(?:([ÖØÓÒÔÖÕ])?|\b)(?:([öøóòôõ])?|\b)(?:([ÙÚÛÜ])?|\b)(?:([ùúûü])?|\b)/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:N)(?9:n)(?10:O)(?11:o)(?12:U)(?13:u)/g}  

Actually, I want it to do even more, but if I add the additional groups I want, ST3 crashes when I execute the snippet.

  ${1/(?:([ \t_]+)?|\b)(?:([ÅÄÆÁÀÃ])?|\b)(?:([åäæâàáã])?|\b)(?:([Ç])?|\b)(?:([ç])?|\b)(?:([ÉÈÊË])?|\b)(?:([éèêë])?|\b)(?:([ÌÍÎÏ])?|\b)(?:([ìíîï])?|\b)(?:([Ñ])?|\b)(?:([ñ])?|\b)(?:([ÖØÓÒÔÖÕ])?|\b)(?:([öøóòôõ])?|\b)(?:([ÙÚÛÜ])?|\b)(?:([ùúûü])?|\b)(?:([Ý])?|\b)(?:([ÿý])?|\b)/(?1:-)(?2:A)(?3:a)(?4:C)(?5:c)(?6:E)(?7:e)(?8:I)(?9:i)(?10:N)(?11:n)(?12:O)(?13:o)(?14:U)(?15:u)(?16:Y)(?17:y)/g}  

Is there a more efficient way to do this? Preferably one that does not crash ST3 ;)

Edit: Here are some example strings:

  Flygande […] results in (with the current, working regex):  
  Falgade-backsigner-soka-1- Hilla-Pay-Majuka-Tuvar Ake-Stale-Hadi-en-Overflodig-ICE  

But I would also like to replace the letters (ÇçÝÿý) with their unaccented counterparts, so that, for example,

  comment ça va  

becomes

  comment-ca-va  

I do not know this syntax, but I suspect the problem is the large number of alternations combined with optional groups, which are very expensive to process.
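To get a feel for why all-optional groups might be costly, one can count matches in another engine. The sketch below uses Python's `re` module, not Sublime Text's own regex engine, so it only illustrates the general effect. The two patterns are shortened, hypothetical three-group versions (the names `optional_groups`, `alternation` and `count_matches` are made up for this example), assuming the question's pattern has the shape `(?:(class)?|\b)` repeated.

```python
import re

# Shortened stand-in for the question's pattern: every group is optional,
# so the whole pattern can succeed without consuming any character.
optional_groups = re.compile(
    r"(?:([ \t_]+)?|\b)(?:([ÀÁÂÃÄÅ])?|\b)(?:([àáâãäå])?|\b)"
)

# The same character classes as one plain alternation: a successful match
# always consumes at least one character.
alternation = re.compile(r"([ \t_]+)|([ÀÁÂÃÄÅ])|([àáâãäå])")


def count_matches(pattern, text):
    """Return (total matches, empty matches) found while scanning text."""
    spans = [m.span() for m in pattern.finditer(text)]
    empty = sum(1 for start, end in spans if start == end)
    return len(spans), empty


sample = "Åke på tur"
print(count_matches(optional_groups, sample))  # includes zero-length matches
print(count_matches(alternation, sample))      # only real replacements
```

With the optional groups, the engine reports a match (often an empty one) at positions where nothing needs replacing, while the alternation only matches the characters that actually change.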

So you can try to design your pattern like this instead (and you should take a look at the Unicode tables and character categories):

  ${1/([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ)/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:O)(?9:o)(?10:U)(?11:u)(?12:AE)(?13:ae)(?14:OE)(?15:oe)(?16:N)(?17:n)/g}  
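The alternation idea can be tried outside Sublime Text. The following is a minimal Python sketch, not the snippet syntax itself: it assumes one capture group per character class, joined with `|`, and emulates the `(?N:text)` conditional replacements with a lookup table. `PATTERN`, `REPLACEMENTS` and `to_label` are names invented for this illustration.

```python
import re

# One capture group per replacement, joined by a plain alternation.
PATTERN = re.compile(
    r"([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])"
    r"|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ)"
)

# Replacement text for groups 1..17, playing the role of (?1:-)(?2:A)...
REPLACEMENTS = [
    "-", "A", "a", "E", "e", "I", "i", "O", "o",
    "U", "u", "AE", "ae", "OE", "oe", "N", "n",
]


def to_label(text):
    """Replace whitespace/underscore runs with '-' and strip the accents covered above."""
    # Exactly one group takes part in each match, so m.lastindex is its number.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex - 1], text)


print(to_label("Åke hade en överflödig idé"))  # Ake-hade-en-overflodig-ide
print(to_label("Œuvre à faire"))               # OEuvre-a-faire
```

Because each alternative consumes at least one character, no position is matched unless something is actually replaced there.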

If the lookahead feature is available, you can improve the pattern this way, so that unaccented characters are not tested against every alternative:

  ${1/(?=[ \t_À-ÆÈ-ÏÑ-ÖØ-Üà-æè-ïñ-öø-üŒœ])(?:([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ))/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:O)(?9:o)(?10:U)(?11:u)(?12:AE)(?13:ae)(?14:OE)(?15:oe)(?16:N)(?17:n)/g}  

Note: Æ (AElig) should be transliterated as AE (Œ => OE).
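A lookahead guard only helps if its character class covers every character the alternation can match; otherwise some replacements silently stop working. The Python sketch below is one way to check that, assuming the guard and alternation take the form shown in the code (both are reconstructions for illustration, and `GUARD`, `ALTERNATION` and `guard_mismatches` are hypothetical names). It says nothing about speed in Sublime Text's engine.

```python
import re

ALTERNATION = (
    r"(?:([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])"
    r"|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ))"
)
# Single character-class test performed before any alternative is tried.
GUARD = r"(?=[ \t_À-ÆÈ-ÏÑ-ÖØ-Üà-æè-ïñ-öø-üŒœ])"

plain = re.compile(ALTERNATION)
guarded = re.compile(GUARD + ALTERNATION)


def guard_mismatches(limit=0x180):
    """Return the characters below `limit` that only one of the two patterns accepts."""
    return [
        chr(cp)
        for cp in range(limit)
        if bool(plain.match(chr(cp))) != bool(guarded.match(chr(cp)))
    ]


print(guard_mismatches())  # [] when the guard covers every alternative
```

An empty list means adding the lookahead does not change which characters are replaced, at least for code points below `limit`.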

