DHara: sublimetext3 - Translate accented to unaccented characters in Sublime Text snippet using regex

Tuesday, 15 July 2014


I am writing an ST3 snippet that inserts a \subsection{} with a label. The label is generated from the heading text using a (rather long) regular expression:

  ${1/(?:([ \t_]+)?|\b)(?:([ÅÄÆÁÀÃ])?|\b)(?:([åäæâàáã])?|\b)(?:([ÈÉÊË])?|\b)(?:([èéêë])?|\b)(?:([ÌÍÎÏ])?|\b)(?:([ìíîï])?|\b)(?:([Ñ])?|\b)(?:([ñ])?|\b)(?:([ÖØÓÒÔÖÕ])?|\b)(?:([öøóòôõ])?|\b)(?:([ÙÚÛÜ])?|\b)(?:([ùúûü])?|\b)/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:N)(?9:n)(?10:O)(?11:o)(?12:U)(?13:u)/g}

Actually, I want it to do even more, but if I add the additional groups I want, ST3 crashes when I execute the snippet.

  ${1/(?:([ \t_]+)?|\b)(?:([ÅÄÆÁÀÃ])?|\b)(?:([åäæâàáã])?|\b)(?:([Ç])?|\b)(?:([ç])?|\b)(?:([ÈÉÊË])?|\b)(?:([èéêë])?|\b)(?:([ÌÍÎÏ])?|\b)(?:([ìíîï])?|\b)(?:([Ñ])?|\b)(?:([ñ])?|\b)(?:([ÖØÓÒÔÖÕ])?|\b)(?:([öøóòôõ])?|\b)(?:([ÙÚÛÜ])?|\b)(?:([ùúûü])?|\b)(?:([Ý])?|\b)(?:([ýÿ])?|\b)/(?1:-)(?2:A)(?3:a)(?4:C)(?5:c)(?6:E)(?7:e)(?8:I)(?9:i)(?10:N)(?11:n)(?12:O)(?13:o)(?14:U)(?15:u)(?16:Y)(?17:y)/g}

Is there a more efficient way to do this? Preferably one that does not crash ST3 ;)

Edit: Here are some example strings:

  Flygande bäckasiner söka hwila på mjuka tuvor
  Åke Staël hade en överflödig idé

Result (with the current, working regex):

  Flygande-backasiner-soka-hwila-pa-mjuka-tuvor
  Ake-Stael-hade-en-overflodig-ide

But I would also like to replace the letters (ÇçÝÿý) with their unaccented counterparts, so that

  comment ça va

becomes

  comment-ca-va

  
  I don't know this syntax, but I suspect the problem is that the many optional groups, each combined with an alternation, make the pattern too complex to process.
  So you can try to design your pattern like this, and you should use character ranges (take a look at the Unicode tables and character classes):
   ${1/([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ)/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:O)(?9:o)(?10:U)(?11:u)(?12:AE)(?13:ae)(?14:OE)(?15:oe)(?16:N)(?17:n)/g}
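  The single-alternation idea can be tried outside Sublime Text. The following is a minimal Python sketch, not snippet syntax: Python's `re` module has no `(?1:-)` conditional format string, so the sketch looks up the replacement through `match.lastindex` instead. The names `FAMILIES`, `PATTERN` and `slugify_label` are hypothetical, and the letter families are assumed to be the seventeen branches described in this answer.

```python
import re

# One capture group per letter family, joined into a single alternation.
# Exactly one branch matches at a time, so no group has to be optional.
FAMILIES = [
    (r"[ \t_]+", "-"),
    (r"[À-Å]", "A"), (r"[à-å]", "a"),
    (r"[È-Ë]", "E"), (r"[è-ë]", "e"),
    (r"[Ì-Ï]", "I"), (r"[ì-ï]", "i"),
    (r"[Ò-ÖØ]", "O"), (r"[ò-öø]", "o"),
    (r"[Ù-Ü]", "U"), (r"[ù-ü]", "u"),
    ("Æ", "AE"), ("æ", "ae"),
    ("Œ", "OE"), ("œ", "oe"),
    ("Ñ", "N"), ("ñ", "n"),
]

PATTERN = re.compile("|".join("(%s)" % pattern for pattern, _ in FAMILIES))


def slugify_label(text):
    """Replace whitespace runs with '-' and accented letters with plain ones."""
    # m.lastindex is the 1-based number of the group that matched,
    # which plays the role of the (?n:replacement) conditionals.
    return PATTERN.sub(lambda m: FAMILIES[m.lastindex - 1][1], text)


# Sample input chosen for illustration only.
print(slugify_label("Åke Staël hade en överflödig idé"))
# → Ake-Stael-hade-en-overflodig-ide
```

  Because every character is tried against a flat list of alternatives, the engine never has to backtrack through a chain of optional groups, which is the behaviour this answer suspects of causing the crash.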
  If the lookahead feature is available, you can improve the pattern this way, so that unaccented characters are not tested against every alternative:
   ${1/(?=[ \t_À-ÆÈ-ÏÑ-ÖØ-Üà-æè-ïñ-öø-üŒœ])(?:([ \t_]+)|([À-Å])|([à-å])|([È-Ë])|([è-ë])|([Ì-Ï])|([ì-ï])|([Ò-ÖØ])|([ò-öø])|([Ù-Ü])|([ù-ü])|(Æ)|(æ)|(Œ)|(œ)|(Ñ)|(ñ))/(?1:-)(?2:A)(?3:a)(?4:E)(?5:e)(?6:I)(?7:i)(?8:O)(?9:o)(?10:U)(?11:u)(?12:AE)(?13:ae)(?14:OE)(?15:oe)(?16:N)(?17:n)/g}
  Note: Æ should be transcribed as AE (and Œ as OE).
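  Character ranges in a regex follow code-point order, so it is worth checking what a range really covers before relying on it. The sketch below is one way to do that in Python; `chars_in_range` is a hypothetical helper, and the code points are those of the Latin-1 Supplement block.

```python
import unicodedata


def chars_in_range(first_cp, last_cp):
    """Return every character from first_cp to last_cp, inclusive."""
    return [chr(cp) for cp in range(first_cp, last_cp + 1)]


# U+00C0..U+00C5 is the range written as [À-Å] in a pattern.
for ch in chars_in_range(0xC0, 0xC5):
    print("U+%04X %s %s" % (ord(ch), ch, unicodedata.name(ch)))

# U+00D2..U+00D8 would be [Ò-Ø]; listing it shows an unwanted member.
for ch in chars_in_range(0xD2, 0xD8):
    print("U+%04X %s %s" % (ord(ch), ch, unicodedata.name(ch)))
```

  The second listing includes U+00D7 (the multiplication sign), which sits between Ö and Ø. A class for the O family therefore needs Ø added separately to a range that stops at Ö, rather than one range running through to Ø.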




Posted by Unknown at 03:22










