Tuesday, 15 March 2011

python - re.search becomes unresponsive


When I run this code, neither message is printed. It just stops responding altogether.

  import re

  url = 'http://hoswifi.bblink.cn/v3/2-fd1cc0657845832e5e1248e6539a50fa/topic/55-13950.html?from=home'
  m = re.search(r'/\d-(B|(\w+){10,64})/index.html', url)
  if m:
      print('match')
  else:
      print('does not match')
  s = '  

Our string has 10 digits, and it does not contain 'z'. This is deliberate: it forces re.search to check all possible combinations; otherwise it would stop at the first match.
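To make the effect of the missing 'z' concrete, here is a minimal sketch that runs the same kind of pattern on a run of digits with and without a trailing 'z'. The pattern (\w+)*z, the length of 16 and the helper name timed_search are assumptions chosen for illustration; the timings depend on the machine, so none are shown.

```python
import re
import time

# Assumed pattern and inputs for illustration only.
pattern = re.compile(r'(\w+)*z')
digits = '1' * 16

def timed_search(text):
    """Return (match, elapsed seconds) for a single pattern.search call."""
    start = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - start

# 'z' present: the engine finds a match after very little backtracking.
hit, hit_time = timed_search(digits + 'z')
# No 'z': the engine has to try the different ways of grouping the digits
# before it can report that there is no match.
miss, miss_time = timed_search(digits)

print('with z   :', bool(hit), f'{hit_time:.6f} s')
print('without z:', bool(miss), f'{miss_time:.6f} s')
```

The second call is the one that slows down as the string grows, because a failed match is only reported after the alternatives have been exhausted.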

I cannot calculate the number of possible combinations, because the math involved is difficult, but here is a small demonstration of what happens as s gets longer:

[Figure: plot of search time (seconds) against string length]

The time goes from approximately 1 μs for a single digit to about 30 seconds, that is, roughly 3 × 10^7 times longer.
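A rough way to see where this growth comes from is to count the ways a run of n characters can be cut into consecutive non-empty pieces, since a pattern like (\w+)* can group the same characters in each of those ways. The sketch below is only a counting argument, not an exact count of the steps the regex engine takes; count_splits is a hypothetical helper name.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_splits(n):
    """Count the ways to cut a string of length n into consecutive non-empty pieces."""
    if n == 0:
        return 1  # nothing left to cut: exactly one (empty) way
    # Choose the length k of the first piece, then split the remaining n - k characters.
    return sum(count_splits(n - k) for k in range(1, n + 1))

for n in (1, 5, 10, 20, 25):
    # For n >= 1 the count equals 2 ** (n - 1), so it doubles with every extra character.
    print(n, count_splits(n), 2 ** (n - 1))
```

Each additional character doubles the count, which is consistent with the time curve rising so steeply.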


I think that something like this happens when you use (\w+){10,64}. Instead, you should use \w{10,64}.
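As a sketch of that suggestion, the pattern from the question can be rewritten with \w{10,64} and run against the same URL. The query string ?from=home is my reading of the garbled original, and the second URL (on example.com) is a hypothetical one added only to show that the rewritten pattern still matches when /index.html is present.

```python
import re

# URL from the question; the '?from=home' suffix is an assumed reconstruction.
url = ('http://hoswifi.bblink.cn/v3/2-fd1cc0657845832e5e1248e6539a50fa'
       '/topic/55-13950.html?from=home')

# Same pattern as in the question, with \w{10,64} instead of (\w+){10,64}.
fixed = re.compile(r'/\d-(B|\w{10,64})/index.html')

m = fixed.search(url)
print('match' if m else 'does not match')

# Hypothetical URL that does contain /index.html after the long token.
sample = 'http://example.com/v3/2-fd1cc0657845832e5e1248e6539a50fa/index.html'
m2 = fixed.search(sample)
print(m2.group(1) if m2 else 'does not match')
```

With a single quantifier there is only one way to consume a given number of word characters, so a failed match is reported after a handful of attempts instead of after every possible grouping.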


The code used for the demo:

  import timeit
  import matplotlib.pyplot as plt

  # Setup code that timeit runs once before each timed statement.
  setup = """
  import re
  """.replace('  ', '')

  # Statement template; '{}' is filled with the test string.
  _base_stmt = r"m = re.search(r'(\w+)*z', '{}')"

  # Build one statement per string: '1', '11', '111', ...
  statements = {}
  for i in range(1, 18):
      statements.update({i: _base_stmt.format('1' * i)})

  # x: string length, y: time of a single search
  x = []
  y = []
  for i in sorted(statements):
      x.append(i)
      y.append(timeit.timeit(statements[i], setup, number=1))

  # plot
  plt.plot(x, y)
  plt.xlabel('string length')
  plt.ylabel('time (seconds)')
  plt.show()
