Friday, May 8, 2015

Regex Greediness

This might be a hard regular-expression question, but I couldn't solve it. Here is my regular expression:

regex = (^|(?<= ))Football( ((\S+ )+?(?=Football)|(\S+ )+)| )fun( ((\S+ )+?(?=Football)|(\S+ )+)| )Football\ is\ important((?= )|$)

With that I'd like to catch this:

text1 = "Football is fun I like Football is important"

but not this:

text2 = "Football is fun I like Football Football is important"

As far as I understand, the expression shouldn't have matched text2, because there is one extra Football in it. The second ( ((\S+ )+?(?=Football)|(\S+ )+)| ) part should have matched I like, because Football comes right after it and the quantifier is not greedy (I added ? after the second +). The last part should then have matched Football is important, leaving one Football (the one in the middle) unaccounted for. How can I modify the expression so that it does what I need?

Sorry for the silly example; I changed my real text.
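
For anyone who wants to reproduce this, here is a minimal sketch in Python's re module (an assumption; the question does not say which regex engine is in use). It illustrates the likely cause: a lazy quantifier only tries the shortest repetition first, and the engine is still free to extend it when the rest of the pattern fails, so (\S+ )+? can end up swallowing the middle Football. The STRICT pattern and the FILLER name are hypothetical, one possible rewrite rather than the definitive fix: each filler word is guarded by a negative lookahead so it can never begin with Football.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(
    r"(^|(?<= ))Football( ((\S+ )+?(?=Football)|(\S+ )+)| )fun"
    r"( ((\S+ )+?(?=Football)|(\S+ )+)| )Football\ is\ important((?= )|$)"
)

# Hypothetical rewrite: zero or more filler words, none of which may
# start with "Football". Note this also rejects words such as
# "Footballer"; tighten the lookahead if that matters for the real text.
FILLER = r"(?:(?!Football)\S+ )*"
STRICT = re.compile(
    r"(?:^|(?<= ))Football " + FILLER + r"fun " + FILLER
    + r"Football is important(?= |$)"
)

text1 = "Football is fun I like Football is important"
text2 = "Football is fun I like Football Football is important"

# The lazy group first tries "I like ", fails on what follows, then
# backtracks and extends to "I like Football ", so text2 matches too.
print(bool(ORIGINAL.search(text1)))  # True
print(bool(ORIGINAL.search(text2)))  # True

# With the guarded filler, the middle "Football" cannot be consumed.
print(bool(STRICT.search(text1)))    # True
print(bool(STRICT.search(text2)))    # False
```

The guard works because it removes the alternative the engine would otherwise backtrack into, instead of relying on laziness to avoid it.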
