
I am working on a text editor with a label that shows the current word count and updates whenever the text changes. Right now, when I edit a large file, counting all of the regex matches takes so long that typing becomes very laggy. I have been thinking about fixing this by running the word-count regex in a BackgroundWorker, but I have been unable to figure out how to access the RichTextBox's Text property from that separate thread. How would I do this? Or is there a better way to keep my program from being laggy?

last answered 2 years ago

1 answer


Rather than using a separate thread for a relatively simple operation such as counting words, I'd consider replacing the Regex with ordinary string manipulation, which tends to be much faster.

According to my rough tests, WordCount1 counts the words in a million-word document about 4 times faster than WordCount2, and the difference in speed is even more marked for smaller documents:

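using System;
using System.Text.RegularExpressions;

static int WordCount1(string s)
{
    // Check for null before touching the string.
    if (String.IsNullOrEmpty(s)) return 0;
    int count = 0;
    bool lastWasWordChar = false;
    foreach (char c in s)
    {
        // Word characters: letters, digits, underscore and apostrophe.
        if (Char.IsLetterOrDigit(c) || c == '_' || c == '\'')
        {
            lastWasWordChar = true;
            continue;
        }
        // A non-word character ends the current word, if there is one.
        if (lastWasWordChar)
        {
            lastWasWordChar = false;
            count++;
        }
    }
    // Count the final word if the text ends in a word character.
    if (lastWasWordChar) count++;
    return count;
}

static int WordCount2(string s)
{
    // Counts runs of ASCII letters, digits, underscores and apostrophes.
    return Regex.Matches(s, "[a-zA-Z_0-9']+").Count;
}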
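
For anyone who wants to repeat a rough comparison like this, a minimal timing sketch might look something like the following. This is not the harness used for the figures above; the class WordCountTiming and its members (CountByScan, CountByRegex, BuildSample, TimeMs, Run) are hypothetical names made up for illustration, and the sample text is synthetic ASCII, so real documents may well give different ratios.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical harness: all names here are illustrative assumptions.
static class WordCountTiming
{
    // Hand-written scan: counts runs of letters, digits, '_' and '\''.
    public static int CountByScan(string s)
    {
        if (String.IsNullOrEmpty(s)) return 0;
        int count = 0;
        bool inWord = false;
        foreach (char c in s)
        {
            bool isWordChar = Char.IsLetterOrDigit(c) || c == '_' || c == '\'';
            if (inWord && !isWordChar) count++;   // a word has just ended
            inWord = isWordChar;
        }
        return inWord ? count + 1 : count;        // count a trailing word
    }

    // Regex version, using the same pattern as WordCount2.
    public static int CountByRegex(string s)
    {
        return Regex.Matches(s, "[a-zA-Z_0-9']+").Count;
    }

    // Builds a synthetic document of wordCount space-separated ASCII words.
    public static string BuildSample(int wordCount)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < wordCount; i++)
        {
            sb.Append("word").Append(i % 10).Append(' ');
        }
        return sb.ToString();
    }

    // Returns the elapsed milliseconds for a single call of counter on text.
    public static long TimeMs(Func<string, int> counter, string text)
    {
        var sw = Stopwatch.StartNew();
        counter(text);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Call this from your program's entry point to print both timings.
    public static void Run()
    {
        string text = BuildSample(1000000);
        Console.WriteLine("scan:  {0} ms", TimeMs(CountByScan, text));
        Console.WriteLine("regex: {0} ms", TimeMs(CountByRegex, text));
    }
}
```

Calling WordCountTiming.Run() from Main prints one timing per method. A single run is noisy (JIT warm-up, garbage collection), so it is worth repeating the measurement a few times before drawing conclusions.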

sanjib
226

vulpes, when you have some free time, could you please explain what is going on underneath? Why does this happen?

vulpes
17279

Although the .NET regex engine is very expressive, searching with it is relatively slow compared to hand-written code. This is no surprise given the generality and complexity of the expressions that the regex engine may need to deal with. So, unless performance is unimportant, it's better to write the simpler queries using standard code and bring in regex for the more complicated stuff.
