regex - C# Dictionary fast search of closest value to key
I have a long string (thousands of lines). I'm running regex expressions against the string and trying to identify the line numbers of the matches. If I have a high match count (say, 10,000), finding the line number each time involves searching the HTML string again, which gets expensive.
What I want to do is search the string beforehand and build a hashtable that maps the character position of each line ending to its line number. I use a Dictionary, with the following code to find the line endings.
    // Find line endings: map the character position of each '\n' to its line number.
    int lineCount = 0;
    for (int charCount = 0; charCount < html.Length; charCount++) // '<', not '<=', to stay in bounds
    {
        if (html[charCount] == '\n')
        {
            lineCount++;
            lineEndings.Add(charCount, lineCount);
        }
    }

However, when I run the regexes, how do I search the dictionary? The character position of a regex match will fall between two keys in the lineEndings dictionary. What's the best / most efficient way to do this: given a dictionary with a set of gapped keys, and a value that's not in the key list, find the next closest key?
One thing I've tried, though I'm not sure how it will perform, is:

    lineEndings.First(n => n.Key >= match.Index).Value
Dictionaries don't work when the definition of "equal" is "close".
It's important that equality between the items in a dictionary is transitive: if a = b and b = c, then a should equal c. If that's not the case (and it isn't, if equality is defined as "close"), things start breaking down.
To start with, there's no way you can write an effective GetHashCode implementation here. The only way for it to ever be valid is to return the same value for every key, which means you've degraded performance to a linear search anyway.
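To make the transitivity problem concrete, here is a minimal sketch of a "close enough" comparer. The class name CloseEnoughComparer and the tolerance of 1 are hypothetical, chosen only for illustration; nothing like this appears in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two ints count as "equal" when they differ by at most Tolerance.
public sealed class CloseEnoughComparer : IEqualityComparer<int>
{
    public const int Tolerance = 1;

    // Widen to long so the subtraction cannot overflow.
    public bool Equals(int x, int y) => Math.Abs((long)x - y) <= Tolerance;

    // Equal items must share a hash code. Chains of "close" values can link
    // any two ints, so the only valid choice is a constant, which places
    // every key in the same bucket.
    public int GetHashCode(int obj) => 0;
}
```

With this comparer, 1 "equals" 2 and 2 "equals" 3, but 1 does not "equal" 3, so the result of a lookup would depend on which keys happen to be compared first.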
What you can do, given that you have a static set of values, is put them in a list or array, sort them, and then use BinarySearch. Since your data appears to be static, the fact that adding items to this lookup table is expensive shouldn't be a problem. The binary search is capable of telling you where the item you are searching for would belong if it were added, which means you can go to the index at that position to find the "next" item, and subtract 1 to find the "previous" item.
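A minimal sketch of that approach, assuming the line-ending positions are stored in a List<int> rather than a Dictionary. The names LineIndex, BuildLineEndings and LineNumberAt are made up for this example. Because the positions are collected in a single left-to-right scan, the list is already sorted and needs no separate sort step.

```csharp
using System;
using System.Collections.Generic;

public static class LineIndex
{
    // Collect the character offset of every '\n', in ascending order.
    public static List<int> BuildLineEndings(string text)
    {
        var lineEndings = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '\n')
            {
                lineEndings.Add(i);
            }
        }
        return lineEndings;
    }

    // Return the 1-based line number that contains the given character offset.
    public static int LineNumberAt(List<int> lineEndings, int charIndex)
    {
        int pos = lineEndings.BinarySearch(charIndex);

        // When the value is not found, BinarySearch returns the bitwise
        // complement of the index of the first larger element (or of Count).
        if (pos < 0)
        {
            pos = ~pos;
        }

        // pos is now the number of line endings strictly before charIndex,
        // so the offset lies on line pos + 1.
        return pos + 1;
    }
}
```

Usage would look something like `int line = LineIndex.LineNumberAt(lineEndings, match.Index);` for each regex match. Each lookup is O(log n) in the number of lines, compared with the linear scan that First performs over the dictionary.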