Rolling Hash Algorithm for Rabin-Karp Algorithm Java -
i have been trying understand rabin-karp algorithm algorithms class. having alot of trouble understanding tried implement (i don't have implement it). think understand except rolling hash function. algorithm works if pattern char[] matches @ beginning of text char[]. cannot figure out rolling hash going wrong. if please point me in direction of error greatful.
results text "my test string" pattern "my" - comes matched pattern "test" - shows not matched
private static int int_mod(int a, int b) { return (a%b +b)%b; } public static int rabin_karp(char[] text, char[] pattern) { int textsize = text.length; int patternsize = pattern.length; int base = 257; int primemod = 1000000007; if(textsize < patternsize) return -1;n int patternhash = 0; for(int = 0; < patternsize; i++) patternhash += int_mod(patternhash * base + pattern[i], primemod);//this done once put method here system.out.println("patternhash: " + patternhash); //calculate value of first hash int segmenthash = 0; for(int = 0; < patternsize; i++) //remove this, duplicate segmenthash += int_mod(segmenthash * base + text[i], primemod); system.out.println("segmenthash: " + segmenthash); boolean firstmatch = false; if(segmenthash == patternhash) { firstmatch = true; for(int i=0; i<pattern.length; i++) { if(pattern[i] != text[i]) firstmatch = false; } } if (firstmatch == true) { return 0; } for(int i=1; i<textsize - patternsize; i++) { segmenthash += int_mod(segmenthash * base + text[i + pattern.length -1],primemod); segmenthash -= int_mod(segmenthash * base + text[i-1], primemod); system.out.println("segmenthash: " + segmenthash); if(segmenthash == patternhash) { firstmatch = true; for(int j=0; j<pattern.length; j++) { if(pattern[j] != text[j]) firstmatch = false; } } if (firstmatch == true) { return i; } } return -1; }
you have couple of fundamental problems in code.
first 1 here: patternhash += int_mod(patternhash * base + pattern[i], primemod);
duplicated on few more places.
second 1 in calculating rowing hash:
segmenthash += int_mod(segmenthash * base + text[i + pattern.length -1],primemod); segmenthash -= int_mod(segmenthash * base + text[i-1], primemod);
both errors can fixed. however, suggest understand better logic behind code instead of copying somewhere. hashing algorithm you've used based on polynomials, try see happens on level. maybe write few examples hand -- useful while debugging code.
note have problems integer overflowing:
- int can store numbers ~2 billion;
- prime module ~1 billion hashes (patternhash
, segmenthash
in particular) can number;
- base int base = 257;
thus, expression segmenthash * base
can ~257 billion, surely integer overflow.
Comments
Post a Comment