java - Efficient SAX Handling
I have a series of XML files containing postcodes with their corresponding latitude and longitude, like so:
<?xml version="1.0"?>
<postcodes>
    <entry postcode='ab1 0aa' latitude='7.101478' longitude='2.242852' />
    <entry postcode='ab1 0ab' latitude='7.201458' longitude='2.122952' />
</postcodes>
The XML files are split by the postcode's beginning letter, so there is one XML file for each letter of the alphabet. Between them, they hold every postcode in the UK, which means the largest of these XML files has 300,000 entry elements.
I am looping through a list of entity objects and putting their postcodes through SAX to retrieve the longitude and latitude values for each postcode. So, if I have 2000 entity objects, I am getting the SAX handler to run 2000 times to retrieve these values. The code for the loop is below:
em = emf.createEntityManager();
for (Integer id : siteId) {
    site = em.find(SiteTable.class, id);
    if (site != null && site.getPostcode() != null && !site.getPostcode().equals("")) {
        // Each call parses the XML file again for a single postcode.
        xmlPositionRetriever.runXmlQuery(site.getPostcode());
    } else {
        System.out.println("The site and/or postcode against this instruction does not exist.");
    }
}
em.close();
site.getPostcode() becomes postcodeToFind in the handler. The code for the SAX handler method being used is below:
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    if (postcodeToFind.equals(attributes.getValue("postcode"))) {
        System.out.println("The postcode '" + postcodeToFind + "', has a latitude of "
                + attributes.getValue("latitude") + ", and a longitude of " + attributes.getValue("longitude"));
        // The entry has been found, so abort the parse instead of reading the rest of the file.
        throw new SAXException();
    }
}
Currently this is time consuming (it takes under 4 minutes for 2000 searches), but I need the load times to be fast, under 30 seconds preferably. So far, I have managed to cut the load times down to below half by:
- Cutting down the number of times the handler has to run to the essential number of times (by reducing the number of entities needing to be checked).
- Making the startElement() method throw an exception once the data I need has been found, so that it doesn't continue to search unnecessarily.
- Breaking the XML files into smaller files (one for each letter of the alphabet), so the handler has fewer elements to check per file.
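For illustration, the early-exit idea from the second bullet might be sketched as follows. This is only a minimal sketch, not the code actually in use: the class name EarlyExitSketch, the StopParsingException marker type and the findOne method are hypothetical names made up for the example. Using a dedicated exception subclass lets the caller tell the deliberate stop apart from a genuine parse error.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EarlyExitSketch {

    /** Hypothetical marker exception: means "found it, stop parsing", not a real error. */
    static class StopParsingException extends SAXException {
        StopParsingException() {
            super("postcode found, parsing stopped early");
        }
    }

    /** Returns {latitude, longitude} for the postcode, or null if the document does not contain it. */
    public static String[] findOne(InputSource xml, final String postcodeToFind) throws Exception {
        final String[][] result = new String[1][];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes)
                    throws SAXException {
                if (postcodeToFind.equals(attributes.getValue("postcode"))) {
                    result[0] = new String[] {
                        attributes.getValue("latitude"), attributes.getValue("longitude") };
                    throw new StopParsingException(); // skip the rest of the file
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        } catch (StopParsingException expected) {
            // Deliberate early exit; any other SAXException still propagates to the caller.
        }
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><postcodes>"
                + "<entry postcode='ab1 0aa' latitude='7.101478' longitude='2.242852' />"
                + "<entry postcode='ab1 0ab' latitude='7.201458' longitude='2.122952' />"
                + "</postcodes>";
        String[] location = findOne(new InputSource(new StringReader(xml)), "ab1 0aa");
        System.out.println(location == null ? "not found" : location[0] + ", " + location[1]);
    }
}
```

Note that this still costs one parse per postcode; it only shortens each parse on average.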
Q: Does anyone have any other suggestions for more efficient SAX handling?
If you can pass all the postal codes for which you want to retrieve the geo location to the handler, the handler can retrieve them in one go. A SAX handler doing this could look like this:
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDemo extends DefaultHandler {

    // One entry per requested postal code; the Location is filled in during parsing.
    private Map<String, Location> postalCodeMap;

    static class Location {
        String latitude;
        String longitude;
    }

    public SaxDemo(List<String> postalCodes) {
        this.postalCodeMap = new HashMap<String, SaxDemo.Location>();
        for (String postalCodeToLookFor : postalCodes) {
            this.postalCodeMap.put(postalCodeToLookFor, new Location());
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        String postcodeOfElem = attributes.getValue("postcode");
        // A single hash lookup per element decides whether this entry was requested.
        if (postcodeOfElem != null && this.postalCodeMap.containsKey(postcodeOfElem)) {
            Location loc = this.postalCodeMap.get(postcodeOfElem);
            loc.latitude = attributes.getValue("latitude");
            loc.longitude = attributes.getValue("longitude");
        }
    }

    public Location getLocationForPostalCode(String postalCode) {
        return this.postalCodeMap.get(postalCode);
    }

    public Map<String, Location> getAllFoundGeoLocations() {
        return this.postalCodeMap;
    }
}
Here you pass a list of Strings to the constructor of the handler and then let the handler parse the document with your XML data. After parsing has completed, all retrieved geo locations can be found in the postalCodeMap.
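To show how such a one-pass lookup could be driven, here is a rough, self-contained sketch. It is not the handler class from this answer but a compact stand-in: the class name BatchLookupSketch and the methods groupByFirstLetter and lookup are made-up names for illustration, and the XML is read from a String only to keep the example runnable. The idea is to group the wanted postcodes by first letter, so that each per-letter file is parsed once for the whole group instead of once per postcode.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BatchLookupSketch {

    /** Groups postcodes by lower-cased first letter, one group per XML file. */
    public static Map<Character, List<String>> groupByFirstLetter(Collection<String> postcodes) {
        Map<Character, List<String>> groups = new HashMap<>();
        for (String postcode : postcodes) {
            if (postcode == null || postcode.isEmpty()) {
                continue; // mirrors the null/empty check in the question's loop
            }
            char letter = Character.toLowerCase(postcode.charAt(0));
            groups.computeIfAbsent(letter, k -> new ArrayList<>()).add(postcode);
        }
        return groups;
    }

    /** Parses the document once and returns {latitude, longitude} for every wanted postcode found. */
    public static Map<String, String[]> lookup(InputSource xml, Collection<String> wanted) throws Exception {
        final Set<String> wantedSet = new HashSet<>(wanted);
        final Map<String, String[]> found = new HashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                String postcode = attributes.getValue("postcode");
                if (postcode != null && wantedSet.contains(postcode)) {
                    found.put(postcode, new String[] {
                        attributes.getValue("latitude"), attributes.getValue("longitude") });
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the "a" file; real code would open the file matching each group's letter.
        String xmlForA = "<?xml version=\"1.0\"?><postcodes>"
                + "<entry postcode='ab1 0aa' latitude='7.101478' longitude='2.242852' />"
                + "<entry postcode='ab1 0ab' latitude='7.201458' longitude='2.122952' />"
                + "</postcodes>";
        List<String> wanted = Arrays.asList("ab1 0ab", "ab1 0aa", "b1 1aa");
        for (Map.Entry<Character, List<String>> group : groupByFirstLetter(wanted).entrySet()) {
            if (group.getKey() != 'a') {
                continue; // only the sample "a" document exists in this sketch
            }
            Map<String, String[]> found = lookup(new InputSource(new StringReader(xmlForA)), group.getValue());
            for (Map.Entry<String, String[]> e : found.entrySet()) {
                System.out.println(e.getKey() + " -> " + e.getValue()[0] + ", " + e.getValue()[1]);
            }
        }
    }
}
```

With 2000 entities this would mean at most one parse per letter file rather than 2000 parses, at the cost of no longer being able to stop early unless the handler also counts how many wanted postcodes are still outstanding.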