java - Efficient SAX Handling


I have a series of XML files containing postcodes and their corresponding latitude and longitude, like so:

    <?xml version="1.0"?>
    <postcodes>
        <entry postcode='ab1 0aa' latitude='7.101478' longitude='2.242852' />
        <entry postcode='ab1 0ab' latitude='7.201458' longitude='2.122952' />
    </postcodes>

The XML files are split by the first letter of the postcode, so there is one XML file for each letter of the alphabet. Between them, they contain every postcode in the UK, which means the largest of these XML files has 300,000 entry elements.

I am looping through a list of entity objects and putting their postcodes through SAX to retrieve the longitude and latitude values for each postcode. So, if I have 2000 entity objects, I am getting the SAX handler to run 2000 times to retrieve these values. The code for the loop is below:

    em = emf.createEntityManager();
    for (Integer id : siteId) {
        site = em.find(SiteTable.class, id);
        // Only run the XML lookup for sites that actually have a postcode.
        if (site != null && site.getPostcode() != null && !site.getPostcode().equals("")) {
            xmlPositionRetriever.runXmlQuery(site.getPostcode());
        } else {
            System.out.println("The site and/or postcode against this instruction does not exist.");
        }
    }
    em.close();

site.getPostcode() becomes postcodeToFind in the handler. The code for the only SAX handler method being used is below:

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if (postcodeToFind.equals(attributes.getValue("postcode"))) {
            System.out.println("The postcode '" + postcodeToFind + "', has a latitude of "
                    + attributes.getValue("latitude") + " and a longitude of " + attributes.getValue("longitude"));
            // The entry has been found, so abort parsing instead of reading the rest of the file.
            throw new SAXException();
        }
    }

Currently this is time consuming (it takes under 4 minutes for 2000 searches), but I need the load times to be fast, under 30 seconds preferably. So far, I have managed to cut the load times down to below half by:

  • Cutting down the number of times the handler has to run to the essential number of times (by reducing the number of entities needing to be checked).
  • Making the startElement() method throw an exception once the data I need has been found, so that it doesn't continue to search unnecessarily.
  • Breaking the XML files up into smaller files (one for each letter of the alphabet), so that the handler has fewer elements to check per file.
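The second point in the list above (stopping the parse as soon as the entry is found) can be sketched as follows. This is only an illustration under assumptions: StopParsingException, SinglePostcodeHandler and EarlyExitLookup are hypothetical names, and the XML is passed in as a string for brevity. Using a dedicated exception subclass lets the caller tell the deliberate early exit apart from a genuine parse error.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical marker exception: means "found it, stop parsing", not a real error.
class StopParsingException extends SAXException {
    StopParsingException() {
        super("postcode found, parsing stopped early");
    }
}

// Hypothetical handler that looks for one postcode and stops at the first match.
class SinglePostcodeHandler extends DefaultHandler {
    private final String postcodeToFind;
    String latitude;   // stays null if the postcode is not in the document
    String longitude;

    SinglePostcodeHandler(String postcodeToFind) {
        this.postcodeToFind = postcodeToFind;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (postcodeToFind.equals(attributes.getValue("postcode"))) {
            latitude = attributes.getValue("latitude");
            longitude = attributes.getValue("longitude");
            throw new StopParsingException();
        }
    }
}

class EarlyExitLookup {
    // Returns {latitude, longitude}, or null if the postcode is not present.
    static String[] find(String xml, String postcode) {
        SinglePostcodeHandler handler = new SinglePostcodeHandler(postcode);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsingException expected) {
            // Deliberate early exit: the entry was found.
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Anything else is a genuine failure and is reported to the caller.
            throw new IllegalStateException("could not parse postcode XML", e);
        }
        return handler.latitude == null
                ? null
                : new String[] {handler.latitude, handler.longitude};
    }
}
```

Calling EarlyExitLookup.find(xml, "ab1 0aa") on the sample document at the top returns the latitude and longitude strings of that entry; a postcode that is not in the document yields null after the whole document has been read.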

Q: Does anyone have any other suggestions for more efficient SAX handling?

If you can pass all the postal codes for which you want to retrieve the geo location to the handler, the handler can retrieve them all in one go. A SAX handler doing this could look like this:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class SaxDemo extends DefaultHandler {

        // Maps each requested postal code to its location (filled in while parsing).
        private Map<String, Location> postalCodeMap;

        static class Location {
            String latitude;
            String longitude;
        }

        public SaxDemo(List<String> postalCodes) {
            this.postalCodeMap = new HashMap<String, SaxDemo.Location>();
            // Register every requested postal code up front with an empty location.
            for (String postalCodeToLookFor : postalCodes) {
                this.postalCodeMap.put(postalCodeToLookFor, new Location());
            }
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            String postcodeOfElem = attributes.getValue("postcode");
            // Only elements whose postcode was requested are of interest.
            if (postcodeOfElem != null && this.postalCodeMap.containsKey(postcodeOfElem)) {
                Location loc = this.postalCodeMap.get(postcodeOfElem);
                loc.latitude = attributes.getValue("latitude");
                loc.longitude = attributes.getValue("longitude");
            }
        }

        public Location getLocationForPostalCode(String postalCode) {
            return this.postalCodeMap.get(postalCode);
        }

        public Map<String, Location> getAllFoundGeoLocations() {
            return this.postalCodeMap;
        }
    }

Here you pass a list of strings to the constructor of the handler and then let the handler parse the document with your XML data. After parsing has completed, all the retrieved geo locations can be found in postalCodeMap.
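To show how such a handler might be driven, here is a minimal, self-contained sketch. It is an assumption-laden illustration, not the code above verbatim: BatchPostcodeHandler and BatchLookup are hypothetical names, the handler is a compact variant that stores latitude and longitude as a two-element string array and records only the postcodes it actually finds, and the XML is supplied as a string so the example needs no files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical compact variant of the handler: collects {latitude, longitude}
// for every wanted postcode during a single pass over the document.
class BatchPostcodeHandler extends DefaultHandler {
    private final Set<String> wanted;
    final Map<String, String[]> found = new HashMap<>();

    BatchPostcodeHandler(List<String> postcodes) {
        this.wanted = new HashSet<>(postcodes);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        String postcode = attributes.getValue("postcode");
        if (postcode != null && wanted.contains(postcode)) {
            found.put(postcode, new String[] {
                attributes.getValue("latitude"),
                attributes.getValue("longitude")
            });
        }
    }
}

class BatchLookup {
    // Parses the document once and returns the locations of all postcodes that were found.
    static Map<String, String[]> lookup(String xml, List<String> postcodes) {
        BatchPostcodeHandler handler = new BatchPostcodeHandler(postcodes);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse postcode XML", e);
        }
        return handler.found;
    }
}
```

With real files, SAXParser.parse(File, DefaultHandler) can be used instead of the string-based InputSource. Given the per-letter files from the question, the postcodes could also be grouped by their first letter beforehand, so that each file is parsed at most once rather than once per entity.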

