Bad performance compared to MySql with Neo4j -


i migrated mysql database neo4j , tested simple request. surprised see equivalent request 10 100 times longer in neo4j in mysql. i'm working on neo4j 2.0.1.

in original mysql schema have 3 following tables :

  • countries : containing 'code', 'continent_id' , 'selected' boolean,
  • cities : containing 'country_code', 'name' , 'status' boolean,
  • theaters : containing 'city_id' , 'public' boolean,

with indexes on each property. want display theater count city given continent several conditions. request :

select count(*) nb, c.name  `cities` c left join theaters t on c.id = t.city_id  c.country_code in    (select code countries selected true , continent_id = 4)  , c.status=1 , t.public = 1  group c.name  order nb desc 


database schema in neo4j following :

(:continent)-[:include]->(:country{selected:bool})-[:include]->(:city{name:string, status:bool})-[:include]->(:theater{public:bool})

there's index defined on each property. cypher request :

match (:continent{code: 4})-[:include]->(:country{selected:true})-[:include]->(city:city{status:true})-[:include]->(:theater{public: true}) return city.name, count(*) nb order nb desc 


there 70.000 cities , 140.000 theaters in each database.

on continent id 4 mysql request took 0.02s whereas neo4j took 0.4s. moreover, if introduce in cypher request variable relation length between country , city (...(:country{selected:true})-[:include*..3]->(city:city{status:true})...) want able add intermediate levels regions, there request take more 2 seconds.

i'm aware in particular case there no benefit use neo4j instead of mysql, expect see approximatively equivalent performance between 2 technologies, , want take advantage of neo4j geographical hierarchy capabilities.

am missing or limitation of neo4j?

thank answers.

edit : first you'll find database dump files here. neo4j server configuration 1 out of box. work in ruby environment, , use neography gem. run neo4j server separatly since i'm not on jruby, sends cypher request through rest api.

the database contains 244 countries, 69000 cities , 138,000 theaters. continent_id 4, there 46,982 cities (37,210 have status boolean set true) , 74,420 theaters.

the request returned 2256 rows. @ third run, took 338 ms. here request output profiling infos :

profile match (:continent{code: 4})-[:include]->(country:country{selected:true})-[:include*..1]->(city:city{status:true})-[:include]->(theater:theater{public: true}) return city.name, count(*) nb order nb desc;  ==> columnfilter(symkeys=["city.name", "  internal_aggregate85ca19f3-9421-4c18-a449-1097e3deede2"], returnitemnames=["city.name", "nb"], _rows=2256, _db_hits=0) ==> sort(descr=["sortitem(cached(  internal_aggregate85ca19f3-9421-4c18-a449-1097e3deede2 of type integer),false)"], _rows=2256, _db_hits=0) ==>   eageraggregation(keys=["cached(city.name of type any)"], aggregates=["(  internal_aggregate85ca19f3-9421-4c18-a449-1097e3deede2,countstar())"], _rows=2256, _db_hits=0) ==>     extract(symkeys=["city", "  unnamed27", "  unnamed7", "country", "  unnamed113", "theater", "  unnamed72"], exprkeys=["city.name"], _rows=2257, _db_hits=2257) ==>       filter(pred="(haslabel(theater:theater(3)) , property(theater,public(5)) == true)", _rows=2257, _db_hits=2257) ==>         simplepatternmatcher(g="(city)-['  unnamed113']-(theater)", _rows=2257, _db_hits=4514) ==>           filter(pred="(((haslabel(city:city(2)) , haslabel(city:city(2))) , property(city,status(4)) == true) , property(city,status(4)) == true)", _rows=2257, _db_hits=74420) ==>             traversalmatcher(start={"label": "continent", "query": "literal(4)", "identifiers": ["  unnamed7"], "property": "code", "producer": "schemaindex"}, trail="(  unnamed7)-[  unnamed27:include (((haslabel(nodeidentifier():country(1)) , haslabel(nodeidentifier():country(1))) , property(nodeidentifier(),selected(3)) == true) , property(nodeidentifier(),selected(3)) == true) , true]->(country)-[:include*1..1]->(city)", _rows=37210, _db_hits=37432) 

you're right, tried myself , got down 100ms query.

 match (:continent{code: 4})-[:include]->        (country:country{selected:true})-[:include]->        (city:city{status:true})-[:include]->        (theater:theater{public: true})   return city.name, count(*) nb   order nb desc;  | "forbach"                       | 1  | | "stuttgart"                     | 1  | | "mirepoix"                      | 1  | | "bonnieux"                      | 1  | | "saint cyprien plage"           | 1  | | "crissay sur manse"             | 1  | +--------------------------------------+ 2256 rows **85 ms** 

please note cypher of 2.0.x has not yet performance optimized, work started in neo4j 2.1 , continue until 2.3. there more performance work planned in kernel speed things up.

i implemented solution in java , got down 19ms. of course not pretty that's aiming cypher too:

class city {     node city;     int count = 1;      public city(node city) {         this.city = city;     }      public void inc() { count++; }      @override     public string tostring() {         return string.format("city{city=%s, count=%d}", city.getproperty("name"), count);     } }  private list<?> queryjava3() {     long start = system.currenttimemillis();     node continent = iteratorutil.single(db.findnodesbylabelandproperty(continent, "code", 4));     map<node,city> result = new hashmap<>();     (relationship rel1 : continent.getrelationships(direction.outgoing,include)) {         node country = rel1.getendnode();         if (!(country.haslabel(country) && (boolean) country.getproperty("selected", false))) continue;         (relationship rel2 : country.getrelationships(direction.outgoing, include)) {             node city = rel2.getendnode();             if (!(city.haslabel(city) && (boolean) city.getproperty("status", false))) continue;             (relationship rel3 : city.getrelationships(direction.outgoing, include)) {                 node theater = rel3.getendnode();                 if (!(theater.haslabel(theater) && (boolean) theater.getproperty("public", false))) continue;                 city city1 = result.get(city);                 if (city1==null) result.put(city,new city(city));                 else city1.inc();             }         }     }     list<city> list = new arraylist<>(result.values());     collections.sort(list, new comparator<city>() {         @override         public int compare(city o1, city o2) {             return integer.compare(o2.count,o1.count);         }     });     output("java", start, list.iterator());     return list; }   java time = 19ms first = city{city=val de meuse, count=1} total-count 22561 

Comments

Popular posts from this blog

apache - Remove .php and add trailing slash in url using htaccess not loading css -

javascript - jQuery show full size image on click -