Java - Google App Engine - modelling graph structures in Google Datastore


Google App Engine offers the Google Datastore NoSQL database (I think it is based on Bigtable).

In my application I have a social-like data structure, and I want to model it as I would in a graph database. The application must save heterogeneous objects (users, files, ...) and the relationships among them (such as user1 owns file2, user2 follows user3, and so on).

I'm looking for a good way to model this typical situation, and I have thought of two families of solutions:

  1. List-based solutions: each object contains a list of other related objects, and an object's presence in the list is itself the relationship (as Google describes in the JDO part of the docs: https://developers.google.com/appengine/docs/java/datastore/jdo/relationships).

  2. Graph-based solutions: both nodes and relationships are objects. Objects exist independently of relationships, while each relationship contains references to the two (or more) connected objects.
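
For concreteness, the two families might look roughly like this in plain Java. This is only an in-memory sketch, not JDO or the Datastore API; the class names, field names, and the "OWNS"/"FOLLOWS" labels are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Approach 1 (list-based): the relationship is implied by membership
// in a collection stored on the node itself.
final class ListUser {
    final String id;
    final Set<String> ownedFileIds = new LinkedHashSet<>();    // grows with every "owns" edge
    final Set<String> followedUserIds = new LinkedHashSet<>(); // grows with every "follows" edge

    ListUser(String id) {
        this.id = id;
    }
}

// Approach 2 (graph-based): the relationship is a record of its own
// and can carry properties.
final class Relationship {
    final String type;   // hypothetical labels such as "OWNS" or "FOLLOWS"
    final String fromId; // id of the source node
    final String toId;   // id of the target node
    final String note;   // example of a per-relationship property

    Relationship(String type, String fromId, String toId, String note) {
        this.type = type;
        this.fromId = fromId;
        this.toId = toId;
        this.note = note;
    }
}

final class GraphModelSketch {
    // Linear scan for illustration only; a real store would answer this from an index.
    static List<String> targetsOf(List<Relationship> all, String fromId, String type) {
        List<String> result = new ArrayList<>();
        for (Relationship r : all) {
            if (r.fromId.equals(fromId) && r.type.equals(type)) {
                result.add(r.toId);
            }
        }
        return result;
    }
}
```

In the first model, adding an edge mutates the node; in the second, adding an edge creates a new record and leaves both nodes untouched.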

What are the strong and weak points of these two approaches?

About approach 1: this is the simplest approach I can think of, and it is the one presented in the official documentation, but:

  • Each directed relationship makes the object record grow: are there limits on the number of possible relationships, given the size limit on an object instance?
  • Is this a JDO feature, or does the Datastore structure allow this approach to be implemented naturally?
  • The relationship search time increases with the list length. Is this solution suitable for a large number (millions) of relationships?

About approach 2: each relationship can have a higher level of characterization (it is an object and can have properties). And I think memory size is not a problem for Google, but:

  • Each relationship requires its own record, so the search time for each related pair will increase as the total number of relationships increases. Is this suitable for a large number of relationships (millions, billions)? I.e., does Google have tricks to search among records if they are well structured? Or will I always be in a situation in which, if I want to search for a friend of user1 called user4, I have to wait seconds?
  • On the other side, each object doesn't increase in size as new relationships are added.

Could you help me find other important points on the two approaches, so that I can choose the best model?

First, search time in the Datastore does not depend on the number of entities you store, only on the number of entities you retrieve. Therefore, if you need to find one relationship object out of a billion, it will take the same time as if you had just one object.

Second, the list approach has a serious limitation called "exploding indexes". You have to index the property that contains the list to make it searchable. If you ever use a query that references more than just this property, you will run into this issue. Google it to understand the implications.
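
To get a feel for the arithmetic behind exploding indexes: when a composite index covers more than one list property, one index entry is written per combination of values, so the count for a single entity is the product of the list sizes. The helper below is a rough sketch of that multiplication only; the method name and the example sizes are made up for illustration.

```java
final class ExplodingIndexSketch {
    // Number of composite index entries for ONE entity, given how many values
    // each indexed property holds (1 for a single-valued property).
    static long compositeIndexEntries(int... valuesPerProperty) {
        long entries = 1;
        for (int n : valuesPerProperty) {
            if (n < 1) {
                throw new IllegalArgumentException("each property needs at least one value");
            }
            // multiplyExact throws ArithmeticException instead of silently overflowing
            entries = Math.multiplyExact(entries, (long) n);
        }
        return entries;
    }
}
```

For example, a user entity with 1,000 followed users and 100 owned files indexed together would need `compositeIndexEntries(1000, 100)` = 100,000 index entries, all rewritten whenever that entity is saved.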

Third, the list approach is much more expensive. Every time you add a new relationship, you rewrite the entire entity at considerable writing cost. Reading costs are also higher if you cannot use keys-only queries. With the object approach you can use keys-only queries to find relationships, and such queries are free.

Update:

If your relationships are directed, you may consider making relationship entities children of user entities, and using the object id as the id of the relationship entity as well. Then the relationship entity will have no properties at all, which is probably the most cost-efficient solution. You will be able to retrieve all objects owned by a user using keys-only ancestor queries.
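
The idea can be illustrated with a small in-memory analogy. This is not the Datastore API: a sorted set of key paths stands in for the key-ordered storage, and the `User/<userId>/Owns/<fileId>` layout and all names are assumptions for the sketch. The point is that the relationship lives entirely in the key, so finding a user's files is a range scan over keys that share the parent prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class AncestorKeySketch {
    // Sorted set of key paths, standing in for key-ordered storage.
    private final TreeSet<String> keys = new TreeSet<>();

    // Hypothetical key layout: User/<userId>/Owns/<fileId>.
    // The "entity" has no properties; the key alone encodes the relationship.
    // Assumes ids never contain '/'.
    void addOwns(String userId, String fileId) {
        keys.add("User/" + userId + "/Owns/" + fileId);
    }

    // Analogue of a keys-only ancestor query: scan only the keys that start
    // with the parent's prefix, regardless of how many other keys exist.
    List<String> filesOwnedBy(String userId) {
        String prefix = "User/" + userId + "/Owns/";
        List<String> fileIds = new ArrayList<>();
        for (String key : keys.subSet(prefix, prefix + Character.MAX_VALUE)) {
            fileIds.add(key.substring(prefix.length()));
        }
        return fileIds;
    }
}
```

In the low-level Java Datastore API, the corresponding query would presumably be built with `Query.setAncestor(userKey)` and `Query.setKeysOnly()`; the sketch above only mirrors the access pattern.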

