Monday, October 3, 2011

Option 2

Moving on to option 2.  Let's try to make one model that will contain all of the family relationships whether they're bidirectional (marriages) or unidirectional (children/parents).

create_table :people do |t|
  t.string :first_name
  t.string :last_name
end
create_table :relationships do |t|
  t.integer :parent_id
  t.integer :child_id
  t.string :name
  t.string :relationship_type
end

We don't want to maintain reciprocal relationships like we did in our last modeling option, so let's create some data without the reciprocal marriage entries for now and see if we can get it to work.  We still have to add two entries for each child though, one per parent.

We know from the last exercise that we need to specify a foreign_key and two different relationships for children and parents in our Person model.  For the Relationship model, let's carry over the get_parent_info and get_child_info associations from the last exercise and see what else we need after we've created the first version of the models.


class Person < ActiveRecord::Base
  has_many :children, :class_name => "Relationship", :foreign_key => "parent_id"
  has_many :parents, :class_name => "Relationship", :foreign_key => "child_id"
end

class Relationship < ActiveRecord::Base
  belongs_to :get_parent_info, :class_name => "Person", :foreign_key => "parent_id"
  belongs_to :get_child_info, :class_name => "Person", :foreign_key => "child_id"
end

 Let's add some data:
03_fd - MySQL dump of version 3 of the family_development example database (people and relationships, duplicate child relationships, but non-reciprocal marriages).  Note that marriages are coded with the male first (this is standard pedigree convention).

Starting up our rails console we can test our previous methods:

> brian = Person.find_by_first_name("Brian")
> brian.parents.each do |parent|
     puts parent.get_parent_info.first_name
   end
Bob
Betty
=>  [#<Relationship id: 1, parent_id: 1, child_id: 3, name: "son", relationship_type: "child">, #<Relationship id: 3, parent_id: 2, child_id: 3, name: "son", relationship_type: "child">]

Great, so parents works just as we'd expect!

>brian.children.each do |child|
    puts child.get_child_info.first_name
  end
Dan
Daphne
 => [#<Relationship id: 6, parent_id: 3, child_id: 7, name: "son", relationship_type: "child">, #<Relationship id: 10, parent_id: 3, child_id: 4, name: "marriage", relationship_type: "spouse">]

Uh oh!  Now we're getting that Daphne is Brian's child because the :children association doesn't filter on the type of relationship, so the marriage row (where Brian is in parent_id) comes back too.  Let's fix that.


class Person < ActiveRecord::Base
  has_many :children, :class_name => "Relationship", :foreign_key => "parent_id", :conditions => { :relationship_type => "child" }
  has_many :parents, :class_name => "Relationship", :foreign_key => "child_id", :conditions => { :relationship_type => "child" }
end


>brian.children.each do |child|
    puts child.get_child_info.first_name
  end
Dan
 => [#<Relationship id: 6, parent_id: 3, child_id: 7, name: "son", relationship_type: "child">]
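To see why the :conditions filter matters, here is a minimal plain-Ruby sketch (no Rails needed) that mimics the two lookups.  The Struct and the helper names children_unfiltered and children_filtered are made up for illustration; the two rows are the ones from the console output above.

```ruby
# Plain-Ruby stand-in for the relationships table (illustrative only).
Relationship = Struct.new(:id, :parent_id, :child_id, :name, :relationship_type)

# Brian (3) -> Dan (7), and the Brian (3) / Daphne (4) marriage, husband first.
ROWS = [
  Relationship.new(6, 3, 7, "son", "child"),
  Relationship.new(10, 3, 4, "marriage", "spouse")
]

# First version of :children: any row with the person in parent_id.
def children_unfiltered(rows, person_id)
  rows.select { |r| r.parent_id == person_id }
end

# Second version: also require relationship_type == "child".
def children_filtered(rows, person_id)
  children_unfiltered(rows, person_id).select { |r| r.relationship_type == "child" }
end

p children_unfiltered(ROWS, 3).map(&:child_id)  # => [7, 4]  (Dan and Daphne)
p children_filtered(ROWS, 3).map(&:child_id)    # => [7]     (Dan only)
```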

If we try this with Daphne, let's see what we get

> daphne = Person.find_by_first_name("Daphne")
> daphne.children.each do |child|
       puts child.get_child_info.first_name
   end
Dan
 => [#<Relationship id: 6, parent_id: 3, child_id: 7, name: "son", relationship_type: "child">]

> daphne.parents.each do |parent|
     puts parent.get_parent_info.first_name
   end
=> []

So Daphne doesn't have any parents currently.  We could add two child relationships between her and Bob and Betty and indicate that the relationship_type was 'child' and the name was 'daughter-in-law', but for marriages it's probably easier to just get to Bob and Betty via something like daphne.husband.parents.  For adopted children, you'd want to add another field to the model that would be adopted_status and then have a method called is_adopted that would return that flag.
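Here is a rough plain-Ruby sketch of the daphne.husband.parents idea.  It is not the ActiveRecord code; the Struct and the helpers parent_ids, husband_ids, and parents_in_law_ids are hypothetical names, and the ids come from the console output above (Bob 1, Betty 2, Brian 3, Daphne 4).

```ruby
NAMES = { 1 => "Bob", 2 => "Betty", 3 => "Brian", 4 => "Daphne" }

# Illustrative stand-in for rows in the relationships table.
Rel = Struct.new(:parent_id, :child_id, :relationship_type)

RELS = [
  Rel.new(1, 3, "child"),   # Bob -> Brian
  Rel.new(2, 3, "child"),   # Betty -> Brian
  Rel.new(3, 4, "spouse")   # Brian / Daphne, husband first
]

# Parents of a person: "child" rows where the person is the child_id.
def parent_ids(person_id)
  RELS.select { |r| r.relationship_type == "child" && r.child_id == person_id }
      .map(&:parent_id)
end

# Husbands of a person: "spouse" rows where the person is the child_id.
def husband_ids(person_id)
  RELS.select { |r| r.relationship_type == "spouse" && r.child_id == person_id }
      .map(&:parent_id)
end

# Reach the in-laws by going through the husband instead of adding rows.
def parents_in_law_ids(person_id)
  husband_ids(person_id).flat_map { |husband_id| parent_ids(husband_id) }
end

p parent_ids(4)                                  # => []
p parents_in_law_ids(4).map { |id| NAMES[id] }   # => ["Bob", "Betty"]
```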

Husbands and Wives

 Ok, so now parents and children are working fine, but as we can see we need to be able to do daphne.husband and brian.wife.  Since we're not adding the duplicate relationships into the database and we have a convention where the husband is always the parent_id and the wife is always the child_id, then we can do this with two has_many relationships:

class Person < ActiveRecord::Base
  has_many :parents, :class_name => "Relationship", :foreign_key => "child_id", :conditions => { :relationship_type => "child" }
  has_many :children, :class_name => "Relationship", :foreign_key => "parent_id", :conditions => { :relationship_type => "child" }
  has_many :husbands, :class_name => "Relationship", :foreign_key => "child_id", :conditions => { :relationship_type => "spouse" }
  has_many :wives, :class_name => "Relationship", :foreign_key => "parent_id", :conditions => { :relationship_type => "spouse" }
end

We know that we need to specify the relationship_type is a spouse.  We know that the foreign_key for husbands is child_id and that the foreign_key for wives is parent_id.  If we wanted to have a generic method to call on a person to see if that person has a spouse (either husband or wife) then we could create a method called spouses:

  def spouses
    # Combine the husband-side and wife-side relationships into one array
    husbands + wives
  end

However the caveat with this is that you have to know what kind of a relationship it is in order to get the information about the person from the Person table.  What happens if Barbie divorced Max and entered into a civil union with a woman named Tammy?  If you're not careful about which woman gets entered in the parent_id and which in the child_id, then you can end up with two different kinds of relationships - one where Barbie's ID is in the parent_id column and one where her ID is in the child_id column.

The relationship with her ID in the parent_id column would get her spouse using .wives and the one with her ID in the child_id column would get her spouse using .husbands.  Then in order to get the information about the person you'd have to call get_child_info and get_parent_info respectively, which makes it hard to loop over the resulting array and get sensible information without a lot of special cases in the code.  So the conclusion here is that a spouses method is a bad idea because you don't know which type of relationship you're getting returned and you don't have reciprocal relationships.
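To make the problem concrete, here is a small plain-Ruby sketch of the kind of special-case code you'd end up writing.  The partner_id helper is hypothetical (it is not part of the models in this post), and Barbie's id of 5 is an assumption; Max (6) and Tammy (9) come from the example data.

```ruby
# Illustrative stand-in for "spouse" rows in the relationships table.
SpouseRow = Struct.new(:parent_id, :child_id)

ROWS = [
  SpouseRow.new(6, 5),  # Max and Barbie (assumed id 5), husband first
  SpouseRow.new(5, 9)   # Barbie and Tammy, entered with Barbie in parent_id
]

# Hypothetical helper: return whichever id is on the other side of the row.
def partner_id(row, person_id)
  if row.parent_id == person_id
    row.child_id
  elsif row.child_id == person_id
    row.parent_id
  end
end

# Every caller has to go through the helper to get a sensible answer.
def partner_ids(rows, person_id)
  rows.map { |row| partner_id(row, person_id) }.compact
end

p partner_ids(ROWS, 5)  # => [6, 9]
```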

So what about adding the reciprocal relationships back in and condensing husbands and wives into spouses, where you always check just one of the ids?  You can set it up so that it checks either the child_id or the parent_id, as long as you're consistent in which one it checks and you use the correct method (get_child_info or get_parent_info) depending on which way you set it up (or create a get_spouse_info that is appropriate).

class Person < ActiveRecord::Base
  has_many :parents, :class_name => "Relationship", :foreign_key => "child_id", :conditions => { :relationship_type => "child" }
  has_many :children, :class_name => "Relationship", :foreign_key => "parent_id", :conditions => { :relationship_type => "child" }
  has_many :spouses, :class_name => "Relationship", :foreign_key => "parent_id", :conditions => { :relationship_type => "spouse" }
end

class Relationship < ActiveRecord::Base
  belongs_to :get_parent_info, :class_name => "Person", :foreign_key => "parent_id"
  belongs_to :get_child_info, :class_name => "Person", :foreign_key => "child_id"
  belongs_to :get_spouse_info, :class_name => "Person", :foreign_key => "child_id"
end

04_fd - MySQL dump of version 4 of the family_development example database (people and relationships, duplicate child relationships, with reciprocal marriages).


With this set up you can do:

> barbie = Person.find_by_first_name("Barbie")
> barbie.spouses.map(&:get_spouse_info)
=> [#<Person id: 9, first_name: "Tammy", last_name: "Saint">, #<Person id: 6, first_name: "Max", last_name: "Payne">, #<Person id: 8, first_name: "Terry", last_name: "Jones">]

Conclusion

So you don't have to include reciprocal relationships in the relationships table, but if you don't, you end up writing a lot of code either on the data entry side (CRUD) or on the display side (the husbands and wives calls).  If you do include the reciprocal relationships, you need extra code on the data entry side (CRUD) but not on the display side.  So is it easier to write code that always puts each person in the correct parent_id or child_id field, or code that creates, deletes, and updates a reciprocal relationship?  Either way you need to write tests to make sure that the database isn't corrupted (holding wrong information).
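As one idea for such a test, here is a plain-Ruby sketch that finds spouse rows whose mirror image is missing.  The missing_reciprocals helper is a made-up name, and the rows are reduced to [parent_id, child_id] pairs for illustration; a real check would read them from the relationships table.

```ruby
require "set"

# Given [parent_id, child_id] pairs for "spouse" rows, return the pairs
# that have no reversed counterpart in the same list.
def missing_reciprocals(pairs)
  seen = pairs.to_set
  pairs.reject { |parent_id, child_id| seen.include?([child_id, parent_id]) }
end

spouse_pairs = [
  [3, 4], [4, 3],  # Brian / Daphne entered in both directions
  [1, 2]           # Bob / Betty entered in one direction only
]

p missing_reciprocals(spouse_pairs)  # => [[1, 2]]
```

An empty result would mean every marriage has both of its rows.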

Is there another way?  Is this easier than the two models from option 1? 

Thursday, September 29, 2011

Option 1

Option 1 is to separate out the relationships by the type of relationship it is.

If we want to be able to use a tree gem, such as acts_as_tree, on the data then we may want to include the child relationship in the Person class.  However, this won't work because the child has two parents, not one.  You could try to force the model to work by making all children depend on their mother or father, but then if you wanted to draw only the side of the family that the child didn't depend on you'd have to write extra complicated code going through the Spouse model. So for now let's say that we're going to have one entry in the child table for each parent of the child (Bob -> Brian and Betty -> Brian).  Since your knowledge of which person is a child's mother or father may change it's good to have the ability to change these relationships independently (for example, if it was discovered that Bob wasn't Brian's father, but that a second man, Ted, was).

So we have one table called Person that contains all of the person's information.  Then we have one table called Spouse that contains the spouses of a Person (or people who were spouses; if Brian and Daphne eventually get divorced they'd still have a relationship in the Spouses table).  Lastly we have one table called Child that contains the children of a Person.

create_table :people do |t|
  t.string :first_name
  t.string :last_name
end
create_table :children do |t|
  t.integer :person_id
  t.integer :child_id
end
create_table :spouses do |t|
  t.integer :person_id
  t.integer :spouse_id
  t.string :spouse_type
end
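Because each parent gets its own entry in the child table, one parent can be corrected without touching the other.  Here is a minimal plain-Ruby sketch of that idea; the Struct, the replace_parent helper, and Ted's id of 10 are all assumptions made for illustration.

```ruby
# Illustrative stand-in for rows in the child table.
ChildEntry = Struct.new(:person_id, :child_id)

# Hypothetical helper: swap one parent of a child, leaving other rows alone.
def replace_parent(entries, child_id, old_parent_id, new_parent_id)
  entry = entries.find { |e| e.child_id == child_id && e.person_id == old_parent_id }
  entry.person_id = new_parent_id if entry
  entries
end

entries = [
  ChildEntry.new(1, 3),  # Bob -> Brian
  ChildEntry.new(2, 3)   # Betty -> Brian
]

# It turns out Ted (assumed id 10), not Bob, is Brian's father.
replace_parent(entries, 3, 1, 10)
p entries.map(&:to_a)  # => [[10, 3], [2, 3]]
```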



So let's model those tables in this way:

class Person < ActiveRecord::Base

  has_many :spouses
  has_many :children
end

class Spouse < ActiveRecord::Base
  belongs_to :person

end


class Child < ActiveRecord::Base
  belongs_to :person

end

Add some data:
01_fd - MySQL dump of version 1 of the family_development example database


So there are two different models to model the two different kinds of relationships.

1)  Spouses are undirected relationships.  Bob is Betty's spouse and Betty is Bob's spouse. They also have a spouse_type because the model needs to store whether they're currently married or were married, if we want to track and display that information.  You could argue that a genealogical system may not care about social relationships, but people tend to get upset if their adopted children aren't displayed on their family tree.
2)  Children are directed relationships.  Brian is Bob's child.  Brian is Betty's child.

So you could make a tree structure out of the Person and Child models, but you couldn't include the Spouse model.



Self-Referencing

In order to get to the information in the Person model from our relationship classes we need to make them self-referential.  Alter the belongs_to statements as follows.

class Person < ActiveRecord::Base
  has_many :spouses
  has_many :children
end

class Spouse < ActiveRecord::Base
  belongs_to :person, :foreign_key => "spouse_id"

end

class Child < ActiveRecord::Base
  belongs_to :person,  :foreign_key => "child_id"

end

You have to specify the foreign_key here because ActiveRecord automatically looks for a column named <association_name>_id.  In this case, because the association is named :person, it looks for person_id, which holds the owning person rather than the spouse or child.  We can either create our Spouse model so that person_id is named something different or specify the foreign key that AR should use to join back to Person.


> bob = Person.find(1)
> bob.spouses.each do |spouse|
      puts spouse.person.first_name
   end
Betty
=> [#<Spouse id: 1, person_id: 1, spouse_id: 2 >]

If we knew that Bob only had one spouse then we could do:
>bob.spouses.first.person.first_name
>bob.spouses.first.person
=> #<Person id: 2, first_name: "Betty", last_name: "Smith" >

If we wanted information about Bob's children then we could do:
> bob.children.each do |child|
     puts child.person.first_name
   end
Brian
Daphne
=> [#<Child id: 1, person_id: 1, child_id: 3 >, #<Child id: 2, person_id: 1, child_id: 4 >]


Reciprocal Relationships

The question that then comes up when you're creating the data in the database (after doing rake db:migrate to create your tables) is whether to include the reciprocal relationship in the Spouse table.  If Bob is Betty's husband, is Betty also Bob's wife?

If you don't include the reciprocal relationship then you can only find the relationship from one person.
> bob = Person.find(1)
> bob.spouses
[ #<Spouse id: 1, person_id: 1, spouse_id: 2> ]
(notice that spouses returns an array)

> betty = Person.find(2)
> betty.spouses
[]

So then the Spouse model has to have reciprocal relationships in it, which seems like a bad idea because if Max and Barbie get divorced then you have to remember to update two entries in the Spouse table.
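Here is a small plain-Ruby sketch of what "remember to update two entries" looks like if you wrap it in one helper.  The update_spouse_type name, the "married"/"divorced" values, and Barbie's id of 5 are assumptions for illustration.

```ruby
# Illustrative stand-in for rows in the spouse table.
SpouseEntry = Struct.new(:person_id, :spouse_id, :spouse_type)

# Hypothetical helper: update both directions of a marriage in one call
# and report how many rows were touched.
def update_spouse_type(entries, a_id, b_id, new_type)
  pair = [a_id, b_id].sort
  touched = entries.select { |e| [e.person_id, e.spouse_id].sort == pair }
  touched.each { |e| e.spouse_type = new_type }
  touched.size
end

entries = [
  SpouseEntry.new(6, 5, "married"),  # Max -> Barbie (assumed id 5)
  SpouseEntry.new(5, 6, "married"),  # Barbie -> Max
  SpouseEntry.new(1, 2, "married")   # Bob -> Betty
]

p update_spouse_type(entries, 6, 5, "divorced")  # => 2
p entries.map(&:spouse_type)                     # => ["divorced", "divorced", "married"]
```

If the helper ever reports 1 instead of 2, a reciprocal entry is missing.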

02_fd - MySQL dump of version 2 of the family_development example database (now including reciprocal relationships).

Parents

So we have children and spouses working, but now we want parents.  To do that we need to alter the Person model

class Person < ActiveRecord::Base
  has_many :spouses
  has_many :children
  has_many :parents, :class_name => "Child", :foreign_key => "child_id"
end

> brian = Person.find_by_first_name("Brian")
> brian.parents
=> [#<Child id: 1, person_id: 1, child_id: 3 >, #<Child id: 3, person_id: 2, child_id: 3>]

Parent Person Information

Things seem to be going pretty well here.  So let's see if we can get Brian's relationships from the current setup.  Brian is the son of Bob and the father of Dan.  Let's see if we can find Dan.

> brian = Person.find_by_first_name("Brian")
> brian.children.each do |child|
      puts child.person.first_name
   end
Dan
=> [ #<Child id:6, person_id: 3, child_id: 7>]

Ok that works!  So now let's try Brian's parents.

> brian.parents.each do |parent|
      puts parent.person.first_name
   end
Brian
Brian
=> [#<Child id: 1, person_id: 1, child_id: 3 >, #<Child id: 3, person_id: 2, child_id: 3>]

Uh oh.  So we're getting the right child objects, but we're not getting the correct person.  That's because we set up the join between Child and Person to be able to work for the children method (join on child_id).   It isn't possible to have two different relationships that are called the same thing, so we need to have two different methods to get the person information.

Since we need to split it into two methods, let's rename the original method to make it more clear what we're getting with that call.

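class Child < ActiveRecord::Base
  # :class_name is needed because the association names don't match a model
  belongs_to :get_child_info,  :class_name => "Person", :foreign_key => "child_id"
  belongs_to :get_parent_info, :class_name => "Person", :foreign_key => "person_id"
end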

> brian.parents.each do |parent|
      puts parent.get_parent_info.first_name
   end
Bob
Betty
=> [#<Child id: 1, person_id: 1, child_id: 3 >, #<Child id: 3, person_id: 2, child_id: 3>]

> brian.children.each do |child|
      puts child.get_child_info.first_name
   end
Dan
=> [ #<Child id:6, person_id: 3, child_id: 7>]


Huzzah, it works!
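If the "Brian, Brian" result above was confusing, this plain-Ruby sketch mimics what each lookup does with the same two Child rows.  It is not ActiveRecord; the Struct and its person_name, child_name, and parent_name methods are stand-ins for the associations.

```ruby
PEOPLE = { 1 => "Bob", 2 => "Betty", 3 => "Brian" }

# Illustrative stand-in for rows in the child table.
ChildRow = Struct.new(:id, :person_id, :child_id) do
  # Mimics belongs_to :person, :foreign_key => "child_id"
  def person_name
    PEOPLE[child_id]
  end

  # Mimics belongs_to :get_child_info, :foreign_key => "child_id"
  def child_name
    PEOPLE[child_id]
  end

  # Mimics belongs_to :get_parent_info, :foreign_key => "person_id"
  def parent_name
    PEOPLE[person_id]
  end
end

# brian.parents: the rows where Brian (3) is the child_id.
brian_parents = [ChildRow.new(1, 1, 3), ChildRow.new(3, 2, 3)]

p brian_parents.map(&:person_name)  # => ["Brian", "Brian"]
p brian_parents.map(&:parent_name)  # => ["Bob", "Betty"]
```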

Conclusion

Our final models are

class Person < ActiveRecord::Base
  has_many :spouses
  has_many :children
  has_many :parents, :class_name => "Child", :foreign_key => "child_id"
end

class Spouse < ActiveRecord::Base
  belongs_to :person, :foreign_key => "spouse_id"
end
 

class Child < ActiveRecord::Base
  belongs_to :get_child_info,  :class_name => "Person", :foreign_key => "child_id"
  belongs_to :get_parent_info, :class_name => "Person", :foreign_key => "person_id"
end

These models let us do relationship.get_child_info.first_name to get the child's first name and relationship.get_parent_info.first_name to get the parent's first name.  You can name the :get_child_info and :get_parent_info associations in Child (and :person in Spouse) anything that you want if doing person.children.first.get_child_info.first_name or person.parents.first.get_parent_info.first_name is not intuitive to you.  The idea is that you want the methods to chain together in a readable way though, so make sure that a chain of methods makes sense if you read it out loud.

So this seems to be a pretty good option for modeling this particular family tree.  The downside is that you have to have reciprocal relationships in the database and you have two models for the relationships (spouse and child).  So if you wanted to draw one family tree then you'd need to get all of the child information for every individual in the tree and then merge the trees based on the information in the spouse model.  That could be done, but since there are several drawbacks to this modeling of the relationships, let's see what other options we can come up with.

Wednesday, September 28, 2011

Modeling Family Relationships in Rails3

Background

We research the causes of genetic diseases in our group and the specific kinds of diseases that we focus on are family based diseases.  Therefore genealogy and how people are related are very important to us.  We do whole genome sequencing of families and analyze the data.  As we get more and more of these families, we need to design ways to manage that data.

Therefore, I'm putting together a piece of software that contains information on genealogy and family relationships.  Specifically, it's a Ruby on Rails 3 site that has a MySQL data store.  Most of our analysis code is written in Perl, so there will be a Perl client that calls the Rails API, gets JSON data back from the database, and then uses that information in the Perl script.

Relationship Modeling

I have most of the database implemented, but I'm down to the portion of relationship modeling.  This seems to be fairly tricky because people have lots of different kinds of relationships:

1) Parents
2) Children
3) Spouses/Partners/etc
4) Non-biological family structure (adopted, etc)
5) Divorces/re-marriages

I'm sure there are more than that, but these are just some of the ones that I'm attempting to model here.  I'd like to be able to use the rails gems that work on trees, such as ancestry, or sets, such as acts_as_nested_set in order to be able to process the family trees and draw them down the road.  There's one big problem with that...  marriages are not a tree structure.  Rather than one node having multiple child nodes, it's multiple parent nodes having one or many child nodes. 

Our Sample Family

Bob and Betty Smith have two children, Brian and Barbie.  Brian Smith is married to Daphne Smith and they have one child, Dan.  Barbie Smith is married to Max Payne and they have no children.  This is a pretty simple three-generation family.
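The sample family already shows the tree problem mentioned above.  Here is a quick plain-Ruby sketch, with made-up helper names parent_counts and single_parent_tree?, that counts parents per child; a tree gem expects each node to have at most one parent.

```ruby
# [parent, child] edges for the sample family.
PARENT_EDGES = [
  ["Bob", "Brian"], ["Betty", "Brian"],
  ["Bob", "Barbie"], ["Betty", "Barbie"],
  ["Brian", "Dan"], ["Daphne", "Dan"]
]

# Count how many parents each child has.
def parent_counts(edges)
  edges.each_with_object(Hash.new(0)) { |(_parent, child), counts| counts[child] += 1 }
end

# A single-parent tree needs every node to have at most one parent.
# (This is only a necessary condition; it does not check for cycles.)
def single_parent_tree?(edges)
  parent_counts(edges).values.all? { |count| count <= 1 }
end

p parent_counts(PARENT_EDGES)        # => {"Brian"=>2, "Barbie"=>2, "Dan"=>2}
p single_parent_tree?(PARENT_EDGES)  # => false
```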