Mentored Projects
If you are looking for a project to work on, have a look at some potential mentored projects below. Please feel free to suggest any other ideas. If you find any of the projects interesting or want to discuss them, please contact us on our Discord or mailing list.
1. Implement classes for representing Maximal Ancestral Graphs (MAGs) and Partial Ancestral Graphs (PAGs)
Some structure learning / causal discovery algorithms, such as Fast Causal Inference (FCI), output a PAG/MAG. Unlike Directed Acyclic Graphs (DAGs) and PDAGs (Partially Directed Acyclic Graphs), PAGs and MAGs can have four different types of edges: directed (->, <-), bidirected (<->), partially directed (o->, <-o), and non-directed (o-o). Currently, pgmpy internally uses networkx for representing DAGs and CPDAGs; however, networkx doesn't offer a straightforward way to represent these different types of edges. One potential way to represent all these edge types in networkx is to use edge attributes. For example:
```python
import networkx as nx

# directed ( X -> Y )
nx.DiGraph([('X', 'Y')])

# bidirected ( X <-> Y )
nx.DiGraph([('X', 'Y'), ('Y', 'X')])

# partially directed ( X o-> Y )
G = nx.DiGraph([('X', 'Y'), ('Y', 'X')])
G['Y']['X']['type'] = 'o'

# non-directed ( X o-o Y )
G = nx.DiGraph([('X', 'Y'), ('Y', 'X')])
G['X']['Y']['type'] = 'o'
G['Y']['X']['type'] = 'o'
```
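To keep algorithms consistent with this encoding, a small helper could translate the attributes back into an edge type. The function below is a hypothetical sketch (not part of pgmpy or networkx), assuming the convention from the example above, where a 'o' attribute on the edge (v, u) marks a circle at the u endpoint:

```python
def edge_type(G, u, v):
    """Hypothetical helper: classify the edge between u and v under the
    attribute convention sketched above."""
    forward, backward = G.has_edge(u, v), G.has_edge(v, u)
    if not (forward or backward):
        return None          # no edge between u and v
    if forward and not backward:
        return '->'          # directed: u -> v
    if backward and not forward:
        return '<-'          # directed: u <- v
    # Both directions present: inspect the circle marks.
    u_mark = G[v][u].get('type')   # mark at the u endpoint
    v_mark = G[u][v].get('type')   # mark at the v endpoint
    if u_mark == 'o' and v_mark == 'o':
        return 'o-o'         # non-directed
    if u_mark == 'o':
        return 'o->'         # partially directed: u o-> v
    if v_mark == 'o':
        return '<-o'         # partially directed: u <-o v
    return '<->'             # bidirected


print(edge_type(G, 'X', 'Y'))  # 'o-o' for the last graph built above
```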
We would also want graph traversal algorithms to be consistent with this representation. Ideally, by the end of the project we should have:
- Two base classes MAG and PAG for representing these graphs, similar to DAG and CPDAG: https://github.com/pgmpy/pgmpy/blob/dev/pgmpy/base/DAG.py
- A way for graph traversal algorithms, and the algorithms that depend on them (such as d-separation), to work as expected with this edge representation (a rough sketch of a possible base class is given after this list).
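As a very rough sketch of what such a base class might look like (all names below are hypothetical, not existing pgmpy API), one option is to store the skeleton as an undirected graph and keep an explicit endpoint mark per node on each edge:

```python
import networkx as nx


class PAG(nx.Graph):
    """Hypothetical sketch of a PAG base class (names are assumptions).

    The skeleton is stored as an undirected graph; each edge carries one
    endpoint mark per node: '>' for an arrowhead, '-' for a tail, and
    'o' for a circle.
    """

    _MARKS = {'>', '-', 'o'}

    def add_edge(self, u, v, u_mark='o', v_mark='o'):
        # Default to an o-o (non-directed) edge when no marks are given.
        if u_mark not in self._MARKS or v_mark not in self._MARKS:
            raise ValueError(f"endpoint marks must be one of {self._MARKS}")
        super().add_edge(u, v, marks={u: u_mark, v: v_mark})

    def edge_type(self, u, v):
        """Return the edge between u and v as a string, e.g. 'o->' or '<->'."""
        marks = self[u][v]['marks']
        left = {'>': '<', '-': '-', 'o': 'o'}[marks[u]]  # symbol drawn at the u end
        right = marks[v]                                 # symbol drawn at the v end
        return f"{left}-{right}"

    def has_directed_edge(self, u, v):
        """True iff the edge between u and v is u --> v (tail at u, arrowhead at v)."""
        if not self.has_edge(u, v):
            return False
        marks = self[u][v]['marks']
        return marks[u] == '-' and marks[v] == '>'


# Usage: X o-> Y and Y <-> Z, as FCI might output them.
G = PAG()
G.add_edge('X', 'Y', u_mark='o', v_mark='>')
G.add_edge('Y', 'Z', u_mark='>', v_mark='>')
print(G.edge_type('X', 'Y'))  # o->
print(G.edge_type('Y', 'Z'))  # <->
```

Storing one mark per endpoint avoids the ambiguity of encoding a single symmetric edge as two DiGraph edges and gives traversal code a single place to look up arrowheads, tails, and circles.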
2. Integrate Linear Gaussian Bayesian Networks and Structural Equation Models
Theoretically, there is a significant overlap between Linear Gaussian Bayesian Networks (BNs) and Structural Equation Models (SEMs), with two main differences:
- SEMs include explicit error terms.
- SEMs permit error correlation.
Given these similarities, Linear Gaussian BNs can be viewed as a special case of SEMs. Currently, pgmpy has separate implementations for Linear Gaussian BNs and SEMs. This project aims to integrate the two models and make them interoperable, while sharing some of the current algorithm implementations. The exact changes are still to be decided.
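To make the overlap concrete, here is a minimal numerical sketch (plain NumPy rather than pgmpy's API) of one model written in SEM form; a linear Gaussian BN corresponds to the special case where the error covariance is diagonal:

```python
import numpy as np

# Linear Gaussian BN:  X ~ N(0, 1),  Y = 2*X + e_Y  with  e_Y ~ N(0, 1).
# The same model in SEM form:  z = B z + e,  Cov(e) = Psi.
# For a linear Gaussian BN, Psi is diagonal (no error correlation).
B = np.array([[0.0, 0.0],    # row i: coefficients of the parents of variable i
              [2.0, 0.0]])   # Y = 2*X
Psi = np.diag([1.0, 1.0])    # independent error terms

# Implied joint covariance of (X, Y):  Sigma = (I - B)^-1 Psi (I - B)^-T
I = np.eye(2)
Sigma = np.linalg.inv(I - B) @ Psi @ np.linalg.inv(I - B).T
print(Sigma)
# [[1. 2.]
#  [2. 5.]]
# Matches the BN factorisation: Var(X) = 1, Cov(X, Y) = 2, Var(Y) = 2^2 * 1 + 1 = 5.

# Allowing non-zero off-diagonal entries in Psi (correlated errors) keeps the
# model a valid SEM but takes it outside the linear Gaussian BN family.
```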