-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] CI tests for constraint-based structure learning #16
Conversation
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Looking at this now. |
Makefile
Outdated
@echo "Building documentation" | ||
make -C doc/ clean | ||
make -C doc/ html-noplot | ||
cd doc/ && make view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newline.
Does the makefile need to be commited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. This is just for my dev sake rn. I'm still wrangling upstream dependencies. I.e. pywhy-graphs. I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually jk I think it would be nice to have this. It makes the docs building a bit easier. If someone knows how to add a multi-step procedure to the poetry command build_docs
, then maybe not tho...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's something scripted like this I would generally model this under a bash file and invoke the bash file via a poetry script.
I'll get these CI green and then work on the PC part. This turned out to be a lot of work since we're trying to keep integration with networkx proposal, and pywhy-graphs. But I think it'll be worth it once we see the PC algo implemented with these in mind :). |
import networkx as nx | ||
|
||
|
||
class GraphProtocol(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. What is the rationale for using duck typing instead of subclassing? Is it normal to use protocols with methods that all need to be overridden?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Protocols just are Python's way of defining an abstract interface. So in the code, note that pywhy-graphs
is never imported. And does not need to be. This is in line w/ discussions we had awhile back where the Dev team requested that we don't explicitly have to rely on pywhy's implementation of causal graphs.
This GraphProtocol
class then allows a user to define their own graph class, as long as it adheres to the API, then it "should" still work. In addition, this provides mypy
type checking w/o having to import pywhy-graphs
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to add an explicit example:
if someone from say ananke
wants to make their graph implementation work with the implementation of the algorithms here that rely on a GraphProtocol, then they just need to modify their API to have the functions specified.
I personally am okay w/ being opinionated and requiring users to use pywhy-graphs directly and providing a transformation module for converting graphs from common formats, but I think we can always swap that easily using ctrl+f+replace
|
||
@abstractmethod | ||
def test( | ||
self, df: pd.DataFrame, x_var: Any, y_var: Any, z_covariates: Any = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are x_var
, y_var
and z_covariates
of the Any
type? Categoricals, integers, and floats right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can be anything that a pandas dataframe column supports. I wasn't 100% sure if Categoricals, integers, and floats
are sufficient. Are they?
|
||
def test( | ||
self, df: pd.DataFrame, x_var: Any, y_var: Any, z_covariates: Any = None | ||
) -> Tuple[float, float]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to create types for the test stat and the p-value as this will be used in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't they just floats tho? With pvalue being constrained between 0 and 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. My only issues are with some of the type hints.
@adam2392 can you comment here on why some flavors are left out? For example:
- mutual information and asymptotic chi-square test for discrete variables, perhaps with a James-Stein estimator
- Monte Carlo permutation testing and sequential Monte Carlo permutation testing
- Jonckheere-Terpstra test for ordinal variables
Will address the code comments tonight. Re the additional CI tests: just never implemented them myself :p. It would be nice to get a community contribution for those. Or one of us can tackle it when there's time? |
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Codecov Report
@@ Coverage Diff @@
## main #16 +/- ##
===========================================
+ Coverage 60.00% 76.90% +16.90%
===========================================
Files 1 7 +6
Lines 5 342 +337
Branches 0 55 +55
===========================================
+ Hits 3 263 +260
- Misses 2 63 +61
- Partials 0 16 +16
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Gonna merge this for now to move forward w/ the next step. Left some comments open w/ my responses tho. |
Signed-off-by: Adam Li adam2392@gmail.com
Follow up to: #14 to lead to #11
Changes proposed in this pull request:
dodiscover.ci
in a consistent API.pylint
from the style checking as it is too stringent.Needs to update the networkx algorithms that will work on mixedEdgeGraphs.
Before submitting
section of the
CONTRIBUTING
docs.Writing docstrings section of the
CONTRIBUTING
docs.After submitting