Add "target" variable to ProfileReport and then add more graphs #73

Closed
eamag opened this issue Nov 27, 2017 · 6 comments
Labels
feature request 💬 Requests for new features

Comments

@eamag

eamag commented Nov 27, 2017

https://seaborn.pydata.org/generated/seaborn.countplot.html is very useful when the target is categorical, so it'd be nice to plot it in that case.

@Aylr
Contributor

Aylr commented Dec 15, 2017

How is this better than what's already there? If the column is categorical you'll get that histogram.

@eamag
Author

eamag commented Dec 17, 2017

countplot and histogram are different things; in a countplot we may see different distributions, e.g. one per target class.
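
To make the difference concrete, here is a minimal sketch using only the standard library. The rows are hypothetical toy data, not taken from this issue. It computes the numbers each plot would draw: a plain per-column histogram shows one count per category, while a countplot split by the target (in seaborn, something like `countplot(data=df, x=..., hue=...)`) shows one count per (category, target) pair.

```python
from collections import Counter

# Hypothetical toy rows: (category value, target value). Illustration only.
rows = [
    ("S", 0), ("S", 0), ("S", 1),
    ("C", 1), ("C", 1), ("C", 0),
    ("Q", 0),
]

# What a plain per-column histogram shows: one count per category.
overall = Counter(cat for cat, _ in rows)

# What a countplot split by the target shows: one count per (category, target) pair.
by_target = Counter(rows)

targets = sorted({t for _, t in rows})
for cat in sorted(overall):
    split = {t: by_target[(cat, t)] for t in targets}
    print(cat, overall[cat], split)
```

In this toy data "S" and "C" have the same overall count, so the plain histogram cannot tell them apart, but their split by target differs.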

@conradoqg
Contributor

I think what @eamag meant is that with a countplot you can compare side-by-side the distribution of different categorical fields, where the Y is the count. See https://seaborn.pydata.org/tutorial/categorical.html

One way to implement this feature is to generate a countplot on each categorical field against every other one. I'm sure this will hurt the performance and won't give meaningful value to the user.

In an exploratory process, usually, you need to choose rationally which categorical fields you want to compare (like the Titanic example in the above link).

Creating plots by comparing all categorical fields, like A vs B, B vs C, C vs A (2-fields) or A vs B vs C (3-fields), will create a combinatorially growing number of plots: n fields give n(n-1)/2 pairs and n(n-1)(n-2)/6 triples.
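
As a rough illustration of that growth, the sketch below counts the distinct field groups for a few column counts. The function name `plot_count` is made up for this example. For comparison, it also prints n - 1, the number of plots needed if the user names a single target and every other field is plotted only against it.

```python
from math import comb

def plot_count(n_fields: int, group_size: int) -> int:
    """Number of distinct unordered groups of fields of the given size."""
    return comb(n_fields, group_size)

# Columns printed: fields, pairwise plots, 3-field plots, plots against one target.
for n in (3, 10, 30):
    print(n, plot_count(n, 2), plot_count(n, 3), n - 1)
```

With 30 categorical fields this is 435 pairwise plots and 4060 three-field plots, versus 29 plots against one chosen target.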

In my opinion, we shouldn't implement this feature.

Best

@romainx
Contributor

romainx commented Jan 6, 2018

Hello,

I'll keep it open for now; it could be studied.

sbrugman added the "feature request 💬 Requests for new features" label and removed the "enhancement" label on May 29, 2019
@bensdm

bensdm commented Jul 24, 2019

I agree this would be a great feature

> I think what @eamag meant is that with a countplot you can compare side-by-side the distribution of different categorical fields, where the Y is the count. See https://seaborn.pydata.org/tutorial/categorical.html
>
> One way to implement this feature is to generate a countplot on each categorical field against every other one. I'm sure this will hurt the performance and won't give meaningful value to the user.
>
> In an exploratory process, usually, you need to choose rationally which categorical fields you want to compare (like the Titanic example in the above link).
>
> Creating plots by comparing all categorical fields, like A vs B, B vs C, C vs A (2-fields) or A vs B vs C (3-fields) will create an exponential amount of plots (because it is a combinatory analysis).
>
> In my opinion, we shouldn't implement this feature.
>
> Best

I do not understand your point. Is it really heavier to plot this:

[image]

instead of this?

[image]

@github-actions

Stale issue


6 participants