Add "target" variable to ProfileReport and then add more graphs #73

eamag · 2017-11-27T15:58:17Z

seaborn's countplot (https://seaborn.pydata.org/generated/seaborn.countplot.html) is very useful if the target is categorical, so it'd be nice to plot it in that case.
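
To make the request concrete, here is a minimal sketch of a countplot split by a categorical target. The sample rows, the column names (`sex`, `survived`) and both helper functions are made up for illustration; none of them are part of `ProfileReport`. `target_counts` computes the bar heights with the standard library only, and `plot_target_countplot` passes the same rows to `seaborn.countplot` with `hue` set to the target (this assumes pandas and seaborn are installed).

```python
from collections import Counter

# Hypothetical toy data; the values are invented for this example.
rows = [
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 1},
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 0},
]


def target_counts(rows, column, target):
    """Count rows per (column value, target value) pair.

    These counts are the bar heights that a countplot of `column`
    with hue=`target` would draw.
    """
    return Counter((row[column], row[target]) for row in rows)


def plot_target_countplot(rows, column, target):
    """Draw one countplot of `column`, with bars split by `target`."""
    # Imported lazily so the counting helper works without plotting libraries.
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame(rows)
    return sns.countplot(data=df, x=column, hue=target)


counts = target_counts(rows, "sex", "survived")
```

A plain per-column histogram would only show three rows for each value of `sex`; splitting by the target shows how those rows divide between the target classes.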

Aylr · 2017-12-15T00:38:49Z

How is this better than what's already there? If the column is categorical you'll get that histogram.

eamag · 2017-12-17T09:27:00Z

Countplot and histogram are different things; we may see different distributions in a countplot.

conradoqg · 2018-01-05T02:13:30Z

I think what @eamag meant is that with a countplot you can compare side-by-side the distribution of different categorical fields, where the Y is the count. See https://seaborn.pydata.org/tutorial/categorical.html

One way to implement this feature is to generate a countplot for each categorical field against every other one. I'm sure this will hurt performance and won't give the user meaningful value.

In an exploratory process, usually, you need to choose rationally which categorical fields you want to compare (like the Titanic example in the above link).

Creating plots by comparing all categorical fields, like A vs B, B vs C, C vs A (2 fields) or A vs B vs C (3 fields), will create a combinatorial explosion of plots: the number of combinations grows rapidly with the number of fields.

In my opinion, we shouldn't implement this feature.

Best
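
The growth described above can be put in numbers with a quick sketch. The function names are hypothetical, and `plots_against_target` assumes the proposal is read as "compare each categorical field with one user-chosen target", which is an interpretation of the issue title, not a confirmed design.

```python
from math import comb


def plots_for_k_way(n_fields, k):
    """Number of plots when every k-field combination gets its own plot."""
    return comb(n_fields, k)


def plots_against_target(n_fields):
    """Number of plots when each field is compared with one fixed target.

    Assumes the target is itself one of the n_fields categorical columns.
    """
    return max(n_fields - 1, 0)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, plots_for_k_way(n, 2), plots_for_k_way(n, 3), plots_against_target(n))
```

For 10 categorical fields this gives 45 pairwise plots and 120 three-way plots, but only 9 plots against a fixed target.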

romainx · 2018-01-06T15:40:49Z

Hello,

I'll keep it open; it could be studied.

bensdm · 2019-07-24T21:38:15Z

I agree this would be a great feature

> I think what @eamag meant is that with a countplot you can compare side-by-side the distribution of different categorical fields, where the Y is the count. See https://seaborn.pydata.org/tutorial/categorical.html
>
> One way to implement this feature is to generate a countplot on each categorical field against every other one. I'm sure this will hurt the performance and won't give meaningful value to the user.
>
> In an exploratory process, usually, you need to choose rationally which categorical fields you want to compare (like the Titanic example in the above link).
>
> Creating plots by comparing all categorical fields, like A vs B, B vs C, C vs A (2-fields) or A vs B vs C (3-fields) will create an exponential amount of plots (because it is a combinatory analysis).
>
> In my opinion, we shouldn't implement this feature.
>
> Best

I do not understand your point, is it really heavier to plot

instead of plotting ?

github-actions · 2020-02-16T00:01:22Z

Stale issue

romainx added the enhancement label Jan 6, 2018

sbrugman added feature request 💬 Requests for new features and removed enhancement labels May 29, 2019

github-actions bot added the no-issue-activity label Feb 16, 2020

github-actions bot closed this as completed Feb 23, 2020

sbrugman mentioned this issue Jul 11, 2020

Support Label Columns for Feature Investigation #513

Closed
