scikit_na.altair
- scikit_na.altair.plot_corr(data: DataFrame, columns: Sequence[str] | None = None, mask_diag: bool = True, annot_color: str = 'black', round_sgn: int = 2, font_size: int = 14, opacity: float = 0.5, corr_kws: dict = None, chart_kws: dict = None, x_kws: dict = None, y_kws: dict = None, color_kws: dict = None, text_kws: dict = None) → Chart
Correlation heatmap.
- Parameters:
data (DataFrame) – Input data.
columns (Optional[Sequence[str]]) – Column names.
mask_diag (bool = True) – Mask the diagonal of the heatmap.
corr_kws (dict, optional) – Keyword arguments passed to the pandas.DataFrame.corr() method.
heat_kws (dict, optional) – Keyword arguments passed to the seaborn.heatmap() method.
- Returns:
Altair Chart object.
- Return type:
altair.Chart
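The roles of corr_kws, mask_diag and round_sgn can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: it assumes the correlation is computed on NA indicators (the text above only says "Correlation heatmap"), and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, 2.0, np.nan, 4.0],
    "c": [1.0, np.nan, 3.0, 4.0],
})

# Assumption: correlations are taken between NA indicators (1 = missing).
corr_kws = {"method": "pearson"}          # forwarded to DataFrame.corr()
corr = df.isna().astype(int).corr(**corr_kws)

# mask_diag=True: hide the trivial self-correlations on the diagonal.
diag = np.eye(len(corr), dtype=bool)
corr_masked = corr.mask(diag)

# round_sgn=2: round the annotation values to two decimals.
corr_masked = corr_masked.round(2)
print(corr_masked)
```

Here columns a and b are never missing together, so their NA indicators are perfectly anti-correlated.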
- scikit_na.altair.plot_hist(data: DataFrame, col: str, col_na: str, na_label: str = None, na_replace: dict = None, heuristic: bool = True, thres_uniq: int = 20, step: bool = False, norm: bool = True, font_size: int = 14, xlabel: str = None, ylabel: str = 'Frequency', chart_kws: dict = None, markarea_kws: dict = None, markbar_kws: dict = None, joinagg_kws: dict = None, calc_kws: dict = None, x_kws: dict = None, y_kws: dict = None, color_kws: dict = None) → Chart
Histogram plot.
Plots a histogram of values in a column col grouped by NA/non-NA values in column col_na.
- Parameters:
data (DataFrame) – Input data.
col (str) – Column to display distribution of values.
col_na (str) – Column to group values by.
na_label (str, optional) – Legend title.
na_replace (dict, optional) – Dictionary to replace values returned by the pandas.Series.isna() method.
step (bool, optional) – Draw step plot.
norm (bool, optional) – Normalize values in groups.
xlabel (str, optional) – X axis label.
ylabel (str, optional) – Y axis label.
chart_kws (dict, optional) – Keyword arguments passed to altair.Chart().
markarea_kws (dict, optional) – Keyword arguments passed to altair.Chart.mark_area().
markbar_kws (dict, optional) – Keyword arguments passed to altair.Chart.mark_bar().
joinagg_kws (dict, optional) – Keyword arguments passed to altair.Chart.transform_joinaggregate().
calc_kws (dict, optional) – Keyword arguments passed to altair.Chart.transform_calculate().
x_kws (dict, optional) – Keyword arguments passed to altair.X().
y_kws (dict, optional) – Keyword arguments passed to altair.Y().
color_kws (dict, optional) – Keyword arguments passed to altair.Color().
- Returns:
Altair Chart object.
- Return type:
altair.Chart
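The grouping that plot_hist and plot_kde describe (values of col split by NA/non-NA status of col_na, relabelled through na_replace, optionally normalized within each group) can be sketched with pandas alone. The column names and data below are hypothetical, and the code only illustrates the idea rather than reproducing the library's Altair transforms.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "age" plays the role of col, "income" of col_na.
df = pd.DataFrame({
    "age": [20, 20, 30, 30, 30, 40],
    "income": [1.0, np.nan, 2.0, np.nan, np.nan, 3.0],
})

# na_replace maps the booleans returned by Series.isna() to legend labels.
na_replace = {True: "NA", False: "Filled"}
group = df["income"].isna().map(na_replace).rename("income_is_na")

# norm=True: frequencies are normalized within each group (rows sum to 1).
freq = pd.crosstab(group, df["age"], normalize="index")
print(freq)
```

Normalizing within groups makes the two distributions comparable even when one group is much smaller than the other.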
- scikit_na.altair.plot_kde(data: DataFrame, col: str, col_na: str, na_label: str = None, na_replace: dict = None, font_size: int = 14, xlabel: str = None, ylabel: str = 'Density', chart_kws: dict = None, markarea_kws: dict = None, density_kws: dict = None, x_kws: dict = None, y_kws: dict = None, color_kws: dict = None) → Chart
Density plot.
Plots distribution of values in a column col grouped by NA/non-NA values in column col_na.
- Parameters:
data (DataFrame) – Input data.
col (str) – Column to display distribution of values.
col_na (str) – Column to group values by.
na_label (str, optional) – Legend title.
na_replace (dict, optional) – Dictionary to replace values returned by the pandas.Series.isna() method.
xlabel (str, optional) – X axis label.
ylabel (str, optional) – Y axis label.
chart_kws (dict, optional) – Keyword arguments passed to altair.Chart().
markarea_kws (dict, optional) – Keyword arguments passed to altair.Chart.mark_area().
density_kws (dict, optional) – Keyword arguments passed to altair.Chart.transform_density().
x_kws (dict, optional) – Keyword arguments passed to altair.X().
y_kws (dict, optional) – Keyword arguments passed to altair.Y().
color_kws (dict, optional) – Keyword arguments passed to altair.Color().
- Returns:
Altair Chart object.
- Return type:
altair.Chart
- scikit_na.altair.plot_heatmap(data: DataFrame, columns: Sequence[str] | None = None, names: list = None, sort: bool = True, droppable: bool = True, font_size: int = 14, xlabel: str = 'Columns', ylabel: str = 'Rows', zlabel: str = 'Values', chart_kws: dict = None, rect_kws: dict = None, x_kws: dict = None, y_kws: dict = None, color_kws: dict = None) → Chart
Heatmap plot for NA/non-NA values.
By default, it also indicates values that are to be dropped by the pandas.DataFrame.dropna() method.
- Parameters:
data (DataFrame) – Input data.
columns (Optional[Sequence[str]], optional) – Columns that are to be displayed on a plot.
names (list, optional) – Value labels passed as a list. The first element corresponds to non-missing values, the second one to NA values, and the last one to droppable values, i.e. values to be dropped by pandas.DataFrame.dropna().
sort (bool, optional) – Sort values as NA/non-NA.
droppable (bool, optional) – Show values to be dropped by the pandas.DataFrame.dropna() method.
xlabel (str, optional) – X axis label.
ylabel (str, optional) – Y axis label.
zlabel (str, optional) – Groups label (shown as a legend title).
chart_kws (dict, optional) – Keyword arguments passed to the altair.Chart() class.
rect_kws (dict, optional) – Keyword arguments passed to the altair.Chart.mark_rect() method.
x_kws (dict, optional) – Keyword arguments passed to the altair.X() class.
y_kws (dict, optional) – Keyword arguments passed to the altair.Y() class.
color_kws (dict, optional) – Keyword arguments passed to the altair.Color() class.
- Returns:
Altair Chart object.
- Return type:
altair.Chart
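The three categories behind the names parameter of plot_heatmap (non-missing, NA, droppable) can be made concrete with a small sketch. The helper label_cells and the default labels are hypothetical and chosen for illustration; the sketch assumes that "droppable" means a non-missing value sitting in a row that pandas.DataFrame.dropna() would remove.

```python
import numpy as np
import pandas as pd

def label_cells(data, names=("Filled", "NA", "Droppable")):
    """Label every cell as non-missing, NA, or droppable (hypothetical helper)."""
    is_na = data.isna().to_numpy()
    # A row is removed by dropna() if it contains at least one NA.
    row_has_na = is_na.any(axis=1, keepdims=True)
    labels = np.where(
        is_na,
        names[1],                                      # the cell itself is NA
        np.where(row_has_na, names[2], names[0]),      # filled, but row is dropped
    )
    return pd.DataFrame(labels, index=data.index, columns=data.columns)

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})
labels = label_cells(df)
print(labels)
```

In the second row, the value in b is present but would be lost along with the row, which is the kind of cell that droppable=True is described as highlighting.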
- scikit_na.altair.plot_scatter(data: DataFrame, x_col: str, y_col: str, col_na: str, na_label: str = None, na_replace: dict = None, font_size: int = 14, xlabel: str = None, ylabel: str = None, circle_kws: dict = None, color_kws: dict = None, x_kws: dict = None, y_kws: dict = None)
Scatter plot.
- Parameters:
data (DataFrame) – Input data.
x_col (str) – Column name corresponding to X axis.
y_col (str) – Column name corresponding to Y axis.
col_na (str) – Column whose NA/non-NA values are used to group the points.
na_label (str, optional) – Label for NA values in legend.
na_replace (dict, optional) – NA replacement mapping, by default {True: ‘NA’, False: ‘Filled’}.
font_size (int, optional) – Font size for plotting, by default 14.
xlabel (str, optional) – X axis label.
ylabel (str, optional) – Y axis label.
circle_kws (dict, optional) – Keyword arguments passed to altair.Chart.mark_circle().
color_kws (dict, optional) – Keyword arguments passed to altair.Color().
x_kws (dict, optional) – Keyword arguments passed to altair.X().
y_kws (dict, optional) – Keyword arguments passed to altair.Y().
- Returns:
Scatter plot.
- Return type:
altair.Chart
- scikit_na.altair.plot_stairs(data: DataFrame, columns: Sequence[str] | None = None, xlabel: str = 'Columns', ylabel: str = 'Instances', tooltip_label: str = 'Size difference', dataset_label: str = '(Whole dataset)', font_size: int = 14, area_kws: dict = None, chart_kws: dict = None, x_kws: dict = None, y_kws: dict = None)
Stairs plot.
Plots changes in dataset size (number of rows/instances) after applying pandas.DataFrame.dropna() to each column cumulatively. Columns are sorted by maximum influence on dataset size.
- Parameters:
data (DataFrame) – Input data.
columns (Optional[Sequence[str]], optional) – Columns that are to be displayed on a plot.
xlabel (str, optional) – X axis label.
ylabel (str, optional) – Y axis label.
tooltip_label (str, optional) – Label for differences in dataset size that is displayed on a tooltip.
dataset_label (str, optional) – Label for the whole dataset (before dropping any NAs).
area_kws (dict, optional) – Keyword arguments passed to the altair.Chart.mark_area() method.
chart_kws (dict, optional) – Keyword arguments passed to the altair.Chart() class.
x_kws (dict, optional) – Keyword arguments passed to the altair.X() class.
y_kws (dict, optional) – Keyword arguments passed to the altair.Y() class.
- Returns:
Chart object.
- Return type:
altair.Chart
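The quantity shown by the stairs plot, the dataset size after dropping NAs column by column, can be approximated with a short pandas sketch. The helper stairs_sizes is hypothetical and uses a greedy ordering (at each step, the column with the most remaining NAs) as one plausible reading of "sorted by maximum influence"; the library's exact ordering and stopping rule may differ.

```python
import numpy as np
import pandas as pd

def stairs_sizes(data, columns=None, dataset_label="(Whole dataset)"):
    """Row counts after cumulatively dropping NAs per column (hypothetical helper)."""
    cols = list(data.columns if columns is None else columns)
    remaining = data[cols]
    steps = [(dataset_label, len(remaining))]
    while cols:
        na_counts = remaining[cols].isna().sum()
        worst = na_counts.idxmax()           # column that removes the most rows now
        if na_counts[worst] == 0:            # nothing left to drop
            break
        remaining = remaining.dropna(subset=[worst])
        steps.append((worst, len(remaining)))
        cols.remove(worst)
    return pd.DataFrame(steps, columns=["Columns", "Instances"])

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan, 5.0],
    "b": [np.nan, 2.0, 3.0, 4.0, 5.0],
    "c": [1.0, 2.0, 3.0, 4.0, 5.0],
})
sizes = stairs_sizes(df)
print(sizes)
```

Each row of the result corresponds to one step of the stairs: the whole dataset first, then the size left after each column's NAs have been dropped.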