Stat module for DataFrame, providing basic statistical metrics for numeric columns.
df
DataFrame An instance of DataFrame.
Compute the sum of a numeric column.
columnName
String The column to evaluate, containing Numbers.
df.stat.sum('column1')
Returns Number The sum of the column.
Compute the maximum value of a numeric column.
columnName
String The column to evaluate, containing Numbers.
df.stat.max('column1')
Returns Number The maximum value of the column.
Compute the minimum value of a numeric column.
columnName
String The column to evaluate, containing Numbers.
df.stat.min('column1')
Returns Number The minimum value of the column.
Compute the mean value of a numeric column.
columnName
String The column to evaluate, containing Numbers.
df.stat.mean('column1')
Returns Number The mean value of the column.
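As an illustration of what a column mean involves, here is a minimal sketch in plain JavaScript. It does not use the DataFrame API: `meanOfColumn` is a hypothetical helper, and the choice to skip non-numeric entries (and to divide by the count of numeric entries only) is an assumption, not confirmed behavior of the Stat module.

```javascript
// Hypothetical helper, not part of the DataFrame API.
// Averages a plain array, ignoring entries that are not finite numbers.
function meanOfColumn(values) {
  const numeric = values.filter(
    (v) => typeof v === 'number' && Number.isFinite(v)
  );
  // Assumption: an empty or fully non-numeric column has no defined mean.
  if (numeric.length === 0) return NaN;
  const total = numeric.reduce((acc, v) => acc + v, 0);
  return total / numeric.length;
}

const column1 = [3, 6, 9, 'n/a', null];
console.log(meanOfColumn(column1)); // 6
```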
Compute the mean value of a numeric column. Alias of mean.
columnName
String The column to evaluate, containing Numbers.
df.stat.average('column1')
Returns Number The mean value of the column.
Compute the variance of a numeric column.
columnName
String The column to evaluate, containing Numbers.
population
Boolean Population mode. If true, provide the population variance, not the sample one. (optional, default false)
df.stat.var('column1')
Returns Number The variance of the column.
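The difference between the two modes is only the divisor: the sample variance divides the sum of squared deviations by n - 1, the population variance by n. The sketch below shows this on a plain array. `columnVariance` is a hypothetical helper written for illustration and is not the module's implementation.

```javascript
// Hypothetical helper, not part of the DataFrame API.
// population = false -> sample variance (divide by n - 1)
// population = true  -> population variance (divide by n)
function columnVariance(values, population = false) {
  const n = values.length;
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const squaredDeviations = values.reduce(
    (acc, v) => acc + (v - mean) ** 2,
    0
  );
  return squaredDeviations / (population ? n : n - 1);
}

const data = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(columnVariance(data, true)); // 4
console.log(columnVariance(data));       // 32 / 7, roughly 4.57
```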
Compute the standard deviation of a numeric column.
columnName
String The column to evaluate, containing Numbers.
population
Boolean Population mode. If true, provide the population standard deviation, not the sample one. (optional, default false)
df.stat.sd('column1')
Returns Number The standard deviation of the column.
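The standard deviation is the square root of the variance, so the population flag has the same meaning here as for the variance. A minimal sketch on a plain array follows. `columnSd` is a hypothetical helper for illustration, not the module's own code.

```javascript
// Hypothetical helper, not part of the DataFrame API.
// Standard deviation as the square root of the variance.
function columnSd(values, population = false) {
  const n = values.length;
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const squaredDeviations = values.reduce(
    (acc, v) => acc + (v - mean) ** 2,
    0
  );
  // Divide by n for the population, n - 1 for a sample.
  const variance = squaredDeviations / (population ? n : n - 1);
  return Math.sqrt(variance);
}

const sample = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(columnSd(sample, true)); // 2
console.log(columnSd(sample));       // roughly 2.14
```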
Compute all the stats available with the Stat module on a numeric column.
columnName
String The column to evaluate, containing Numbers.
df.stat.stats('column1')
Returns Object A dictionary containing all available statistical metrics.
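To give a sense of what such a dictionary might look like, the sketch below gathers the metrics described in this module into one object for a plain array. `summarize` and the key names (`sum`, `mean`, `min`, `max`, `var`, `sd`) are assumptions chosen for illustration; the keys actually returned by `df.stat.stats` may differ.

```javascript
// Hypothetical helper, not part of the DataFrame API.
// Key names are illustrative assumptions, not the module's documented keys.
function summarize(values) {
  const n = values.length;
  const sum = values.reduce((acc, v) => acc + v, 0);
  const mean = sum / n;
  // Sample variance (divide by n - 1), matching the default population = false.
  const variance =
    values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / (n - 1);
  return {
    sum,
    mean,
    min: Math.min(...values),
    max: Math.max(...values),
    var: variance,
    sd: Math.sqrt(variance),
  };
}

const metrics = summarize([2, 4, 4, 4, 5, 5, 7, 9]);
console.log(metrics.sum, metrics.mean, metrics.min, metrics.max); // 40 5 2 9
```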