Master #6

Open
wants to merge 50 commits into
base: master
Conversation

salmoni
Contributor

@salmoni salmoni commented Dec 29, 2016

Hi Evgenii,

Here’s my first attempt at a pull request.

I’ve written routines for:

  • Moment of distribution
  • Coefficient of variation
  • Quantiles (x 9)
  • Skewness & Kurtosis (same file)
  • Standard error

All the best,

Alan

First commit of a routine to calculate the moment of a distribution.
The first commit of routines for both skewness and kurtosis. This needs
unit testing and proper documentation.
This is the first commit of a routine to calculate the standard error
for a sample. This needs testing.
This routine is to calculate the coefficient of variation. This needs
testing.
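As a rough illustration of what the standard error and coefficient of variation commits above compute, here is a minimal sketch. The function names are hypothetical and are not taken from the pull request; the sketch assumes the sample standard deviation (n - 1 denominator) and returns nil where the statistic is undefined.

```swift
import Foundation

// Hypothetical helper: sample standard deviation with an n - 1 denominator.
func sampleStandardDeviation(_ data: [Double]) -> Double? {
  guard data.count > 1 else { return nil }
  let mean = data.reduce(0, +) / Double(data.count)
  let sumOfSquares = data.reduce(0) { $0 + ($1 - mean) * ($1 - mean) }
  return sqrt(sumOfSquares / Double(data.count - 1))
}

// Standard error of the mean: s / sqrt(n).
func standardError(_ data: [Double]) -> Double? {
  guard let sd = sampleStandardDeviation(data) else { return nil }
  return sd / sqrt(Double(data.count))
}

// Coefficient of variation: s / mean, undefined when the mean is zero.
func coefficientOfVariation(_ data: [Double]) -> Double? {
  guard let sd = sampleStandardDeviation(data) else { return nil }
  let mean = data.reduce(0, +) / Double(data.count)
  guard mean != 0 else { return nil }
  return sd / mean
}
```

Both functions return nil for arrays with fewer than two elements, since the sample standard deviation is undefined there.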
This is the first commit of 9 functions to calculate quantiles. All
work was taken from the Hyndman and Fan paper (1996) and a PDF of the
paper can be accessed at
https://www.amherst.edu/media/view/129116/original/Sample+Quantiles.pdf
All routines were sorting in the wrong order and thus producing
incorrect quantiles.

The first three quantile functions required some changes in terms of
how g was defined (to stop compiler warnings) and for correctness.
I’ve added comments about the background of each function. A link is
provided to the relevant page of R documentation which has blatantly
been copied here.

The routines all produce the same results as R on a test set of 50
data points. Further testing is needed to confirm accuracy, but the
results agree closely so far.
I put in a generic function caller which allows users to call any of
the quantile functions using the 3rd parameter (qtype). By default,
quantile 7 is selected, which is the same method as used by R and S.
Quantile 8 is the one recommended by Hyndman and Fan (1996).
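To make the dispatch idea above concrete, here is a minimal sketch of a caller that selects a quantile method through a qtype parameter defaulting to 7. Only method 7 (linear interpolation between order statistics, as in R's default) is implemented; the function names and signatures are hypothetical and do not reproduce the code in this pull request.

```swift
import Foundation

// Hypothetical sketch of quantile method 7 (Hyndman and Fan, 1996).
func quantile7(_ data: [Double], probability p: Double) -> Double? {
  guard !data.isEmpty, p >= 0, p <= 1 else { return nil }
  let sorted = data.sorted()                    // ascending order
  let h = Double(sorted.count - 1) * p          // fractional zero-based position
  let lower = Int(h.rounded(.down))
  let upper = min(lower + 1, sorted.count - 1)  // clamp at the last element
  let g = h - Double(lower)                     // interpolation weight
  return sorted[lower] + g * (sorted[upper] - sorted[lower])
}

// Hypothetical generic caller: qtype picks the method, 7 by default.
func quantile(_ data: [Double], probability p: Double, qtype: Int = 7) -> Double? {
  switch qtype {
  case 7: return quantile7(data, probability: p)
  default: return nil  // methods 1-6, 8 and 9 are omitted from this sketch
  }
}
```

Sorting in ascending order before indexing matters here: with the data sorted the wrong way round, the interpolation returns the quantile for 1 - p instead of p.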
First commit of geometric mean function
First commit of harmonic mean function. Needs documentation to be
added.
First commit of effect sizes functions. There are two main functions
available:

1. effectSizeControl - to be used when a condition is compared against
a control condition
2. effectSize - to be used when two experimental conditions are
compared (i.e., neither is a control condition).
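The commit message above does not say which formulas the two functions use. The sketch below is therefore only one plausible reading, under loudly stated assumptions: the control variant divides the mean difference by the control group's standard deviation (Glass's delta), and the two-condition variant divides by the pooled standard deviation (Cohen's d). All names carry a Sketch suffix to mark them as hypothetical.

```swift
import Foundation

// Hypothetical helpers, not taken from the pull request.
func meanSketch(_ data: [Double]) -> Double {
  return data.reduce(0, +) / Double(data.count)
}

func sampleVarianceSketch(_ data: [Double]) -> Double {
  let m = meanSketch(data)
  return data.reduce(0) { $0 + ($1 - m) * ($1 - m) } / Double(data.count - 1)
}

// ASSUMPTION: Glass's delta, scaling by the control group's standard deviation.
func effectSizeControlSketch(experimental: [Double], control: [Double]) -> Double? {
  guard !experimental.isEmpty, control.count > 1 else { return nil }
  let sd = sqrt(sampleVarianceSketch(control))
  guard sd > 0 else { return nil }
  return (meanSketch(experimental) - meanSketch(control)) / sd
}

// ASSUMPTION: Cohen's d, scaling by the pooled standard deviation.
func effectSizeSketch(_ a: [Double], _ b: [Double]) -> Double? {
  guard a.count > 1, b.count > 1 else { return nil }
  let pooledVariance = (Double(a.count - 1) * sampleVarianceSketch(a)
    + Double(b.count - 1) * sampleVarianceSketch(b)) / Double(a.count + b.count - 2)
  guard pooledVariance > 0 else { return nil }
  return (meanSketch(a) - meanSketch(b)) / sqrt(pooledVariance)
}
```

Under these assumptions the only difference between the two functions is the denominator: the control version treats the control group's spread as the reference, while the symmetric version pools both groups.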
1. Adjusted spacing to 2 spaces (was 4)
2. Adjusted the routine to check the array being sent has content
@evgenyneu
Owner

Hi @salmoni, thanks for the changes. Sorry, I am too slow, still working on the previous functions that you submitted. I have added skewnessA, skewnessB and centralMoment so far in this branch: https://github.com/evgenyneu/SigmaSwiftStatistics/tree/salmoni-master

I will let you know when I finish with the first bunch. Thanks!

@salmoni
Contributor Author

salmoni commented Jan 14, 2017 via email

Support functions have been made ‘internal static’ functions.
A short routine to extract all the unique values that occur in an
array.
The first commit of a routine that extracts the unique values that
occur in an array and returns them along with the frequency that each
value occurs. This is useful in ranking and nonparametric statistics.
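A minimal sketch of the idea described above: collect each distinct value together with the number of times it occurs. The function name, the tuple labels and the choice to return results sorted by value are all assumptions made for illustration, not the pull request's actual API.

```swift
// Hypothetical sketch: unique values with their frequencies, sorted by value.
func frequencies(_ data: [Double]) -> [(value: Double, count: Int)] {
  var counts: [Double: Int] = [:]
  for x in data {
    counts[x, default: 0] += 1  // start at 0 for values not seen before
  }
  return counts
    .sorted { $0.key < $1.key }
    .map { (value: $0.key, count: $0.value) }
}
```

An empty input yields an empty result rather than nil in this sketch; the unique values alone are available as `frequencies(data).map { $0.value }`.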
This routine ranks a vector. Tied ranks can be given the mean, minimum,
maximum, first or last ranks. See the description of ranking for R for
more details.

https://stat.ethz.ch/R-manual/R-devel/library/base/html/rank.html
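The tie-handling options listed above can be sketched as follows. The enum and function names are hypothetical; the behaviour is modelled on the mean, minimum, maximum, first and last options that the commit message names, and the sketch does not handle NaN values.

```swift
// Hypothetical sketch of ranking with configurable tie handling.
enum TieMethod { case mean, minimum, maximum, first, last }

func rank(_ data: [Double], ties: TieMethod = .mean) -> [Double] {
  // Indices ordered by value; equal values keep their original order.
  let order = data.indices.sorted { (data[$0], $0) < (data[$1], $1) }
  var ranks = [Double](repeating: 0, count: data.count)
  var start = 0
  while start < order.count {
    // Extend `end` over the run of values tied with the one at `start`.
    var end = start
    while end + 1 < order.count && data[order[end + 1]] == data[order[start]] {
      end += 1
    }
    // The tied group occupies one-based ranks start + 1 ... end + 1.
    for (offset, index) in order[start...end].enumerated() {
      switch ties {
      case .mean: ranks[index] = Double(start + end + 2) / 2
      case .minimum: ranks[index] = Double(start + 1)
      case .maximum: ranks[index] = Double(end + 1)
      case .first: ranks[index] = Double(start + offset + 1)
      case .last: ranks[index] = Double(end - offset + 1)
      }
    }
    start = end + 1
  }
  return ranks
}
```

With `.first`, tied values receive increasing ranks in order of appearance; `.last` reverses that order within each tied group.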
Unit tests for the quantiles routines. These probably need more work.
Function names were changed so as not to begin with an upper case
letter. Functions were also made public static functions.
A function to calculate the mode of an array. This not only identifies
the maximum value but also returns the indices where this value occurs.
Probably more a glitch or my exploring Github that’s doing this…
Or me - unsure which
Added some comments to explain why these variables are defined
Changed unit test names to make them more descriptive of what types of
things they test
@evgenyneu
Owner

evgenyneu commented Jan 22, 2017

Hi @salmoni. I have finished working on the first batch of functions that you submitted, pushed to master and released a new version. Please let me know if you find any typos or other problems with those functions.

  • Moment of distribution
  • Coefficient of variation
  • Quantiles (x 9)
  • Skewness
  • Kurtosis
  • Standard error

I have not looked at the other commits that you pushed to this branch yet, starting with "generic function caller", "geometric mean" etc. Since I made a lot of changes to your code, could you please create a new pull request on top of master with those additions? It would be easier if you could create several pull requests: a separate pull request for each new function, if that's possible. This way it will be faster for me to release the functions, one by one.

Thank you so much for your contribution! I have added your name to the readme if you don't mind.

Unit test code should now be improved. Proper failures are raised when
the tests fail, and the code is clearer than before.
The first commit of some unit tests for using the uniqueValues
function.
Sloppy coding first time around. I moved the declarations to a place
where they are not declared if an optional ‘nil’ value is returned.
The first commit for some unit tests for the frequencies function.

Included are:
* Empty array
* Array with a single (negative) element
* Array with one value multiple times
* All positive elements
* All negative elements
* Both positive and negative elements

There are many other use cases I’ve not thought of: Contributions are
welcome!
Removed all ‘var’ declarations and replaced with ‘let’ reducing code
length and increasing clarity.
This is the first commit of unit tests for the moment function.
Currently, a single data set is analysed for moments 0 to 4 inclusive,
alongside tests for an empty array and an array with a single value.
Reference results were obtained using SciPy (stats.moment).
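For context on what these moment tests check, here is a minimal sketch of a central moment in the sense used by SciPy's stats.moment: the mean of the deviations from the sample mean raised to the given order. The function name and signature are hypothetical and may differ from the routine in this pull request.

```swift
import Foundation

// Hypothetical sketch: k-th central moment, (1/n) * sum((x - mean)^k).
func centralMoment(_ data: [Double], order: Int) -> Double? {
  guard !data.isEmpty, order >= 0 else { return nil }
  let n = Double(data.count)
  let mean = data.reduce(0, +) / n
  let total = data.reduce(0) { $0 + pow($1 - mean, Double(order)) }
  return total / n
}
```

By this definition the moment of order 0 is always 1 and the moment of order 1 is always 0, which makes both convenient sanity checks in a unit test.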
This is the first commit of the unit tests for the skewness and
kurtosis functions.

The first two tests use a normal array for both functions. The next two
test the functions with an empty array (do they return ‘nil’
correctly?), and the last two analyse an array with a single element.

They probably need extending with other data sets.
This is the first commit of the unit tests for the geometric and
harmonic means. These tests only test using a fairly normal array of
doubles, an empty array, and an array with a single element.

scipy.stats.gmean and scipy.stats.hmean were used for reference
results.
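As a reference point for the cases these tests cover, here is a minimal sketch of the two means. The names are hypothetical, and the sketch assumes that both functions return nil for an empty array or for any non-positive value; the pull request's routines may handle those inputs differently.

```swift
import Foundation

// Hypothetical sketch: geometric mean computed through logarithms.
func geometricMean(_ data: [Double]) -> Double? {
  guard !data.isEmpty, data.allSatisfy({ $0 > 0 }) else { return nil }
  let logSum = data.reduce(0) { $0 + log($1) }
  return exp(logSum / Double(data.count))
}

// Hypothetical sketch: harmonic mean, n divided by the sum of reciprocals.
func harmonicMean(_ data: [Double]) -> Double? {
  guard !data.isEmpty, data.allSatisfy({ $0 > 0 }) else { return nil }
  let reciprocalSum = data.reduce(0) { $0 + 1 / $1 }
  return Double(data.count) / reciprocalSum
}
```

Summing logarithms rather than multiplying the values directly avoids overflow on long arrays, at the cost of a small rounding error, so tests should compare against reference values with a tolerance.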
2 participants