Is there a way to represent MISSING VALUE in NimData? #27

Open
oskca opened this issue Jun 3, 2018 · 5 comments

Comments

@oskca
Contributor

oskca commented Jun 3, 2018

A missing float can be represented by NaN, but how can int and string values be marked as MISSING? Using a space or zero is probably not a good idea. Is there a general way to express a MISSING VALUE in NimData, as in Pandas?

@bluenote10
Owner

Strings should be possible via nil.

In general, you can always go for Option[T] if need be. Without wrapping, there is currently no way to do this for integer types. Note that Pandas does not support this either: it silently converts a column from ints to floats as soon as there are missing values. I was considering introducing types like OptInt which encode the missing information as low(int) and provide convenience functions like x.isMissing.
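The two options mentioned above can be sketched as follows. This is only an illustration using Nim's standard `options` module. `OptInt`, `missingInt`, `toOptInt`, `isMissing` and `getOr` are hypothetical names for the idea described in this comment and are not part of NimData.

```nim
import options

# Option[T] route: works for any type, at the cost of a wrapper per value.
let ages = @[some(31), none(int), some(27)]

var total = 0
var known = 0
for a in ages:
  if a.isSome:          # skip missing entries
    total += a.get
    inc known

# Sentinel route (hypothetical, not a NimData type):
# reserve low(int) to mean "missing" and hide it behind a distinct type.
type OptInt = distinct int

const missingInt = OptInt(low(int))

proc toOptInt(x: int): OptInt = OptInt(x)

proc isMissing(x: OptInt): bool = int(x) == low(int)

proc getOr(x: OptInt, default: int): int =
  # Return the wrapped value, or `default` when the value is missing.
  if x.isMissing: default else: int(x)
```

The sentinel version stores plain integers with no extra memory per value, but it can no longer represent a genuine value of low(int), which is the trade-off of this design.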

@oskca
Contributor Author

oskca commented Jun 3, 2018

It would be good to have OptInt. When do you plan to do it?
If possible, I would like to help a little 😄

@moigagoo

moigagoo commented Sep 14, 2018

> In general, you can always go for Option[T] if need be.

How would I do that with DateCol? I have a data set with columns of type DateCol but a lot of the values are missing. When I run my app, stdout gets flooded with warnings, which slows down the entire process to the point of unusability.

How do I tell NimData that it's ok to have empty values in this column?

@bluenote10
Owner

Optional value handling isn't implemented yet. For now, if you want to use the schema parser macro for the other fields, you would have to keep this field as a string and, as a second step, run your own string-to-Option[Time] parser on it.

It's probably also not difficult to add it to the schema parser macro. I have created a small draft in the branch feature/opt_date_parsing. I currently don't have the time to make it work properly (how do you use bindSym on Option[Time]?), but if you want to look into it, this should get you started.
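A minimal sketch of the "second step" parser described above, assuming the date column was kept as a string. `parseOptDate` and the default format `yyyy-MM-dd` are assumptions made for illustration. Only the standard `options`, `strutils` and `times` modules are used, and no NimData API is shown.

```nim
import options, strutils, times

proc parseOptDate(s: string, fmt = "yyyy-MM-dd"): Option[Time] =
  ## Hypothetical helper: returns none(Time) for empty or unparsable input
  ## instead of emitting a warning or raising.
  let trimmed = s.strip()
  if trimmed.len == 0:
    return none(Time)
  try:
    result = some(parseTime(trimmed, fmt, utc()))
  except ValueError:
    # The parse errors raised by the times module derive from ValueError.
    result = none(Time)

# Example: map a raw string column to optional timestamps.
let rawDates = @["2018-06-03", "", "not a date"]
var parsed: seq[Option[Time]] = @[]
for raw in rawDates:
  parsed.add parseOptDate(raw)
```

Applied to a string field after the schema parser has run, a helper like this turns empty cells into `none(Time)` without writing anything to stdout.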

@skanskan

skanskan commented Dec 6, 2018

NaN and missing values are different things. The former is an invalid numeric result (for example, infinity minus infinity); the latter is simply a value you don't know.
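One way to keep the two cases apart, sketched with the standard `math` and `options` modules: wrapping the float in an Option lets NaN keep its meaning as an invalid result, while `none(float)` marks a value that was never observed. The `describe` proc is a hypothetical helper for illustration only.

```nim
import math, options

# An invalid numeric result: a value exists, but it is not a number.
let invalid = some(NaN)
# A value that was never observed.
let unknown = none(float)

proc describe(x: Option[float]): string =
  # Check for "missing" first, then classify the float that is present.
  if x.isNone: "missing"
  elif classify(x.get) == fcNan: "NaN"
  else: "value"
```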
