Describe function of float data type

Hi, my “duration” column has a float data type.

When I tried to describe it, this appeared:

count    0.0
mean     NaN
std      NaN
min      NaN
25%      NaN
50%      NaN
75%      NaN
max      NaN
Name: duration, dtype: float64

What is the reason for this?
I searched forums all over the web but couldn’t find anything.

I’ve edited your post for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).

I’ve run into this issue before and I’m not 100% sure why this happens.
My best guess is that when Pandas reads in the data, it guesses which type the data is in each column. This means it can guess wrong and try to coerce data into a type that it’s not going to work with.
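For what it's worth, a count of 0.0 from describe() means the column has no non-missing values at all, which is why every other statistic comes out as NaN. Here is a minimal sketch that reproduces that kind of output and counts the missing entries. The DataFrame below is a made-up stand-in, not your data, so treat it only as an illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: a float column holding only missing values.
df = pd.DataFrame({"duration": [np.nan, np.nan, np.nan]})

# describe() skips missing values, so nothing is left to summarise.
summary = df["duration"].describe()
print(summary)

# Compare the number of missing entries with the number of rows.
n_missing = int(df["duration"].isna().sum())
n_rows = len(df)
print(f"{n_missing} of {n_rows} values are missing")
```

If the number of missing values equals the number of rows, the problem is in how the column was loaded or created rather than in describe() itself.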

My solution has been to create a dictionary that maps column names to the types I want them to be, something like this:

import pandas as pd

# Map each column name to the dtype pandas should use when reading the file.
data_types = {
    'row_id': 'int32',
    'timestamp': 'int64',
    'user_id': 'int64',
    'content_id': 'int16',
    'content_type_id': 'int8',
    'task_container_id': 'int16',
    'user_answer': 'int8',
    'answered_correctly': 'int8',
    'prior_question_elapsed_time': 'float32',
    'prior_question_had_explanation': 'boolean',
}
df = pd.read_csv('../input/riiid-test-answer-prediction/train.csv', dtype=data_types)
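To try the same idea without the competition file, here is a small sketch that reads an in-memory CSV. The column names and values are invented for illustration, and io.StringIO just stands in for a file path:

```python
import io

import pandas as pd

# Hypothetical two-column CSV; the second row has an empty duration.
csv_text = "user_id,duration\n1,12.5\n2,\n3,7.0\n"

data_types = {"user_id": "int64", "duration": "float32"}
small_df = pd.read_csv(io.StringIO(csv_text), dtype=data_types)

# Confirm pandas used the requested types instead of guessing.
print(small_df.dtypes)

# The empty cell becomes NaN and is left out of the count.
duration_summary = small_df["duration"].describe()
print(duration_summary)
```

Checking dtypes right after loading is a quick way to see whether each column came in the way you expected.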

So if your duration is a timestamp, you may want to look at this Stack Overflow post: python - datetime dtypes in pandas read_csv - Stack Overflow
You could also read it in as a ‘str’ type.
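Reading the column as text first lets you see exactly which entries refuse to convert to numbers. A rough sketch, again with an invented CSV (the values below are hypothetical), using pd.to_numeric with errors="coerce":

```python
import io

import pandas as pd

# Hypothetical CSV where some duration entries are not plain numbers.
csv_text = "duration\n12.5\n3 min\n7.0\nunknown\n"

# Keep the column as text so nothing is converted or dropped on load.
raw = pd.read_csv(io.StringIO(csv_text), dtype={"duration": "str"})

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
converted = pd.to_numeric(raw["duration"], errors="coerce")

# Entries that had text in the file but failed to convert.
failed = converted.isna() & raw["duration"].notna()
bad_values = raw.loc[failed, "duration"].tolist()
print(bad_values)
```

If most or all of the column ends up in bad_values, that would explain a describe() result with a count of 0 after a conversion to float.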
Hope this helps.

Thank you for the tip. My duration is not a datetime, but the solution is helpful!

Thanks for helping to edit, I will note this!
