Categories
Python Answers

How to use a list of values to select rows from a Python Pandas dataframe?

To use a list of values to select rows from a Python Pandas dataframe, we call the isin method.

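For instance, we write

import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})
r = df[df['A'].isin([3, 6])]

to create the df data frame and select the rows whose value in column 'A' is 3 or 6. isin returns a boolean series that is True where the value appears in the list, and we use that series to filter df.

We can also get the rows whose value in column 'A' is neither 3 nor 6 by negating the mask with ~: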

df[~df['A'].isin([3, 6])]
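
As a minimal sketch of how the mask behaves, the example below reuses the same data and stores the mask in a variable. The names wanted, mask, selected and excluded are only illustrative and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4], 'B': [1, 2, 3, 5]})

wanted = [3, 6]  # illustrative list of values to match

# Boolean mask: True where the value in 'A' is one of the wanted values
mask = df['A'].isin(wanted)

selected = df[mask]   # rows where A is 3 or 6
excluded = df[~mask]  # rows where A is neither 3 nor 6

print(selected)
print(excluded)
```

The rows keep their original order and index labels, so the order of the values in the list does not affect the order of the result.
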

How to drop rows of a Python Pandas DataFrame whose value in a certain column is NaN?

To drop rows of a Python Pandas DataFrame whose value in a certain column is NaN, we call the notna method.

For instance, we write

df = df[df['EPS'].notna()]

to keep only the rows where the 'EPS' column isn’t NaN, which drops the rows where it is NaN. notna returns a boolean series that we use to filter df.
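
The sketch below shows the filter on a small made-up data frame (the ticker values and EPS numbers are invented for illustration). It also shows dropna with the subset argument, which is another way to get the same result.

```python
import numpy as np
import pandas as pd

# Made-up sample data: the second row has a missing EPS value
df = pd.DataFrame({
    'ticker': ['AAA', 'BBB', 'CCC'],
    'EPS': [1.2, np.nan, 0.7],
})

# Keep rows whose EPS is present, using a boolean mask
kept_mask = df[df['EPS'].notna()]

# Same result with dropna, looking for NaN only in the EPS column
kept_dropna = df.dropna(subset=['EPS'])

print(kept_mask)
```
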

How to add a new column to an existing Python Pandas DataFrame?

To add a new column to an existing Python Pandas DataFrame, we can use the assign method.

For instance, we write

df1 = df1.assign(e=pd.Series(np.random.randn(sLength)).values)

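to call assign on the df1 data frame, which returns a new data frame with the extra column e. This assumes pandas is imported as pd and NumPy as np.

The values for e come from pd.Series(np.random.randn(sLength)).values, so sLength must equal the number of rows in df1.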
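
Here is a self-contained sketch of the same idea. The sample data and the names s_length, rng and df2 are assumptions made for illustration, and a seeded NumPy generator stands in for np.random.randn so the run is repeatable.

```python
import numpy as np
import pandas as pd

# Made-up sample data
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# The new column needs one value per existing row
s_length = len(df1)

rng = np.random.default_rng(0)  # seeded generator for repeatable values

# assign returns a new data frame and leaves df1 unchanged
df2 = df1.assign(e=rng.standard_normal(s_length))

print(df2)
```

Because assign does not modify df1 in place, we assign the result back to df1 when we want to keep the new column under the same name.
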

How to change column type in Python Pandas?

To change column type in Python Pandas, we can use the to_numeric function.

For instance, we write

s = pd.Series(["8", 6, "7.5", 3, "0.9"])
pd.to_numeric(s)

to call to_numeric on the s series, which returns a new series with the string entries parsed as numbers. Since some of the values have a fractional part, the result has the float64 dtype.
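
The sketch below checks the resulting dtype. It also shows the errors argument of to_numeric, which with the value "coerce" turns entries that can’t be parsed into NaN instead of raising an error. The messy series is made up for illustration.

```python
import pandas as pd

s = pd.Series(["8", 6, "7.5", 3, "0.9"])

# Strings are parsed and the whole series becomes float64
converted = pd.to_numeric(s)
print(converted.dtype)

# Made-up series with an entry that is not a number
messy = pd.Series(["8", "n/a", "7.5"])

# errors="coerce" replaces unparseable entries with NaN
coerced = pd.to_numeric(messy, errors="coerce")
print(coerced)
```
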

How to create a Python Pandas DataFrame by appending one row at a time?

To create a Python Pandas DataFrame by appending one row at a time, we use a for loop and assign each row with loc.

For instance, we write

import pandas as pd
from numpy.random import randint

df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])
for i in range(5):
    df.loc[i] = ['name' + str(i)] + list(randint(10, size=2))

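to create an empty df data frame with the columns lib, qty1 and qty2.

Then we add each row by assigning a list to df.loc[i] with

df.loc[i] = ['name' + str(i)] + list(randint(10, size=2))

where 'name' + str(i) is the value for lib and list(randint(10, size=2)) supplies two random integers from 0 to 9 for qty1 and qty2. Since the label i isn’t in the index yet, loc adds a new row.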
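
As an alternative sketch, and not the loc approach shown above, we can collect the rows in a plain Python list and build the data frame once at the end. This avoids growing the data frame on every iteration. The names rows and rng are illustrative, and a seeded NumPy generator replaces randint so the run is repeatable.

```python
import pandas as pd
from numpy.random import default_rng

rng = default_rng(0)  # seeded generator for repeatable values

rows = []
for i in range(5):
    # Two random integers from 0 to 9, as in the loc example
    qty1, qty2 = rng.integers(10, size=2)
    rows.append({'lib': 'name' + str(i), 'qty1': int(qty1), 'qty2': int(qty2)})

# Build the data frame in one step from the collected rows
df = pd.DataFrame(rows, columns=['lib', 'qty1', 'qty2'])

print(df)
```
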