Categories
Python Answers

How to use Python Pandas to merge multiple dataframes?

To use Python Pandas to merge multiple dataframes, we can call functools.reduce with pd.merge to merge the data frames one pair at a time.

For instance, we write

import pandas as pd
from functools import reduce

df1 = pd.read_table('file1.csv', sep=',')
df2 = pd.read_table('file2.csv', sep=',')
df3 = pd.read_table('file3.csv', sep=',')

# Collect the frames in a list so that reduce can fold over them.
data_frames = [df1, df2, df3]

df_merged = reduce(lambda left, right: pd.merge(left, right, on=['DATE'],
                                                how='outer'), data_frames)

to create 3 data frames from read_table.

And then we call reduce with a lambda that calls pd.merge on each pair of data frames, left and right, to join them on the DATE column values.

And we set how to 'outer' to do an outer join, which keeps the DATE values that appear in only some of the data frames.
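To make the outer join easier to see without CSV files, here is a minimal sketch that uses small in-memory data frames. The column names a, b and c and the dates are made up for illustration.

```python
from functools import reduce

import pandas as pd

# Hypothetical frames standing in for the three CSV files.
df1 = pd.DataFrame({"DATE": ["2021-01-01", "2021-01-02"], "a": [1, 2]})
df2 = pd.DataFrame({"DATE": ["2021-01-02", "2021-01-03"], "b": [3, 4]})
df3 = pd.DataFrame({"DATE": ["2021-01-01", "2021-01-03"], "c": [5, 6]})

data_frames = [df1, df2, df3]

# Merge the frames pairwise on DATE, keeping every date seen in any frame.
df_merged = reduce(
    lambda left, right: pd.merge(left, right, on="DATE", how="outer"),
    data_frames,
)

print(df_merged)
```

Each date is missing from exactly one of the frames, so the merged result has one row per date and one NaN in each row.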

How to create an empty Python Pandas DataFrame, then fill it?

To create an empty Python Pandas DataFrame and then fill it, we can collect the rows in a list first and then create the data frame from that list in one step.

For instance, we write

import pandas as pd

data = []
# some_function_that_yields_data is a placeholder for our own data source.
for a, b, c in some_function_that_yields_data():
    data.append([a, b, c])

df = pd.DataFrame(data, columns=['A', 'B', 'C'])

to put all the data in the data list.

Then we use the DataFrame class to create the df data frame from the data and set the columns argument to a list of columns.
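As a runnable sketch of the same pattern, we can replace the placeholder with a small generator. The generate_rows function below is hypothetical and only exists to produce some rows.

```python
import pandas as pd


def generate_rows():
    # Hypothetical data source that yields one (a, b, c) tuple per row.
    for i in range(3):
        yield i, i * 2, i ** 2


# Collect all rows first, then build the data frame once at the end.
rows = [[a, b, c] for a, b, c in generate_rows()]
df = pd.DataFrame(rows, columns=["A", "B", "C"])

print(df)
```

Building the list first and creating the data frame once is usually cheaper than growing a data frame row by row, since each row-wise addition copies the existing data.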

How to append multiple Python Pandas data frames at once?

To append multiple Python Pandas data frames at once, we can use the pd.concat function. The DataFrame.append method was deprecated in pandas 1.4 and removed in pandas 2.0.

For instance, we write

import numpy as np
import pandas as pd

dates = np.asarray(pd.date_range('1/1/2000', periods=8))
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df3 = df1.copy()
df = pd.concat([df1, df2, df3])

to create the data frame df1.

And then we make 2 copies of it, df2 and df3, with copy.

Finally, we call pd.concat with a list with df1, df2 and df3 in it to stack them into one data frame.
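In pandas 2.0 and later, DataFrame.append is no longer available, so data frames are stacked with pd.concat instead. Here is a minimal sketch with small made-up frames that also shows the ignore_index option.

```python
import pandas as pd

# Hypothetical frames with a single column each.
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4]})
df3 = pd.DataFrame({"A": [5, 6]})

# Stack the frames and keep each frame's original index labels.
stacked = pd.concat([df1, df2, df3])

# Stack the frames and renumber the rows from 0.
renumbered = pd.concat([df1, df2, df3], ignore_index=True)

print(stacked)
print(renumbered)
```

Without ignore_index, the index labels of the inputs are kept, so labels repeat when the inputs share the same index.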

How to drop consecutive duplicates with Python Pandas?

To drop consecutive duplicates with Python Pandas, we can use shift.

For instance, we write

a.loc[a.shift(-1) != a]

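to check whether each value differs from the next one with a.shift(-1) != a, since shift(-1) moves every value up by one position.

And then we put a.shift(-1) != a in loc to return the values without the consecutive duplicates, which keeps the last value of each run. This works when a is a Series.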
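To see which values are kept, here is a small sketch that assumes a is a Series with made-up values. It also shows the variant with shift() and no argument, which compares each value with the previous one and keeps the first value of each run instead.

```python
import pandas as pd

# Hypothetical Series with runs of repeated values.
a = pd.Series([1, 1, 2, 2, 3, 1, 1])

# Compare with the next value: keeps the last element of each run.
keep_last = a.loc[a.shift(-1) != a]

# Compare with the previous value: keeps the first element of each run.
keep_first = a.loc[a.shift() != a]

print(keep_last.index.tolist())   # [1, 3, 4, 6]
print(keep_first.index.tolist())  # [0, 2, 4, 5]
```

Both variants return the values 1, 2, 3, 1 here. They differ only in which index labels are kept.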

How to convert number strings with commas in Python Pandas DataFrame to float?

To convert number strings with commas in Python Pandas DataFrame to float, we can use the astype method.

For instance, we write

df['colname'] = df['colname'].str.replace(',', '').astype(float)

to remove the commas from the strings in the colname column of the data frame df with str.replace.

And then we call astype with float to convert the resulting number strings to floats.
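Here is a minimal runnable sketch of the conversion with made-up values. Passing regex=False is an optional addition that makes str.replace treat the comma as a literal string.

```python
import pandas as pd

# Hypothetical column of number strings with thousands separators.
df = pd.DataFrame({"colname": ["1,200", "3,456.78", "42"]})

# Strip the commas, then convert the cleaned strings to floats.
df["colname"] = df["colname"].str.replace(",", "", regex=False).astype(float)

print(df["colname"].tolist())  # [1200.0, 3456.78, 42.0]
```

If the data comes from a CSV file, the thousands argument of pd.read_csv can do the same conversion while reading.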