Categories
Python Answers

How to add Python Pandas data to an existing csv file?

Sometimes, we want to add Python Pandas data to an existing csv file.

In this article, we’ll look at how to add Python Pandas data to an existing csv file.

How to add Python Pandas data to an existing csv file?

To add Python Pandas data to an existing csv file, we can call to_csv with the mode argument set to 'a' so the rows are appended instead of overwriting the file.

For instance, we write

df.to_csv('my_csv.csv', mode='a', header=False)

to call to_csv with the CSV file path.

The mode is set to 'a' for append.

And header is set to False so the header row won’t be written again.
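
If the file might not exist yet, the header still needs to be written once. The following is a minimal sketch of one way to handle that, not the only approach. The append_to_csv helper, the temporary file path, and the sample columns a and b are assumptions made for illustration, and index=False is added so the index is not written as an extra column.

```python
import os
import tempfile

import pandas as pd


def append_to_csv(df, path):
    """Append df to path, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode='a', header=write_header, index=False)


# Use a temporary directory so the sketch does not touch real files
csv_path = os.path.join(tempfile.mkdtemp(), 'my_csv.csv')

# First call creates the file with a header; second call only appends rows
append_to_csv(pd.DataFrame({'a': [1, 2], 'b': [3, 4]}), csv_path)
append_to_csv(pd.DataFrame({'a': [5], 'b': [6]}), csv_path)

combined = pd.read_csv(csv_path)
print(combined)
```

Reading the file back with read_csv is a quick way to check that the header appears only once.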

Conclusion

To add Python Pandas data to an existing csv file, we can write the data to a csv with to_csv.


How to select with complex criteria from a Python Pandas DataFrame?

Sometimes, we want to select with complex criteria from a Python Pandas DataFrame.

In this article, we’ll look at how to select with complex criteria from a Python Pandas DataFrame.

How to select with complex criteria from a Python Pandas DataFrame?

To select with complex criteria from a Python Pandas DataFrame, we can call the query method.

For instance, we write

import pandas as pd

from random import randint
# range replaces xrange, which only exists in Python 2
df = pd.DataFrame({'A': [randint(1, 9) for _ in range(10)],
                   'B': [randint(1, 9) * 10 for _ in range(10)],
                   'C': [randint(1, 9) * 100 for _ in range(10)]})
df.query('B > 50 and C != 900')

to create a data frame df with pd.DataFrame.

Then we call df.query with a string that contains the conditions we want to match. It returns the matching rows as a new data frame.
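
To make the result easier to check, here is a small sketch with fixed values instead of random ones. It also shows the equivalent boolean mask selection. The sample values and the names by_query and by_mask are assumptions made for illustration.

```python
import pandas as pd

# Fixed sample data so the selected rows are predictable
df = pd.DataFrame({'A': [1, 2, 3, 4],
                   'B': [40, 60, 70, 90],
                   'C': [100, 900, 300, 900]})

# Select rows with a query string
by_query = df.query('B > 50 and C != 900')

# The same selection written as a boolean mask
by_mask = df[(df['B'] > 50) & (df['C'] != 900)]

print(by_query)
```

Only the row with B equal to 70 and C equal to 300 satisfies both conditions, and both forms return it.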

Conclusion

To select with complex criteria from a Python Pandas DataFrame, we can call the query method.


How to convert an XML file to a Python Pandas dataframe?

To convert an XML file to a Python Pandas dataframe, we parse the XML into an object and then we create a dataframe from it.

For instance, we write

import pandas as pd
import xml.etree.ElementTree as ET

xml_str = '<?xml version="1.0" encoding="utf-8"?>\n<response>\n <head>\n  <code>\n   200\n  </code>\n </head>\n <body>\n  <data id="0" name="All Categories" t="2018052600" tg="1" type="category"/>\n  <data id="13" name="RealEstate.com.au [H]" t="2018052600" tg="1" type="publication"/>\n </body>\n</response>'

etree = ET.fromstring(xml_str)
dfcols = ['id', 'name']

# Collect one row per data element, then build the data frame in one step
rows = [[node.get('id'), node.get('name')] for node in etree.iter(tag='data')]
df = pd.DataFrame(rows, columns=dfcols)

df.head()

to call ET.fromstring with xml_str to create an XML tree object.

Next, we loop through the data elements with etree.iter and collect the id and name attribute values of each node into a list of rows.

And then we create the data frame from the rows with DataFrame.

We build the rows first instead of calling df.append in the loop because DataFrame.append was removed in pandas 2.0.
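
As a related sketch, every attribute of each data element can become a column by passing the elements' attrib dictionaries to DataFrame. The shortened sample_xml string and the names root and all_attrs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Shortened sample document with the same shape as the one above
sample_xml = (
    '<response><body>'
    '<data id="0" name="All Categories" type="category"/>'
    '<data id="13" name="RealEstate.com.au [H]" type="publication"/>'
    '</body></response>'
)

root = ET.fromstring(sample_xml)

# Each element's attrib is a dict, so a list of them maps directly to rows
all_attrs = pd.DataFrame([node.attrib for node in root.iter('data')])

# Attribute values are parsed as strings, so convert numeric columns explicitly
all_attrs['id'] = all_attrs['id'].astype(int)

print(all_attrs)
```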


How to create scatter by category plots in Python Pandas and Pyplot?

To create scatter by category plots in Python Pandas and Pyplot, we can group the data by category and plot each group on the axes returned by the subplots method.

For instance, we write

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
np.random.seed(1974)

num = 20
x, y = np.random.random((2, num))
labels = np.random.choice(['a', 'b', 'c'], num)
df = pd.DataFrame(dict(x=x, y=y, label=labels))

groups = df.groupby('label')

# Plot
fig, ax = plt.subplots()
ax.margins(0.05)
for name, group in groups:
    ax.plot(group.x, group.y, marker='o', linestyle='', ms=12, label=name)
ax.legend()

plt.show()

to call np.random.random to create some random data.

And then we convert it to a data frame with DataFrame.

Next, we call plt.subplots to create the figure and the axes.

Then we loop through the groups we got from groupby and call ax.plot to plot the values.

Next, we call ax.legend to add a legend.

And then we call plt.show to show the plot.
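
The same per-group loop also works with ax.scatter in place of ax.plot. The sketch below is one possible variant, assuming a small fixed data set. The sample values, the marker size, and the legend_labels name are illustrative, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Small fixed data set with three categories
df = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6],
                   'y': [2, 1, 4, 3, 6, 5],
                   'label': ['a', 'b', 'a', 'c', 'b', 'c']})

fig, ax = plt.subplots()

# One scatter call per category, so each gets its own color and legend entry
for name, group in df.groupby('label'):
    ax.scatter(group['x'], group['y'], s=80, label=name)

ax.legend(title='label')

# Read the legend entries back to confirm one entry per category
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```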


How to use GroupBy with a Python Pandas DataFrame and select most common value?

To use GroupBy with a Python Pandas DataFrame and select most common value, we can use the pd.Series.mode aggregation.

For instance, we write

source.groupby(['Country','City'])['Short name'].agg(pd.Series.mode)

to call groupby on the source data frame.

And then we get the mode of the 'Short name' column values by calling agg with pd.Series.mode.

We can convert the returned result into a dataframe with the to_frame method.

For instance, we can write

source.groupby(['Country','City'])['Short name'].agg(pd.Series.mode).to_frame()

to call to_frame on the result to convert it to a data frame.
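
When a group has a tie, pd.Series.mode returns every tied value for that group. If exactly one value per group is needed, one option is to take the first entry of the sorted mode result. The sketch below assumes a small made-up source data frame, and the name most_common is illustrative.

```python
import pandas as pd

# Made-up sample data; the Moscow group has a tie between 'MSK' and 'Mos'
source = pd.DataFrame({
    'Country': ['USA', 'USA', 'USA', 'Russia', 'Russia'],
    'City': ['New York', 'New York', 'New York', 'Moscow', 'Moscow'],
    'Short name': ['NY', 'New', 'NY', 'MSK', 'Mos'],
})

# mode() returns the tied values in sorted order, so iat[0] picks the first one
most_common = (
    source.groupby(['Country', 'City'])['Short name']
    .agg(lambda s: s.mode().iat[0])
)

print(most_common.to_frame())
```

Taking the first entry is an arbitrary tie-breaking rule, so a different rule may fit better depending on the data.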