Pandas: Powerful Data Analysis in Python
Pandas is a Python library designed for working with structured data: data organized in tables, like spreadsheets or SQL databases. It gives you labeled rows and columns plus built-in operations for cleaning, reshaping, and summarizing data, so common analysis tasks take only a few lines of code.
Key Features & Why it's Useful:
- DataFrame: This is the core data structure in Pandas. A DataFrame is like a table with rows and columns, where each column can hold different types of data (numbers, strings, dates, etc.).
- Series: A Series is a one-dimensional labeled array, essentially a single column of a DataFrame.
- Data Cleaning & Preparation: Pandas provides tools to handle missing data, filter data, transform data types, and generally clean up messy datasets.
- Data Manipulation: You can easily select, filter, group, merge, and reshape data in DataFrames.
- Data Analysis: Pandas offers functions for calculating statistics (mean, median, standard deviation, etc.), finding correlations, and performing other data analysis tasks.
- Input/Output: Pandas can read data from and write data to various file formats (CSV, Excel, SQL databases, JSON, etc.).
- Integration with other libraries: Pandas is built on top of NumPy and works with Matplotlib and other data science libraries. For example, DataFrame.plot() draws charts with Matplotlib by default.
Simple Example:
import pandas as pd

# Create a DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 28],
        'City': ['New York', 'London', 'Paris']}
df = pd.DataFrame(data)
print(df)
# Output:
#       Name  Age      City
# 0    Alice   25  New York
# 1      Bob   30    London
# 2  Charlie   28     Paris

# Select a column (the result is a Series)
print(df['Name'])
# Output:
# 0      Alice
# 1        Bob
# 2    Charlie
# Name: Name, dtype: object

# Calculate the average age
average_age = df['Age'].mean()
print(average_age)  # Output: 27.666666666666668
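The data manipulation features listed earlier (filtering, grouping, merging, reshaping) can be sketched in a few lines. This is only an illustration: the people and countries tables, the extra "Dana" row, and the Country column are made-up sample data, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 28, 35],
    "City": ["New York", "London", "Paris", "London"],
})
countries = pd.DataFrame({
    "City": ["New York", "London", "Paris"],
    "Country": ["USA", "UK", "France"],
})

# Filter: keep rows where Age is above 27 (boolean indexing)
over_27 = people[people["Age"] > 27]

# Group: average age per city
mean_age_by_city = people.groupby("City")["Age"].mean()

# Merge: attach a Country column by matching rows on City
merged = people.merge(countries, on="City", how="left")

# Reshape: one row per country with the number of people in it
people_per_country = merged.pivot_table(
    index="Country", values="Name", aggfunc="count"
)

print(over_27)
print(mean_age_by_city)
print(merged)
print(people_per_country)
```

A left merge is used here so that every person is kept even if their city were missing from the lookup table; an inner merge would silently drop such rows.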
Data Loading, Cleaning, and Basic Analysis with Pandas
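As a rough sketch of that load, clean, analyze workflow, the snippet below reads a small CSV, handles duplicates and missing values, and computes a few statistics. The CSV text is hypothetical sample data, and it is read from an in-memory string so the example runs on its own; with a real file you would normally pass its path to pd.read_csv. Filling missing ages with the median and missing cities with "Unknown" are assumptions chosen for illustration, not the only reasonable choices.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk
csv_text = """Name,Age,City
Alice,25,New York
Bob,,London
Charlie,28,
Dana,31,London
Alice,25,New York
"""

# Load: parse the CSV text into a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# Clean: drop the repeated Alice row, then fill the gaps
df = df.drop_duplicates()
df["Age"] = df["Age"].fillna(df["Age"].median())  # median of 25, 28, 31 is 28
df["City"] = df["City"].fillna("Unknown")
df["Age"] = df["Age"].astype(int)  # missing values forced the column to float

# Analyze: basic statistics and counts
print(df["Age"].mean())           # 28.0
print(df["Age"].describe())
print(df["City"].value_counts())

# Write the cleaned data back out as CSV text (no path given, so a string is returned)
csv_out = df.to_csv(index=False)
print(csv_out)
```

The same pattern applies to the other formats Pandas supports, such as pd.read_excel and pd.read_json on the input side and DataFrame.to_excel and DataFrame.to_json on the output side.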