
Pandas – The Python Library for Data Analysis & Manipulation

Created by Adugna Asrat in Quick Notes 2 Apr 2025

💡 What Is Pandas?

Pandas is a Python library used to work with structured data like tables (rows and columns), spreadsheets, and CSV files.

✅ Built for fast, flexible, and powerful data analysis
✅ Can handle missing data, filtering, sorting, and transformations
✅ Works seamlessly with NumPy, Matplotlib, Seaborn, and Scikit-learn


🧱 1. Key Data Structures in Pandas

Structure | Description
--- | ---
Series | A single column of data (like a list)
DataFrame | A table with rows and columns (like Excel)

Example: A DataFrame can represent a table of students:

Name | Age | Score
--- | --- | ---
Maya | 21 | 88
Christian | 22 | 91
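As a minimal sketch, the student table above can be built directly in code. The variable names (`scores`, `students`, `age_column`) are chosen here for illustration only.

```python
import pandas as pd

# A Series: one column of values, each with a label (here, the student name).
scores = pd.Series([88, 91], index=["Maya", "Christian"], name="Score")

# A DataFrame: a table built from a dict of column name -> list of values.
students = pd.DataFrame(
    {
        "Name": ["Maya", "Christian"],
        "Age": [21, 22],
        "Score": [88, 91],
    }
)

# Selecting a single column of a DataFrame gives back a Series.
age_column = students["Age"]

print(students)
print(scores["Maya"])
```

A DataFrame can be thought of as several Series that share the same row labels.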


📥 2. Loading Data into Pandas

Pandas allows you to read many data formats:

✅ CSV files
✅ Excel files
✅ JSON
✅ SQL
✅ Online datasets (URLs)

Typical use:

import pandas as pd

df = pd.read_csv("students.csv")

Now df is a DataFrame holding the entire dataset.
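To try this without a file on disk, the sketch below reads CSV and JSON text from memory. The sample text stands in for a file such as students.csv and is made up for illustration; `csv_text`, `json_text`, and `df_json` are names assumed here.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (illustrative sample data).
csv_text = """Name,Age,Score
Maya,21,88
Christian,22,91
"""

# read_csv accepts a file path, a URL, or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# The other readers follow the same pattern, for example JSON records:
json_text = '[{"Name": "Maya", "Score": 88}, {"Name": "Christian", "Score": 91}]'
df_json = pd.read_json(io.StringIO(json_text))

print(df)
print(df_json)
```

With a real file, the `io.StringIO(...)` argument is simply replaced by the path or URL.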


🔍 3. Exploring the Dataset

Once the data is loaded, Pandas provides tools to explore it:

✅ df.head() – View the first 5 rows
✅ df.info() – See structure and data types
✅ df.describe() – Summary statistics
✅ df.columns – List column names
✅ df.shape – Show number of rows and columns
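The exploration tools listed above can be tried on the small student table from earlier. This is only a sketch: the DataFrame is built inline so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Maya", "Christian"],
        "Age": [21, 22],
        "Score": [88, 91],
    }
)

print(df.head())         # first rows (at most 5)
df.info()                # prints column names, non-null counts, and data types
print(df.describe())     # count, mean, std, min, quartiles, max of numeric columns
print(list(df.columns))  # column names as a plain list
print(df.shape)          # (number of rows, number of columns)
```

Note that df.head(), df.info(), and df.describe() are methods (called with parentheses), while df.columns and df.shape are attributes.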


🧹 4. Data Cleaning with Pandas

Real-world data is often messy. Pandas helps clean it:

Task | Method
--- | ---
Remove missing values | df.dropna()
Fill missing values | df.fillna(value)
Rename columns | df.rename()
Change data types | df.astype()
Filter rows by condition | df[df['Score'] > 85]

✅ Data cleaning is essential before analysis or modeling.
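The cleaning methods in the table can be chained step by step, as in the sketch below. The messy sample data (including the extra students Sara and Dawit) is invented for illustration, and filling with the column mean is just one possible choice.

```python
import numpy as np
import pandas as pd

# Illustrative messy data: a lowercase column name, ages stored as text,
# and one missing score.
raw = pd.DataFrame(
    {
        "name": ["Maya", "Christian", "Sara", "Dawit"],
        "Age": ["21", "22", "20", "23"],
        "Score": [88.0, 91.0, np.nan, 70.0],
    }
)

# Each method returns a new DataFrame; `raw` itself is left unchanged.
dropped = raw.dropna()                               # drop rows with a missing value
filled = raw.fillna({"Score": raw["Score"].mean()})  # or fill the gap with the mean
renamed = filled.rename(columns={"name": "Name"})    # fix the column name
typed = renamed.astype({"Age": "int64"})             # convert text ages to integers
high_scorers = typed[typed["Score"] > 85]            # keep rows matching a condition

print(high_scorers)
```

Whether to drop or fill missing values depends on the dataset; dropping loses rows, while filling introduces estimated values.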


🧮 5. Data Analysis Tasks

With Pandas, you can:

✅ Group and summarize:

df.groupby('Gender')['Score'].mean()

✅ Sort data:

df.sort_values('Score', ascending=False)

✅ Create new columns:

df['Grade'] = df['Score'] >= 90

✅ Merge datasets:

pd.merge(df1, df2, on='StudentID')

✅ Handle time-series:

df['Date'] = pd.to_datetime(df['Date'])
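The one-liners above can be run together on a small table. The sketch below uses invented sample data; the `clubs` table and its `Club` column are hypothetical and exist only to demonstrate the merge.

```python
import pandas as pd

# Illustrative sample data.
df = pd.DataFrame(
    {
        "StudentID": [1, 2, 3, 4],
        "Gender": ["F", "M", "F", "M"],
        "Score": [88, 91, 95, 70],
        "Date": ["2025-01-10", "2025-01-12", "2025-02-01", "2025-02-03"],
    }
)
clubs = pd.DataFrame(
    {"StudentID": [1, 2, 3], "Club": ["Chess", "Robotics", "Debate"]}
)

# Group and summarize: average score per gender.
mean_by_gender = df.groupby("Gender")["Score"].mean()

# Sort: highest score first.
ranked = df.sort_values("Score", ascending=False)

# New column: True where the score is at least 90, otherwise False.
df["Grade"] = df["Score"] >= 90

# Merge: the default is an inner join, so StudentID 4 (no club) is dropped.
merged = pd.merge(df, clubs, on="StudentID")

# Time series: convert text to datetimes, then group by month.
df["Date"] = pd.to_datetime(df["Date"])
monthly_mean = df.groupby(df["Date"].dt.month)["Score"].mean()

print(mean_by_gender)
print(merged)
print(monthly_mean)
```

Note that the comparison `df['Score'] >= 90` produces a True/False column rather than letter grades.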


📊 6. Visualization with Pandas

Pandas plotting methods are built on Matplotlib, and Seaborn accepts DataFrames directly:

✅ Line chart:

df.plot(x='Date', y='Sales')

✅ Bar chart:

df['Score'].value_counts().plot(kind='bar')

✅ Histogram:

df['Age'].plot.hist()

Visualization helps you see trends and patterns in data.
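The three chart types above can be drawn side by side, as in this sketch. It assumes Matplotlib is installed, uses invented sample data, and selects the non-interactive "Agg" backend so it also runs without a display; in a Jupyter Notebook the backend line can be omitted.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative sample data.
sales = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2025-01-01", "2025-02-01", "2025-03-01"]),
        "Sales": [120, 150, 90],
    }
)
students = pd.DataFrame({"Age": [21, 22, 21], "Score": [88, 91, 88]})

# One figure with three panels; each pandas plot is drawn on its own axis.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
sales.plot(x="Date", y="Sales", ax=axes[0], title="Line chart")
students["Score"].value_counts().plot(kind="bar", ax=axes[1], title="Bar chart")
students["Age"].plot.hist(ax=axes[2], title="Histogram")
fig.tight_layout()

# Render to an in-memory PNG; pass a file name to savefig to write to disk.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")
```

Passing `ax=` tells pandas which Matplotlib axis to draw on, which makes it easy to combine several charts in one figure.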


🧠 7. Why Pandas Matters for Ethiopians in Tech

✅ Used in data projects for finance, education, health, and NGOs
✅ Essential in roles like Data Analyst, AI/ML Engineer, and Business Intelligence Officer
✅ Helps students analyze real Ethiopian datasets (grade records, surveys, budgeting, etc.)
✅ Works well with Jupyter Notebooks for data science learning


💼 Careers That Use Pandas

✅ Data Analyst
✅ Data Scientist
✅ ML Engineer
✅ Business Intelligence Officer
✅ Statistician
✅ Finance Analyst
✅ Monitoring & Evaluation Officer
