Data analysis
Here is a clear, interview-focused list of IMPORTANT Pandas library functions & methods for Data Analysis and Data Cleaning.
These are the ones interviewers most often expect you to know, not the whole library.
You can learn + revise this in 1–2 hours.
---
🐼 PANDAS IMPORTANT FUNCTIONS & METHODS (INTERVIEW READY)
---
✅ 1. Reading & Writing Data (VERY IMPORTANT)
pd.read_csv()
pd.read_excel()
pd.read_json()
df.to_csv()
df.to_excel()
Use: Load and save datasets
Interview Tip: Almost every data project starts with read_csv().
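A minimal sketch of the load-and-save round trip. The three-column dataset is made up, and io.StringIO stands in for a real file path so the example runs without any files on disk:

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "id,name,salary\n1,Asha,50000\n2,Ben,62000\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# index=False keeps the row index out of the saved file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Reading the saved text back should give the same table.
round_trip = pd.read_csv(io.StringIO(buffer.getvalue()))
```

With a real file you would pass a path instead, for example pd.read_csv("data.csv") and df.to_csv("clean.csv", index=False); both file names here are placeholders.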
---
✅ 2. Basic Data Inspection (FIRST STEP IN ANALYSIS)
df.head()
df.tail()
df.info()
df.shape
df.columns
df.dtypes
Use:
Understand data structure
Check rows, columns, data types
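A short sketch of a first look at a dataset, assuming a small made-up DataFrame. Note that shape, columns and dtypes are attributes (no parentheses), while head(), tail() and info() are methods:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "age": [28, 35, 41],
    "salary": [50000.0, 62000.0, 58000.0],
})

n_rows, n_cols = df.shape          # attribute: (rows, columns)
column_names = list(df.columns)    # attribute: column labels
first_two = df.head(2)             # method: first 2 rows (default is 5)
last_one = df.tail(1)              # method: last row
df.info()                          # prints dtypes, non-null counts, memory usage
```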
---
✅ 3. Summary & Statistics (VERY COMMON QUESTIONS)
df.describe()
df.mean()
df.median()
df.mode()
df.min()
df.max()
df.std()
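Use:
Understand distribution
Detect anomalies
Interview Tip: In pandas 2.0 and later, df.mean(), df.median() and df.std() raise an error on a DataFrame that has text columns unless you pass numeric_only=True or select the numeric columns first.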
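A small sketch of the summary functions on a made-up salary table. It assumes pandas 2.x behaviour, where whole-frame statistics on mixed-type data need numeric_only=True:

```python
import pandas as pd

# Made-up data: one text column, one numeric column.
df = pd.DataFrame({
    "department": ["HR", "IT", "IT", "Sales"],
    "salary": [40000, 60000, 80000, 50000],
})

summary = df.describe()                  # count, mean, std, min, quartiles, max
mean_salary = df["salary"].mean()        # single-column statistic
median_salary = df["salary"].median()

# On the whole frame, skip the text column explicitly.
numeric_means = df.mean(numeric_only=True)
```

A large gap between mean and median is a quick hint that the column is skewed or contains outliers.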
---
✅ 4. Handling Missing Values (EXTREMELY IMPORTANT)
df.isnull()
df.isnull().sum()
df.notnull()
df.dropna()
df.fillna()
Examples:
df.dropna()
df['age'] = df['age'].fillna(df['age'].mean())
Interview Tip: Always explain why you chose the mean or the median. The median is usually safer when the column has outliers.
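A minimal sketch of the two common strategies, filling and dropping, on made-up data. The age of 95 is included on purpose as an outlier, which is the kind of case where the median is the safer fill value:

```python
import numpy as np
import pandas as pd

# Made-up data with one missing age and one outlier (95).
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dev"],
    "age": [25.0, np.nan, 30.0, 95.0],
})

missing_per_column = df.isnull().sum()   # count of missing values per column

# Option 1: fill with the median (30 here; the mean would be 50).
filled = df.assign(age=df["age"].fillna(df["age"].median()))

# Option 2: drop only the rows where 'age' is missing.
dropped = df.dropna(subset=["age"])
```

Both options return new DataFrames, so the original df still contains the missing value.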
---
✅ 5. Removing Duplicates
df.duplicated()
df.drop_duplicates()
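A short sketch on a made-up table with one repeated row. The subset and keep arguments are the ones interviewers tend to ask about:

```python
import pandas as pd

# Made-up data: row 2 repeats row 1.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "email": ["a@x.com", "b@x.com", "b@x.com", "c@x.com"],
})

dup_mask = df.duplicated()        # True where a row repeats an earlier row
deduped = df.drop_duplicates()    # keeps the first occurrence by default

# Compare on selected columns only, and keep the last occurrence instead.
by_email = df.drop_duplicates(subset=["email"], keep="last")
```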
---
✅ 6. Selecting Data (FILTERING)
df['column']
df[['col1', 'col2']]
df.loc[]
df.iloc[]
Examples:
df.loc[df['age'] > 25]
df.iloc[0:5]
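A sketch of the loc versus iloc difference, using made-up data with letter labels as the index so the two are easy to tell apart. loc selects by label or boolean mask, iloc by integer position:

```python
import pandas as pd

# Made-up data with a non-numeric index.
df = pd.DataFrame(
    {"age": [22, 31, 45, 28], "city": ["Pune", "Delhi", "Pune", "Goa"]},
    index=["a", "b", "c", "d"],
)

over_25 = df.loc[df["age"] > 25]   # boolean mask

# Combine conditions with & or |, each wrapped in parentheses,
# and pick columns in the same call.
pune_over_25 = df.loc[(df["age"] > 25) & (df["city"] == "Pune"), ["age"]]

by_label = df.loc["a":"c"]     # label slice: includes the end label "c"
by_position = df.iloc[0:2]     # position slice: excludes position 2
```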
---
✅ 7. Sorting Data
df.sort_values()
df.sort_index()
Example:
df.sort_values(by='salary', ascending=False)
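A small sketch of sorting by more than one column, on made-up data. Passing lists to by and ascending gives each column its own direction:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "department": ["IT", "HR", "IT", "HR"],
    "salary": [60000, 40000, 80000, 45000],
})

# Department A-Z, then highest salary first within each department.
ranked = df.sort_values(by=["department", "salary"], ascending=[True, False])

# sort_index() orders by row label, which restores the original order here.
restored = ranked.sort_index()
```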
---
✅ 8. Renaming Columns
df.rename()
df.columns = ['col1', 'col2']
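A minimal sketch of both renaming styles, with made-up column names. rename() changes only the columns you list, while assigning to df.columns replaces every name at once:

```python
import pandas as pd

# Made-up column names for illustration.
df = pd.DataFrame({"Emp Name": ["Asha"], "Emp Salary": [50000]})

# rename() takes an old-name -> new-name mapping and returns a new DataFrame.
renamed = df.rename(columns={"Emp Name": "name", "Emp Salary": "salary"})

# Assigning a list replaces all names; its length must match the column count.
relabeled = df.copy()
relabeled.columns = ["name", "salary"]
```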
---
✅ 9. Data Type Conversion
df.astype()
pd.to_datetime()
Example:
df['date'] = pd.to_datetime(df['date'])
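A short sketch of type conversion on made-up data where numbers and dates arrive as text. The errors="coerce" option is a common way to handle bad date strings, and the format string is an assumption about how the dates are written:

```python
import pandas as pd

# Made-up data: everything is stored as text, and one date is invalid.
df = pd.DataFrame({
    "price": ["10", "20", "30"],
    "date": ["2024-01-15", "2024-02-20", "not a date"],
})

df["price"] = df["price"].astype(int)   # text -> integer

# errors="coerce" turns unparseable values into NaT instead of raising.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")

# Datetime columns expose parts of the date through the .dt accessor.
first_month = df["date"].dt.month.iloc[0]
```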
---
✅ 10. Grouping & Aggregation (VERY IMPORTANT)
df.groupby()
Example:
df.groupby('department')['salary'].mean()
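A sketch that extends the example above to several statistics at once, on made-up data. It uses named aggregation, where each keyword becomes an output column; the names avg_salary, max_salary and headcount are chosen here for illustration:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "department": ["IT", "HR", "IT", "HR", "Sales"],
    "salary": [60000, 40000, 80000, 50000, 55000],
})

avg_by_dept = df.groupby("department")["salary"].mean()

# Named aggregation: output_column=(input_column, function).
stats = df.groupby("department").agg(
    avg_salary=("salary", "mean"),
    max_salary=("salary", "max"),
    headcount=("salary", "count"),
).reset_index()   # turn 'department' back into a regular column
```

This is the Pandas counterpart of SQL's GROUP BY with several aggregate functions in the SELECT list.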
---
✅ 11. Applying Functions
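df.apply()
df['column'].map()
df.map() (called df.applymap() before pandas 2.1, where that name was deprecated)
Use: apply() runs a function on each column or row; map() runs it on each element.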
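A minimal sketch of element-wise versus row-wise functions, on made-up data. It also shows the vectorised alternative, since interviewers often ask why apply() can be slow:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "name": ["asha", "ben"],
    "salary": [50000, 62000],
    "bonus": [5000, 3000],
})

# Series.map: the function is called once per element of one column.
df["name_cap"] = df["name"].map(str.capitalize)

# DataFrame.apply with axis=1: the function receives one row at a time.
df["total"] = df.apply(lambda row: row["salary"] + row["bonus"], axis=1)

# Vectorised equivalent: same result, usually much faster on large data.
df["total_vec"] = df["salary"] + df["bonus"]
```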
---
✅ 12. Merging & Joining Data
pd.merge()
df.join()
pd.concat()
Example:
pd.merge(df1, df2, on='id', how='inner')
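A short sketch comparing join types on two made-up tables that only partly overlap. The indicator=True option adds a _merge column showing where each row came from, which helps when debugging joins:

```python
import pandas as pd

# Made-up tables: ids 2 and 3 appear in both.
employees = pd.DataFrame({"id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
salaries = pd.DataFrame({"id": [2, 3, 4], "salary": [62000, 58000, 70000]})

# Inner join: only ids present in both tables.
inner = pd.merge(employees, salaries, on="id", how="inner")

# Left join: every employee, with NaN salary where there is no match.
left = pd.merge(employees, salaries, on="id", how="left", indicator=True)

# concat stacks tables on top of each other rather than matching on a key.
stacked = pd.concat([employees, employees], ignore_index=True)
```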
---
✅ 13. String Operations (VERY USEFUL)
df['name'].str.lower()
df['email'].str.contains()
df['text'].str.replace()
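A sketch of the .str accessor on made-up data that includes a missing value. The regex=False and na=False arguments are choices made for this example: the first treats the pattern as plain text, the second counts missing values as "no match":

```python
import pandas as pd

# Made-up data with stray spaces, mixed case, and a missing row.
df = pd.DataFrame({
    "name": ["  Asha ", "BEN", None],
    "email": ["asha@gmail.com", "ben@corp.io", None],
})

# .str methods can be chained; missing values stay missing.
clean_name = df["name"].str.strip().str.lower()

# na=False makes the result a clean True/False mask, safe for filtering.
is_gmail = df["email"].str.contains("@gmail.com", regex=False, na=False)
gmail_rows = df[is_gmail]

# regex=False replaces the literal text "corp.io".
domain_fixed = df["email"].str.replace("corp.io", "corp.com", regex=False)
```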
---
✅ 14. Index Operations
df.set_index()
df.reset_index()
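A small sketch of moving a column into the index and back, on made-up data. It also shows reset_index(drop=True), which is commonly used after filtering to get clean 0, 1, 2 row numbers:

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"id": [101, 102, 103], "name": ["Asha", "Ben", "Chen"]})

indexed = df.set_index("id")       # 'id' becomes the row labels
ben = indexed.loc[102, "name"]     # look up a row by its label
flat = indexed.reset_index()       # 'id' goes back to being a column

# After filtering, the old row numbers remain; drop=True discards them.
filtered = df[df["id"] > 101].reset_index(drop=True)
```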
---
✅ 15. Value Counts (VERY COMMON)
df['column'].value_counts()
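A minimal sketch of value_counts() and its two most useful options, on a made-up column with one missing value:

```python
import pandas as pd

# Made-up data with one missing entry.
city = pd.Series(["Pune", "Delhi", "Pune", None, "Pune", "Delhi"])

counts = city.value_counts()                      # most frequent first, missing excluded
shares = city.value_counts(normalize=True)        # proportions instead of counts
with_missing = city.value_counts(dropna=False)    # include missing values as a category
```

This is often the quickest way to check class balance in a target column or to spot unexpected categories.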
---
⭐ MOST IMPORTANT FOR INTERVIEWS (MUST REMEMBER)
If you remember ONLY these, it’s enough:
read_csv
head
info
describe
isnull
fillna
dropna
drop_duplicates
groupby
sort_values
merge
value_counts
---
🎯 1-Minute Interview Answer (MEMORIZE THIS)
> “In Pandas, I usually start with read_csv() to load data, use head(), info() and describe() for understanding the dataset. For cleaning, I use isnull(), fillna(), and drop_duplicates(). For analysis, I use groupby(), sorting, and aggregation functions. Finally, I export clean data using to_csv().”
This answer sounds confident and professional.
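If the interviewer asks you to show it, the workflow in that answer can be sketched end to end. The dataset below is made up, and io.StringIO stands in for real input and output files:

```python
import io
import pandas as pd

# Hypothetical raw data: one repeated row and missing salaries.
raw = io.StringIO(
    "id,department,salary\n"
    "1,IT,60000\n"
    "2,HR,\n"
    "2,HR,\n"
    "3,IT,80000\n"
    "4,HR,40000\n"
)

df = pd.read_csv(raw)                      # 1. load
df.info()                                  # 2. inspect
missing_before = df.isnull().sum()

df = df.drop_duplicates()                  # 3. clean: remove repeated rows
df["salary"] = df["salary"].fillna(df["salary"].median())   # fill gaps

report = (                                 # 4. analyse
    df.groupby("department")["salary"].mean()
      .sort_values(ascending=False)
      .reset_index()
)

out = io.StringIO()                        # 5. export
report.to_csv(out, index=False)
```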