PIQ : 1-50
🔹 Basic Level
What is Pandas in Python? How is it different from NumPy? Follow-up: When would you use Pandas over NumPy?
What are the two primary data structures in Pandas? (Expected Answer: Series and DataFrame)
How do you handle missing data in a DataFrame? (Follow-up: Explain the difference between dropna(), fillna(), and isna())
How do you select a subset of rows and columns from a DataFrame? (Explain .loc[], .iloc[], and conditional filtering)
What is the difference between apply(), map(), and applymap() in Pandas?
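A minimal sketch that touches the basic-level questions on missing data, selection, and element-wise functions. The column names and values are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "b" has a missing score.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy"], "score": [90.0, np.nan, 70.0]},
    index=["a", "b", "c"],
)

missing_mask = df["score"].isna()        # boolean Series, True where a value is missing
dropped = df.dropna(subset=["score"])    # removes rows whose score is missing
filled = df.fillna({"score": 0.0})       # replaces missing scores with 0.0

by_label = df.loc["a", "score"]          # .loc selects by index/column labels
by_position = df.iloc[0, 1]              # .iloc selects by integer position
high = df[df["score"] > 80]              # conditional (boolean mask) filtering

name_len = df["name"].map(len)           # Series.map: element-wise on one column
score_range = df[["score"]].apply(lambda col: col.max() - col.min())  # apply: per column
# applymap() works element-wise across a whole DataFrame; recent pandas
# versions deprecate it in favour of DataFrame.map().
```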
🔹 Intermediate Level
How can you merge or join two DataFrames in Pandas? (Expected concepts: merge(), join(), concat() and keys/index alignment)
What is the difference between groupby() and pivot_table()? When would you use each?
How do you reset the index of a DataFrame? What’s the use case for reset_index()?
How would you identify and remove duplicate rows in a DataFrame?
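One possible way to illustrate merging, stacking, duplicate removal, and index resetting on two small frames. The frames and the "key" column are hypothetical.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})

# merge() matches rows on key values (SQL-style join).
inner = left.merge(right, on="key", how="inner")

# concat() stacks frames along an axis without matching keys.
stacked = pd.concat([left, left], ignore_index=True)

# duplicated() flags repeats of an earlier row; drop_duplicates() removes them.
dupe_mask = stacked.duplicated()
deduped = stacked.drop_duplicates().reset_index(drop=True)  # rebuild a clean 0..n-1 index
```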
🔹 Advanced Level
Explain the use of categorical data in Pandas. Why and when should you use pd.Categorical?
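A small sketch of why a categorical dtype can help: repeated strings are stored as integer codes, and an ordered category supports comparisons. The size labels are invented for the example.

```python
import pandas as pd

# Hypothetical low-cardinality column: 4,000 values, only 3 distinct labels.
sizes = pd.Series(["small", "large", "medium", "small"] * 1000)

# An ordered categorical dtype with an explicit category order.
size_type = pd.CategoricalDtype(["small", "medium", "large"], ordered=True)
cat = sizes.astype(size_type)

plain_bytes = sizes.memory_usage(deep=True)     # one string per row
category_bytes = cat.memory_usage(deep=True)    # small integer codes plus 3 labels

# Ordered categories allow comparisons that follow the declared order.
at_least_medium = cat >= "medium"
```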
🔸 Data Manipulation & Cleaning
How do you rename columns or indexes in a Pandas DataFrame?
How do you filter rows based on multiple conditions in Pandas?
How do you replace values in a column based on a condition?
What is the difference between .copy() and assignment (=) in Pandas? Why is it important?
How do you detect and convert data types (e.g., from object to numeric)?
What is astype() used for in Pandas? Give an example.
How do you create a new column from existing columns in a DataFrame?
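The cleaning questions above could be sketched roughly as follows. The "price" and "qty" columns are hypothetical, and "n/a" stands in for any unparseable value.

```python
import pandas as pd

# Hypothetical data: prices arrive as strings, one of them unparseable.
df = pd.DataFrame({"price": ["10", "20", "n/a"], "qty": [1, 2, 3]})

alias = df            # plain assignment: a second name for the same object
snapshot = df.copy()  # .copy(): an independent DataFrame

# to_numeric with errors="coerce" turns unparseable strings into NaN.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

# astype() casts a column to an explicit dtype.
df["qty"] = df["qty"].astype("float64")

# New column derived from existing columns.
df["total"] = df["price"] * df["qty"]

# Conditional replacement through a single .loc call.
df.loc[df["qty"] > 2, "qty"] = 0
```

Because alias and df are the same object, the new total column is visible through alias but not through snapshot.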
🔸 Grouping, Aggregation & Window Functions
What is the role of agg() in a groupby operation? How is it different from apply()?
Explain rolling window operations. How would you calculate a moving average?
How do you perform cumulative operations like cumulative sum or product in Pandas?
What does transform() do in the context of groupby operations? How is it different from agg()?
What is the difference between nunique() and unique()?
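A compact sketch contrasting agg() with transform(), plus a moving average, a cumulative sum, and unique() versus nunique(). The team and points data are illustrative assumptions.

```python
import pandas as pd

# Hypothetical scores for two teams.
df = pd.DataFrame({"team": ["a", "a", "b", "b"], "pts": [1, 3, 10, 20]})

# agg() reduces each group to one row per group.
per_team = df.groupby("team")["pts"].agg(["sum", "mean"])

# transform() returns a result aligned with the original rows.
team_mean = df.groupby("team")["pts"].transform("mean")

# Moving average over a window of 2 rows; the first value has no full window.
moving_avg = df["pts"].rolling(window=2).mean()

# Cumulative sum down the column.
running_total = df["pts"].cumsum()

distinct = df["team"].unique()      # array of the distinct values
n_distinct = df["team"].nunique()   # count of distinct values
```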
🔸 Merging, Joining & Reshaping
How is concat() different from merge() in Pandas? Give use cases for each.
Explain melt() and pivot() functions. When would you use each?
What is a multi-index in Pandas? How do you create and access it?
How do you sort a DataFrame by multiple columns or index levels?
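As a rough illustration of reshaping, the sketch below melts a wide table to long form, pivots it back, builds a two-level index, and sorts by multiple columns. The quarterly sales table is invented.

```python
import pandas as pd

# Hypothetical wide table: one column per quarter.
wide = pd.DataFrame({"id": [1, 2], "q1": [10, 30], "q2": [20, 40]})

# melt(): wide -> long, one row per (id, quarter) pair.
long = wide.melt(id_vars="id", var_name="quarter", value_name="sales")

# pivot(): long -> wide again.
back = long.pivot(index="id", columns="quarter", values="sales")

# A MultiIndex built from two columns; a tuple addresses one entry.
indexed = long.set_index(["id", "quarter"]).sort_index()
one_cell = indexed.loc[(2, "q1"), "sales"]

# Sort by several columns with a different direction for each.
ordered = long.sort_values(["quarter", "sales"], ascending=[True, False])
```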
🔸 Performance & Efficiency
How can you improve performance when dealing with large Pandas DataFrames? (Expected: chunking, dtypes optimization, vectorization)
What are the implications of using inplace=True? Should it be avoided? Why or why not?
How do you profile a DataFrame to get basic statistics (like null counts, datatypes, etc.)?
What are alternatives to Pandas when the dataset is too large to fit in memory?
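A minimal sketch of dtype optimization, vectorization, and quick profiling, assuming a made-up frame with a repetitive string column and a small-range integer column. Actual savings depend on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: repetitive strings and integers that fit in 16 bits.
df = pd.DataFrame({
    "city": ["NY", "LA"] * 5000,
    "n": np.arange(10000, dtype="int64"),
})
before = df.memory_usage(deep=True).sum()

# Dtype optimization: category for repeated strings, downcast for integers.
slim = df.assign(
    city=df["city"].astype("category"),
    n=pd.to_numeric(df["n"], downcast="integer"),
)
after = slim.memory_usage(deep=True).sum()

# Vectorized arithmetic on the whole column instead of a Python-level loop.
doubled = slim["n"] * 2

# Basic profiling: null counts and dtypes per column.
null_counts = df.isna().sum()
column_types = slim.dtypes
```

Calling df.info() or df.describe() prints a similar overview interactively.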
Here are 20 more Pandas interview questions focused on time series, file I/O, indexing, debugging, and real-world use cases:
🔹 Time Series & Date Handling
How do you convert a column to datetime in Pandas? What common issues arise?
How do you set a datetime column as the index of a DataFrame? Why might you do that?
What is the purpose of resample() in Pandas? Give an example.
How do you find differences between timestamps or calculate duration in Pandas?
What is the difference between resample() and rolling() for time-based data?
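The time series questions can be sketched on a tiny example. The timestamps and values are hypothetical, and "not a date" stands in for a malformed entry.

```python
import pandas as pd

# Hypothetical raw data with one malformed timestamp.
df = pd.DataFrame({
    "ts": ["2024-01-01", "2024-01-02", "2024-01-08", "not a date"],
    "value": [1, 2, 3, 4],
})

# errors="coerce" turns unparseable strings into NaT instead of raising.
df["ts"] = pd.to_datetime(df["ts"], errors="coerce")

# Drop the bad row and use the timestamps as the index.
df = df.dropna(subset=["ts"]).set_index("ts")

# resample(): regroup onto a fixed daily grid; days without data sum to 0.
daily = df["value"].resample("D").sum()

# rolling(): a trailing 2-day window evaluated at each existing row.
trailing = df["value"].rolling("2D").sum()

# Durations between consecutive timestamps.
gaps = df.index.to_series().diff()

# Label-based slicing on a datetime index (both ends inclusive).
first_two_days = df.loc["2024-01-01":"2024-01-02"]
```

Here resample() changes the number of rows (8 daily bins), while rolling() keeps one result per original row.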
🔹 File Handling and I/O
How do you read a CSV file using Pandas with custom delimiters and column types?
What options are available when writing a DataFrame to a CSV or Excel file?
How can you read only a specific number of rows or skip rows from a file?
How do you handle large CSVs that can’t fit in memory using Pandas?
How can you read data from a SQL database into a Pandas DataFrame?
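A hedged sketch of the I/O options asked about above. It reads from an in-memory string and an in-memory SQLite database so that it runs without external files. The "payments" table and its columns are assumptions made for the example.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical semicolon-delimited data held in memory instead of a file.
raw = "id;amount\n1;10.5\n2;20.0\n3;30.5\n4;40.0\n"

# Custom delimiter and explicit column types.
df = pd.read_csv(io.StringIO(raw), sep=";", dtype={"id": "int32", "amount": "float64"})

# Read only the first two data rows.
head = pd.read_csv(io.StringIO(raw), sep=";", nrows=2)

# Skip file lines 1 and 2 (0-based; line 0 is the header, which is kept).
tail = pd.read_csv(io.StringIO(raw), sep=";", skiprows=[1, 2])

# Chunked reading: process a large file piece by piece.
total = 0.0
for chunk in pd.read_csv(io.StringIO(raw), sep=";", chunksize=2):
    total += chunk["amount"].sum()

# Writing: omit the index and keep the same delimiter.
out = df.to_csv(index=False, sep=";")

# Reading from SQL, using an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, 10.5), (2, 20.0), (3, 30.5), (4, 40.0)],
)
from_sql = pd.read_sql_query("SELECT id, amount FROM payments WHERE amount > 20", conn)
conn.close()
```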
🔹 Indexing & Selection
What is the difference between .ix, .loc, and .iloc? Which ones are deprecated?
How do you change the index of a DataFrame? How do you reset it?
How do you select rows based on index values using loc[]?
How can you perform slicing on a DataFrame with a datetime index?
What happens if your DataFrame has duplicate index values? How do you detect and handle that?
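One way to detect and handle duplicate index values, shown on a hypothetical frame where the label "a" appears twice.

```python
import pandas as pd

# Hypothetical frame; set_index() makes "k" the index, with "a" repeated.
df = pd.DataFrame({"k": ["a", "b", "a"], "v": [1, 2, 3]}).set_index("k")

has_dupes = not df.index.is_unique             # quick uniqueness check
dupe_mask = df.index.duplicated(keep="first")  # True for repeats after the first

rows_a = df.loc["a"]         # with a repeated label, .loc returns every matching row
first_only = df[~dupe_mask]  # keep only the first row for each label
flat = df.reset_index()      # move the index back into an ordinary column
```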
🔹 Debugging & Best Practices
What are common performance bottlenecks in Pandas scripts?
How do you debug “SettingWithCopyWarning” in Pandas?
What does value_counts() do? How can it be used for categorical analysis?
How do you check for outliers in a numeric column using Pandas?
How do you create a pivot table and add subtotals/grand totals using Pandas?
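A short sketch covering value_counts(), a simple outlier check, and a pivot table with totals. The 1.5 × IQR rule is one common convention among several, and the sales data is invented.

```python
import pandas as pd

# Hypothetical sales records; 100 is an obvious outlier.
df = pd.DataFrame({
    "region": ["N", "N", "S", "S", "S"],
    "product": ["x", "y", "x", "y", "x"],
    "sales": [10, 12, 11, 13, 100],
})

# Frequency of each category, most common first; normalize=True gives shares.
counts = df["region"].value_counts()
shares = df["region"].value_counts(normalize=True)

# Outliers by the 1.5 * IQR rule.
q1 = df["sales"].quantile(0.25)
q3 = df["sales"].quantile(0.75)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["sales"] < low) | (df["sales"] > high)]

# Pivot table with row and column totals via margins=True.
pivot = df.pivot_table(
    index="region",
    columns="product",
    values="sales",
    aggfunc="sum",
    margins=True,
    margins_name="Total",
)
```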