Showing posts with label remove duplicate rows in sql select statement. Show all posts
Showing posts with label remove duplicate rows in sql select statement. Show all posts

Thursday, February 22, 2024

SQL - Data Cleaning Techniques | How to remove duplicates records

In this tutorial, you will learn "How to remove duplicates records " in SQL as a part of Data Cleaning techniques.
Data cleaning is an essential part of the data preprocessing pipeline to ensure that the data used for analysis or modeling is accurate, consistent, and reliable.

Removing duplicates: Duplicates in a dataset can skew analysis results. You can remove duplicates using the DISTINCT keyword or by using the GROUP BY clause.

Wednesday, February 14, 2024

SQL - Questions for some job interviews

As you are aware, the majority of competent interviewers will be more interested in your approach and problem-solving skills than in meticulously examining your syntax. 

Instead of obsessing over understanding every scenario that may occur, concentrate more on the resources in your toolkit. 

In most recent interview, we examined a schema and answered a few queries about joins, filtering, aggregations (including running totals), CTEs, and rank/windows.



The most common things, you should be needed in the SQL interview:
  • Simple things like Select Distinct, Count, Count Distinct, Min/Max, Sum.
  • Windows functions like Lag, Over, Partition by
  • Joins, using the filter clause in join to filter to one result instead of using where.
  • Group by, Order By

1. Can you explain the difference between where and having?

Answer - When it comes to the sequence in which executable operations are performed, "WHERE" is used to filter out things from the table, while "HAVING" is used in conjunction with aggregate functions like "SUM" or "AVG" to filter out certain aggregated items.

Example - Fruits Data Table

OrderDate

Fruit

Price

11 Feb. 2024

Apple

$10.00

12 Feb. 2024

Banana

$5.00

13 Feb. 2024

Banana

$7.50

14 Feb. 2024

Mango

$12.00

 SQL Query -

"SELECT Fruit, Sum(Price) as Total 
FROM Fruits
WHERE Fruit = 'Banana' 
GROUP BY Fruit 
HAVING  Sum(Price) > 10.00;"