SQL Interview Questions
Question: What is the difference between DELETE and TRUNCATE?
Answer:
Both DELETE and TRUNCATE are used to remove data from a table in SQL, but they differ in how they operate and the effect they have on the table structure, performance, and transaction logging.
1. DELETE:
- Definition: The `DELETE` statement is used to delete specific rows from a table based on a condition.
- Syntax: `DELETE FROM table_name WHERE condition;`
- Scope: You can specify a `WHERE` clause to delete specific rows. If no `WHERE` clause is provided, all rows will be deleted.
- Transaction Logging: The `DELETE` operation is fully logged in the transaction log, meaning each deleted row is logged individually.
- Performance: Since each row deletion is logged, `DELETE` can be slower than `TRUNCATE`, especially for large tables.
- Triggers: `DELETE` can activate triggers (e.g., `AFTER DELETE` triggers).
- Rollback: `DELETE` runs within a transaction, so it can be rolled back before the transaction commits.
- Table Structure: It does not affect the structure of the table. The table structure, indexes, and constraints remain unchanged.
- Example: `DELETE FROM employees WHERE department = 'Sales';`
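To make the rollback behavior concrete, here is a minimal sketch; the transaction syntax shown is SQL Server style (other databases use `BEGIN;` or `START TRANSACTION;`), and the `employees` table is the one assumed in the example above:

```sql
BEGIN TRANSACTION;

-- Remove every employee in the Sales department
DELETE FROM employees
WHERE department = 'Sales';

-- The deletion has not been committed yet, so it can still be undone
ROLLBACK;
```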
2. TRUNCATE:
- Definition: The `TRUNCATE` statement is used to remove all rows from a table, but unlike `DELETE`, it does not log individual row deletions.
- Syntax: `TRUNCATE TABLE table_name;`
- Scope: `TRUNCATE` removes all rows in the table and cannot be used with a `WHERE` clause. It is a bulk operation, and all rows are deleted unconditionally.
- Transaction Logging: `TRUNCATE` is a minimally logged operation. It logs the deallocation of the data pages used by the table rather than logging each row deletion.
- Performance: `TRUNCATE` is typically faster than `DELETE` because it does not log each row individually and does not fire triggers. The difference is especially noticeable on large tables.
- Triggers: `TRUNCATE` does not activate `DELETE` triggers because it does not operate on individual rows.
- Rollback: This is database-dependent. In SQL Server and PostgreSQL, `TRUNCATE` is transactional and can be rolled back; in MySQL and Oracle, it commits implicitly and cannot be rolled back.
- Table Structure: It does not affect the structure of the table. The table, its structure, and constraints remain intact. However, depending on the database, `TRUNCATE` may reset the identity (auto-increment) column, if present.
- Example: `TRUNCATE TABLE employees;`
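The identity-reset behavior can be seen in a short sketch; this uses SQL Server syntax and a hypothetical `employees` table (MySQL's equivalent is an `AUTO_INCREMENT` column, and PostgreSQL makes the reset explicit with `TRUNCATE ... RESTART IDENTITY`):

```sql
-- Hypothetical table with an identity column
CREATE TABLE employees (
    id   INT IDENTITY(1, 1) PRIMARY KEY,
    name VARCHAR(100)
);

INSERT INTO employees (name) VALUES ('Alice'), ('Bob');  -- ids 1 and 2

TRUNCATE TABLE employees;  -- removes all rows and resets the identity seed

INSERT INTO employees (name) VALUES ('Carol');  -- numbering restarts at 1
```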
Key Differences:

| Feature | DELETE | TRUNCATE |
|---|---|---|
| Rows Affected | Can delete specific rows based on a condition. | Deletes all rows from the table. |
| WHERE Clause | Can use a WHERE clause to filter rows. | Cannot use a WHERE clause. |
| Transaction Logging | Fully logs each row deletion. | Minimally logs deallocation of data pages. |
| Performance | Slower, especially for large tables. | Faster for large tables, due to minimal logging. |
| Triggers | Fires triggers (e.g., AFTER DELETE triggers). | Does not fire triggers. |
| Rollback | Can be rolled back within a transaction. | Database-dependent: transactional in SQL Server and PostgreSQL, implicitly committed in MySQL and Oracle. |
| Table Structure | Does not affect the structure (indexes, constraints). | Does not affect the structure but may reset the identity column (depending on the DB). |
| Locking | May lock rows or the entire table. | Locks the entire table. |
Summary:
- DELETE is used when you want to selectively delete rows based on conditions and requires more transaction log space due to individual row deletions.
- TRUNCATE is a more efficient, bulk operation used to remove all rows from a table quickly, but it is limited to clearing the entire table and does not support conditions or row-based operations.
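One related difference worth knowing, sketched below with a hypothetical `orders`/`order_items` schema: most systems (including SQL Server, MySQL/InnoDB, and PostgreSQL) reject `TRUNCATE` on a table that is referenced by a foreign key, even when no referencing rows exist, whereas `DELETE` enforces the constraint row by row:

```sql
-- Hypothetical parent/child schema
CREATE TABLE orders (
    id INT PRIMARY KEY
);

CREATE TABLE order_items (
    id       INT PRIMARY KEY,
    order_id INT,
    FOREIGN KEY (order_id) REFERENCES orders (id)
);

DELETE FROM orders;     -- allowed; fails only for rows still referenced by order_items
TRUNCATE TABLE orders;  -- typically rejected outright while the foreign key exists
```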
Question: What is normalization and denormalization in SQL?
Answer:
1. Normalization:
Normalization is the process of organizing data in a relational database to reduce redundancy and undesirable dependencies by dividing large tables into smaller, related ones. The goal of normalization is an efficient database structure in which each piece of data is stored only once. The process involves applying a set of rules called normal forms to a database schema. Each normal form builds on the previous one; the further you normalize, the more tables (and joins) the structure involves, but the better it protects storage efficiency and data integrity.
Key Objectives of Normalization:
- Eliminate redundancy: Avoid duplicate data in the database.
- Ensure data integrity: Enforce logical relationships and dependencies between different pieces of data.
- Reduce anomalies: Prevent anomalies (insertion, update, and deletion anomalies) that could occur when data is stored in a less structured format.
Normal Forms:
- 1st Normal Form (1NF): Ensures that each column contains atomic (indivisible) values, and each record is unique.
- 2nd Normal Form (2NF): Meets 1NF and also eliminates partial dependency, ensuring that every non-key column depends on the whole primary key.
- 3rd Normal Form (3NF): Meets 2NF and removes transitive dependency, i.e., non-key columns depend only on the primary key.
- Boyce-Codd Normal Form (BCNF): A stricter version of 3NF, requiring that every determinant (the left-hand side of every functional dependency) be a candidate key.
Benefits of Normalization:
- Reduced redundancy (avoiding storing the same data in multiple places).
- Improved data integrity (less chance for inconsistent data).
- Better storage utilization (smaller tables and more efficient data storage).
- Avoidance of update, insert, and delete anomalies.
Example of Normalization:
Suppose you have an `Orders` table with redundant customer data.

| OrderID | CustomerName | CustomerAddress | ProductName | Quantity |
|---|---|---|---|---|
| 1 | John Doe | 123 Elm St | Widget A | 10 |
| 2 | Jane Smith | 456 Oak St | Widget B | 5 |
| 3 | John Doe | 123 Elm St | Widget C | 7 |
After normalization, you might split it into two tables:
Customers Table:

| CustomerID | CustomerName | CustomerAddress |
|---|---|---|
| 1 | John Doe | 123 Elm St |
| 2 | Jane Smith | 456 Oak St |
Orders Table:

| OrderID | CustomerID | ProductName | Quantity |
|---|---|---|---|
| 1 | 1 | Widget A | 10 |
| 2 | 2 | Widget B | 5 |
| 3 | 1 | Widget C | 7 |
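In SQL, this normalized design might be declared as follows; this is a minimal sketch in which the table and column names follow the examples above and the column types are assumptions:

```sql
-- Each customer is stored exactly once
CREATE TABLE Customers (
    CustomerID      INT PRIMARY KEY,
    CustomerName    VARCHAR(100) NOT NULL,
    CustomerAddress VARCHAR(200) NOT NULL
);

-- Orders reference customers by key instead of repeating their details
CREATE TABLE Orders (
    OrderID     INT PRIMARY KEY,
    CustomerID  INT NOT NULL,
    ProductName VARCHAR(100) NOT NULL,
    Quantity    INT NOT NULL,
    FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
);
```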
2. Denormalization:
Denormalization is the process of deliberately introducing redundancy into a database by combining normalized tables or copying data across multiple places. This is done to improve read performance and simplify complex queries, especially in reporting or analytical applications where speed is more important than minimizing redundancy.
While normalization reduces redundancy and improves data integrity, denormalization can reduce the need for complex joins in queries, thereby improving query performance in certain cases. However, denormalization also comes with trade-offs, such as the potential for data anomalies and more complex data updates.
Reasons for Denormalization:
- Performance improvement: Denormalization can speed up read-heavy operations by reducing the need for multiple joins, especially when large amounts of data are being accessed frequently.
- Simplified queries: It can simplify complex queries or aggregations by including all required data in one place.
- Optimized for reporting/analytics: Denormalized data is often used in data warehouses where read performance is prioritized over data integrity.
Trade-offs of Denormalization:
- Redundancy: The same data might be stored in multiple places, which can lead to data inconsistencies.
- Update anomalies: If data is duplicated, updates might need to be made in multiple places, leading to potential inconsistencies (a sketch follows this list).
- Higher storage costs: Since data is repeated, more storage space is needed.
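For instance, against the denormalized `Orders` table shown in the example that follows, correcting one customer's address must touch every one of their order rows (a hypothetical sketch):

```sql
-- The address is duplicated across John Doe's order rows, so a correction
-- must update every copy; missing one leaves the data inconsistent
UPDATE Orders
SET CustomerAddress = '789 Pine St'
WHERE CustomerName = 'John Doe';
```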
Example of Denormalization:
For the previous example, denormalizing the `Orders` and `Customers` tables into a single `Orders` table:

| OrderID | CustomerName | CustomerAddress | ProductName | Quantity |
|---|---|---|---|---|
| 1 | John Doe | 123 Elm St | Widget A | 10 |
| 2 | Jane Smith | 456 Oak St | Widget B | 5 |
| 3 | John Doe | 123 Elm St | Widget C | 7 |
Here, you’ve combined customer data with order data, reducing the need for joins but introducing redundancy.
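To see the query-side effect, compare the join required by the normalized schema with the single-table read the denormalized design allows (table names as in the examples above):

```sql
-- Normalized schema: a join is needed to attach customer details to each order
SELECT o.OrderID, c.CustomerName, c.CustomerAddress, o.ProductName, o.Quantity
FROM Orders AS o
JOIN Customers AS c ON c.CustomerID = o.CustomerID;

-- Denormalized schema: the same result comes from a single table
SELECT OrderID, CustomerName, CustomerAddress, ProductName, Quantity
FROM Orders;
```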
Key Differences Between Normalization and Denormalization:

| Aspect | Normalization | Denormalization |
|---|---|---|
| Purpose | Reduces redundancy and ensures data integrity. | Improves query performance and simplifies data retrieval. |
| Impact on Data | Avoids data redundancy. | Introduces data redundancy. |
| Complexity | Increases complexity of queries (joins required). | Reduces complexity of queries (fewer joins). |
| Data Integrity | Ensures high data integrity. | May lead to data inconsistencies and anomalies. |
| Performance | Slower for read operations (joins required). | Faster for read-heavy queries. |
| Update Anomalies | Avoids update anomalies. | May lead to update, insert, and delete anomalies. |
| Storage | More efficient in terms of storage (less duplication). | May use more storage due to redundancy. |
| Use Case | OLTP (Online Transaction Processing) systems, where data consistency is key. | OLAP (Online Analytical Processing) systems, reporting, and data warehouses. |
Summary:
- Normalization is about organizing data to minimize redundancy and avoid anomalies, while Denormalization involves intentionally adding redundancy to optimize read performance.
- Both have their place depending on the use case. Normalization is typically used in transactional systems, while denormalization is more common in systems focused on query performance, such as data warehouses.
Tags
- SQL
- SQL interview questions
- SQL joins
- SQL queries
- INNER JOIN
- LEFT JOIN
- RIGHT JOIN
- FULL JOIN
- Subqueries
- SQL normalization
- SQL denormalization
- Primary key
- Foreign key
- GROUP BY
- HAVING vs WHERE
- SQL views
- SQL indexes
- DENSE RANK
- ROW NUMBER
- SQL DISTINCT
- SQL SELECT INTO
- CHAR vs VARCHAR
- NULL values in SQL
- SQL performance optimization
- SQL aggregate functions
- SQL database design
- SQL DELETE vs TRUNCATE
- SQL query optimization