SQL Interview Questions

Hirely
08 Jan, 2025

Question: What is the difference between DELETE and TRUNCATE?

Answer:

Both DELETE and TRUNCATE are used to remove data from a table in SQL, but they differ in how they operate and the effect they have on the table structure, performance, and transaction logging.

1. DELETE:

  • Definition: The DELETE statement is used to delete specific rows from a table based on a condition.
  • Syntax:
    DELETE FROM table_name WHERE condition;
  • Scope: You can specify a WHERE clause to delete specific rows. If no WHERE clause is provided, all rows will be deleted.
  • Transaction Logging: The DELETE operation is fully logged in the transaction log, meaning each row deleted is logged individually.
  • Performance: Since each row deletion is logged, DELETE can be slower compared to TRUNCATE, especially for large tables.
  • Triggers: DELETE can activate triggers (e.g., AFTER DELETE triggers).
  • Rollback: DELETE runs inside a transaction, so it can be rolled back before the transaction is committed (see the rollback sketch after this list).
  • Table Structure: It does not affect the structure of the table. The table structure, indexes, and constraints remain unchanged.
  • Example:
    DELETE FROM employees WHERE department = 'Sales';
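
Because DELETE is transactional, you can verify the rollback behavior by wrapping it in an explicit transaction. A minimal sketch, assuming the employees table from the example above and SQL Server-style transaction keywords (PostgreSQL and MySQL use BEGIN / START TRANSACTION with ROLLBACK):

    -- Open an explicit transaction so the DELETE can be undone
    BEGIN TRANSACTION;

    DELETE FROM employees WHERE department = 'Sales';

    -- Inside the transaction, the Sales rows are gone
    SELECT COUNT(*) FROM employees WHERE department = 'Sales';

    -- Undo the deletion; the Sales rows come back
    ROLLBACK;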

2. TRUNCATE:

  • Definition: The TRUNCATE statement is used to remove all rows from a table, but unlike DELETE, it does not log individual row deletions.
  • Syntax:
    TRUNCATE TABLE table_name;
  • Scope: TRUNCATE removes all rows in the table and cannot be used with a WHERE clause. It is a bulk operation, and all rows are deleted without any condition.
  • Transaction Logging: TRUNCATE is a minimal logging operation. It logs the deallocation of the data pages used by the table rather than logging each row deletion.
  • Performance: TRUNCATE is typically faster than DELETE because it does not log each row individually and does not fire row-level triggers; the difference is most noticeable on large tables.
  • Triggers: TRUNCATE does not activate DELETE triggers because it does not operate on individual rows.
  • Rollback: Whether TRUNCATE can be rolled back depends on the database. In SQL Server and PostgreSQL it can be rolled back when issued inside an explicit transaction; in MySQL and Oracle it is treated as DDL and commits implicitly, so it cannot be rolled back.
  • Table Structure: It does not affect the structure of the table. The table, its structure, and constraints remain intact. However, depending on the database, TRUNCATE may reset the identity column if one is present (see the sketch after this list).
  • Example:
    TRUNCATE TABLE employees;
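
To see the identity-reset behavior mentioned in the list above, here is a minimal sketch using a hypothetical demo table and SQL Server's IDENTITY syntax (other databases differ; PostgreSQL, for example, only restarts sequences when TRUNCATE ... RESTART IDENTITY is used):

    -- Hypothetical table with an auto-incrementing identity column (SQL Server syntax)
    CREATE TABLE demo_employees (
        id   INT IDENTITY(1,1) PRIMARY KEY,
        name VARCHAR(100)
    );

    INSERT INTO demo_employees (name) VALUES ('Alice'), ('Bob');  -- ids 1 and 2

    TRUNCATE TABLE demo_employees;   -- removes all rows and resets the identity seed

    INSERT INTO demo_employees (name) VALUES ('Carol');           -- id starts again at 1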

Key Differences:

| Feature | DELETE | TRUNCATE |
| --- | --- | --- |
| Rows Affected | Can delete specific rows based on a condition. | Deletes all rows from the table. |
| WHERE Clause | Can use a WHERE clause to filter rows. | Cannot use a WHERE clause. |
| Transaction Logging | Fully logs each row deletion. | Minimally logs deallocation of data pages. |
| Performance | Slower, especially for large tables. | Faster for large tables, due to minimal logging. |
| Triggers | Fires DELETE triggers (e.g., AFTER DELETE). | Does not fire DELETE triggers. |
| Rollback | Can be rolled back within a transaction. | Rollback support depends on the database (possible in SQL Server and PostgreSQL, not in MySQL or Oracle). |
| Table Structure | Does not affect the structure (indexes, constraints). | Does not affect the structure, but may reset the identity column (depending on the DB). |
| Locking | May lock rows or the entire table. | Locks the entire table. |

Summary:

  • DELETE is used when you want to selectively delete rows based on conditions and requires more transaction log space due to individual row deletions.
  • TRUNCATE is a more efficient, bulk operation used to remove all rows from a table quickly, but it is limited to clearing the entire table and does not support conditions or row-based operations.

Question: What is normalization and denormalization in SQL?

Answer:

1. Normalization:

Normalization is the process of organizing data in a relational database to reduce redundancy and dependency by dividing large tables into smaller ones. The goal of normalization is to ensure that the database structure is efficient, with each piece of data being stored only once. This process involves applying a set of rules called normal forms to a database schema. Each “normal form” builds on the previous one, and the further you normalize, the more complex the structure becomes but also the more efficient it is in terms of storage and data integrity.

Key Objectives of Normalization:

  • Eliminate redundancy: Avoid duplicate data in the database.
  • Ensure data integrity: Enforce logical relationships and dependencies between different pieces of data.
  • Reduce anomalies: Prevent anomalies (insertion, update, and deletion anomalies) that could occur when data is stored in a less structured format.

Normal Forms:

  • 1st Normal Form (1NF): Ensures that each column contains atomic (indivisible) values, and each record is unique.
  • 2nd Normal Form (2NF): Meets 1NF and also eliminates partial dependency, ensuring that every non-key column depends on the whole primary key (see the sketch after this list).
  • 3rd Normal Form (3NF): Meets 2NF and removes transitive dependency, i.e., non-key columns depend only on the primary key.
  • Boyce-Codd Normal Form (BCNF): A stricter version of 3NF that requires the determinant (left-hand side) of every non-trivial functional dependency to be a candidate key.
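
As a small illustration of the 2NF rule above, consider a hypothetical OrderItems table keyed by (OrderID, ProductID): a column such as ProductName depends only on ProductID, which is a partial dependency, so a 2NF decomposition moves it into its own table. The table and column names here are illustrative only:

    -- Violates 2NF: ProductName depends only on ProductID, not on the whole key (OrderID, ProductID)
    CREATE TABLE OrderItems (
        OrderID     INT,
        ProductID   INT,
        ProductName VARCHAR(100),
        Quantity    INT,
        PRIMARY KEY (OrderID, ProductID)
    );

    -- 2NF decomposition: product attributes move to a Products table
    CREATE TABLE Products (
        ProductID   INT PRIMARY KEY,
        ProductName VARCHAR(100)
    );

    CREATE TABLE OrderItemsNormalized (
        OrderID   INT,
        ProductID INT REFERENCES Products(ProductID),
        Quantity  INT,
        PRIMARY KEY (OrderID, ProductID)
    );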

Benefits of Normalization:

  • Reduced redundancy (avoiding storing the same data in multiple places).
  • Improved data integrity (less chance for inconsistent data).
  • Better storage utilization (smaller tables and more efficient data storage).
  • Avoidance of update, insert, and delete anomalies.

Example of Normalization: Suppose you have a table of Orders with redundant customer data.

| OrderID | CustomerName | CustomerAddress | ProductName | Quantity |
| --- | --- | --- | --- | --- |
| 1 | John Doe | 123 Elm St | Widget A | 10 |
| 2 | Jane Smith | 456 Oak St | Widget B | 5 |
| 3 | John Doe | 123 Elm St | Widget C | 7 |

After normalization, you might split it into two tables (a SQL sketch of the resulting schema follows the tables below):

Customers Table:

| CustomerID | CustomerName | CustomerAddress |
| --- | --- | --- |
| 1 | John Doe | 123 Elm St |
| 2 | Jane Smith | 456 Oak St |

Orders Table:

| OrderID | CustomerID | ProductName | Quantity |
| --- | --- | --- | --- |
| 1 | 1 | Widget A | 10 |
| 2 | 2 | Widget B | 5 |
| 3 | 1 | Widget C | 7 |
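
One way to express this normalized design in SQL, using the table and column names from the example above (data types are illustrative assumptions):

    -- Customers table: each customer is stored exactly once
    CREATE TABLE Customers (
        CustomerID      INT PRIMARY KEY,
        CustomerName    VARCHAR(100),
        CustomerAddress VARCHAR(200)
    );

    -- Orders table: references the customer by key instead of repeating the details
    CREATE TABLE Orders (
        OrderID     INT PRIMARY KEY,
        CustomerID  INT REFERENCES Customers(CustomerID),
        ProductName VARCHAR(100),
        Quantity    INT
    );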

2. Denormalization:

Denormalization is the process of deliberately introducing redundancy into a database by combining normalized tables or copying data across multiple places. This is done to improve read performance and simplify complex queries, especially in reporting or analytical applications where speed is more important than minimizing redundancy.

While normalization reduces redundancy and improves data integrity, denormalization can reduce the need for complex joins in queries, thereby improving query performance in certain cases. However, denormalization also comes with trade-offs, such as the potential for data anomalies and more complex data updates.

Reasons for Denormalization:

  • Performance improvement: Denormalization can speed up read-heavy operations by reducing the need for multiple joins, especially when large amounts of data are being accessed frequently.
  • Simplified queries: It can simplify complex queries or aggregations by including all required data in one place.
  • Optimized for reporting/analytics: Denormalized data is often used in data warehouses where read performance is prioritized over data integrity.

Trade-offs of Denormalization:

  • Redundancy: The same data might be stored in multiple places, which can lead to data inconsistencies.
  • Update anomalies: If data is duplicated, updates might need to be made in multiple places, leading to potential inconsistencies.
  • Higher storage costs: Since data is repeated, more storage space is needed.

Example of Denormalization: For the previous example, denormalizing the Orders and Customers tables into a single Orders table:

| OrderID | CustomerName | CustomerAddress | ProductName | Quantity |
| --- | --- | --- | --- | --- |
| 1 | John Doe | 123 Elm St | Widget A | 10 |
| 2 | Jane Smith | 456 Oak St | Widget B | 5 |
| 3 | John Doe | 123 Elm St | Widget C | 7 |

Here, you’ve combined customer data with order data, reducing the need for joins but introducing redundancy.
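
One common way to build such a denormalized table is to join the normalized tables once and materialize the result for reporting. A minimal sketch, assuming the Customers and Orders tables from the normalized example (CREATE TABLE ... AS SELECT syntax varies by database; SQL Server uses SELECT ... INTO instead):

    -- Build a denormalized reporting table by joining the normalized tables once
    CREATE TABLE OrdersDenormalized AS
    SELECT o.OrderID,
           c.CustomerName,
           c.CustomerAddress,
           o.ProductName,
           o.Quantity
    FROM Orders o
    JOIN Customers c ON c.CustomerID = o.CustomerID;

Queries against OrdersDenormalized now avoid the join, at the cost of repeating the customer columns for every order.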


Key Differences Between Normalization and Denormalization:

| Aspect | Normalization | Denormalization |
| --- | --- | --- |
| Purpose | Reduces redundancy and ensures data integrity. | Improves query performance and simplifies data retrieval. |
| Impact on Data | Avoids data redundancy. | Introduces data redundancy. |
| Complexity | Increases complexity of queries (joins required). | Reduces complexity of queries (fewer joins). |
| Data Integrity | Ensures high data integrity. | May lead to data inconsistencies and anomalies. |
| Performance | Slower for read operations (joins required). | Faster for read-heavy queries. |
| Update Anomalies | Avoids update anomalies. | May lead to update, insert, and delete anomalies. |
| Storage | More efficient in terms of storage (less duplication). | May use more storage due to redundancy. |
| Use Case | OLTP (Online Transaction Processing) systems where data consistency is key. | OLAP (Online Analytical Processing) systems, reporting, and data warehouses. |

Summary:

  • Normalization is about organizing data to minimize redundancy and avoid anomalies, while Denormalization involves intentionally adding redundancy to optimize read performance.
  • Both have their place depending on the use case. Normalization is typically used in transactional systems, while denormalization is more common in systems focused on query performance, such as data warehouses.
