38 MUST know SQL Interview Questions for Any Aspiring Data Professional!

Do you often struggle with the finer points of SQL? Do you wonder why SQL is such an important tool in the data professional’s toolkit? Well, you’re just in the right place. In this article, we will try and understand what makes SQL such a powerful tool for data professionals, and also brush up on all related concepts so you’re prepared for any interview question that might come your way.

Why SQL?

Over the years, data processing has seen several changes, and many technological advancements. However, there’s been one constant in all these changes, and that is none other than SQL, affectionately called sequel.

Let’s look at some key reasons why data professionals need to know SQL:

  • When you talk structured data, you talk SQL!
  • Even NoSQL databases use query languages with a syntax similar to SQL. So, knowledge of SQL will definitely help you in the long run.
  • If you’re working on Hadoop, Apache Hive provides you with HiveQL. It’s a SQL-like querying language that is used to manage and manipulate data
  • If you’re performing some analytics tasks on data stored on an SQL server, no language comes close to SQL

Now let’s get to the real business. We’ve compiled a list of questions we thought were important keeping your interview in mind.

Read on!

Table of Contents hide

1. Have you ever worked on SQL?       

This question is often used as an ice-breaker, to help get the conversation moving. It is a critical question, the answer to which will set the tone for the rest of your interview. If you’ve worked with SQL before, it will definitely give you an edge over others. You should explain where and how you used SQL in your earlier projects, your role, and the challenges you faced.

If you’ve not used SQL before, gracefully admit it. Maybe your role was different, or you worked on non-SQL databases, give your reasoning without giving an excuse.

2. Is SQL a programming language? How is it different from other programming languages?

With this question, the interviewer would want to test your general awareness of the evolution of programming languages.

To answer the question, yes, SQL is a language, with its own syntax and constructs. However, it’s not a language in the same sense as say C, C++, or Java. SQL is considered a fourth generation language (4GL), whereas the others mentioned are third generation languages. 4GL are generally languages that are closer to human language (unlike Java, or other 3GLs) and are generally used to access databases to manipulate or update information.

3. What do you mean by a relational database, and what is a relation?

Before jumping on to the definitions, it’s better to start by giving some examples where you’ve seen a database, or worked on it.

You can then move on to answering the question. A relational database is a collection of data items organised in tables. From these tables, we can access the data using a querying language, for example, SQL.

A relation in a database should not be confused by the mathematical term “relations”. Instead, in databases, a relation refers merely to a table because it stores the relation between data in its row-column format.  

The columns are attributes, while rows represent data records.

4. What is a tuple?

A single row in a relation (table) is called a tuple by database designers. Thus, a table can is essentially a combination of various tuples.

5. What do you mean by ACID properties?

ACID is an acronym that stands for atomicity, consistency, integrity, and durability. Here too, before explaining each property in detail, a better option would be to explain their importance.

Why are they important? Well, the ACID properties of a DBMS are needed for safe data sharing. Without these properties, everyday occurrences, such as online purchasing would become really tricky.

Imagine a situation where two people are buying the same product at the same time. Without these properties, it would be almost impossible for the merchant to keep these transactions from overlapping with one another.

6. What are the subsets or components of SQL?

SQL has three major components or subsets, depending on the actions they perform.

  1. DDL: It stands for Data Definition Language. It is used to create or defines schemas in a database. Commands like CREATE, ALTER, or DROP are a part of DDL.
  2. DML: It stands for Data Manipulation Language. As the name explains, it is used to manipulate data. It uses commands like SELECT, INSERT, UPDATE, MERGE, etc. Mostly, when we have to work with the data present in the database and not with its structure, we use DML.
  3. DCL: DCL or Data Control Language is primarily used to control access to the data stored in a database. It uses commands like GRANT, and REVOKE.

Other than this, it has one more component, i.e., TCL, or Transaction Control Language. These commands are used to manage the transactions in a database. Commands like COMMIT, ROLLBACK, and SAVEPOINT come under this subset.

7. Where do the comparison operators find use in SQL?

SQL provides you with following comparison operators:

These comparison operators find use in the WHERE clause, to select matching records as per the user’s requirement. You can go ahead by giving an example of a basic query like:

SELECT *,

FROM students,

WHERE city=”Delhi

This query will select all students from the students table whose city attribute us “Delhi”.

8. What is a view in SQL?

View is a virtual table through which we can observe some selective portion of data from one or more tables. Views are used to hide data complexity by restricting access to the database.

A view is stored as a SELECT statement in the database. The following query will create view:

     CREATE VIEW view_name

     AS

     SELECT column_list

     FROM table_name [WHERE condition];

9. What is an SQL server? Name a few SQL servers.

So far in our discussion, one conclusion is indisputable. SQL is a language that’s used to play with the data present in the databases. One question arises here: where do we store this data?

This is where the SQL server comes in!

Now, the SQL server is a relational database management system (RDBMS). Focus on the words “relational database”, and “management system”. This essentially means that any data housed on the SQL server will be in the form of a relational database. It also implies that the server is an entire “management system”, and not just a database.

You get a variety of SQL servers from different vendors. Some of them are: PostgreSQL, SQLite, MySQL, and MS SQL.

10. Name the type of keys present in SQL.

Keys play a very important role in databases. Simply put, they are the backbone of the entire database as they help in maintaining relationships between tables, and also assist in identifying unique rows from the table. SQL provides us with the following types of keys:

  • Super Key
  • Candidate Key
  • Primary Key
  • Foreign Key
  • Alternate Key
  • Composite/Compound Key
  • Unique Key

11. Write a query to create a table with primary key, foreign key, and unique key.

This query creates a table named “Student” with ID as the primary key, EnrollNo as a unique key, and DeptID as the foreign key.

For creating a primary key, just write the keyword “PRIMARY KEY” during the attribute declaration. For unique key, use the keyword “UNIQUE”.

But to create a foreign key, we’ll have to link it to another table as well, so we use the syntax:

DeptID int FOREIGN KEY REFERENCES Department(DeptID)

This statement tells the database that the DeptId attribute of Student table will reference the department table.

12. What conditions should be kept in mind while defining a field as primary key?

Choosing a primary key is one of the most important tasks while defining tables in a database. Primary keys serve a special purpose by providing row level accessibility. Apart from this, primary keys are also important while linking multiple tables using foreign keys.

A few points should be kept in mind while declaring a primary key:

  1. The value should be unique across all rows, that is, if you’ve defined “studId” as the primary key for the “students” table, you cannot have two rows with the same value.
  2. No row should be NULL or empty.
  3. Values of primary key columns can never be modified. Hence, it’s important to keep that field as primary key whose value is less prone to modification: like Ids, or phone numbers.

13. What purpose does the foreign key serve?

Foreign keys are used to act as a connection or link between two tables. In essence, the foreign key of one table references the primary key of some other table.

14. What is meant by aggregate functions in SQL?

In database management, aggregate functions are functions that group the values of multiple rows to form a single value of greater significance.

For example, suppose you have a table which contains the wages of all your employees, and you need to find the average salary. For that, we have an aggregate function!

SQL provides us with the following aggregate functions:

  1. MIN – returns the smallest value
  2. SUM – returns the sum of all numeric values in a column
  3. COUNT – returns the count (total number) of values in a column
  4. COUNT(*) – returns the number of rows in a table

Example: SELECT AVG(salary)

FROM employee

For an employee table with salary as a column, this query will return the average of salaries of all the employees present in the table.

15. What is an ‘index’ in SQL?

An index is a tool used to speed up the performance of queries. It is a B-Tree type implementation which reduces the time required in fetching the data from a database.

Without an index, SQL Server would need to scan all the data to find the required data satisfying the user query, but due to the B-Tree structure, this complexity gets reduced.

This is the syntax: CREATE INDEX index_name

     ON table_name (column1, column2, …);

16. What is the difference between clustered and non-clustered index and when should we use which?

In a clustered index, we physically store the data in the order of the index. Therefore these are faster to read as compared to non-clustered index. But a non-clustered index, on the other hand, is quicker for insertion and updation operations.

We can have only one clustered index per table, but non-clustered can be used multiple times.

Although both the indexes improve the querying speed, the clustered index is primarily used on an auto incremental field, like a timestamp. The reason for this is the slow insert/update operations on a clustered index.

17. What are triggers and how do they maintain database integrity?

Triggers are stored procedures that fire automatically when an UPDATE, INSERT, or DELETE statement is issued against a table or row.

Consider a database in which deletion of data from one table will cause inconsistency across all tables. In such cases, we can have a trigger that’ll delete the same data across the whole database and maintain the database’s integrity!

18. Explain CASE statement with an example.

A CASE statement allows us to embed an if-else like clause in the SELECT statement.

Example: SELECT Employee_Name, CASE Location

                WHEN ‘Boston’ THEN Bonus * 2

                WHEN ‘Austin’ THEN Bonus * ,5

               ELSE Bonus

               END

               FROM Employee;

For those of you familiar with the basics of programming, the query above will look like just an if-else condition.

If you break down the query, you’ll see that it essentially commands to select employee names from the database, and then based on their location, it modifies the value of the bonus.

19. Name the types of triggers.

There are four types of triggers present in SQL.

  • DDL Triggers: These are stored procedures like regular triggers, but these are fired in a response to a variety of DDL events. These can be used for performing administrative tasks like auditing, or regulating database operations.
  • DML Triggers: The triggers that fire when a Data Manipulation Language (DML) event affects the table are knows as DML Triggers.
  • CLR Triggers: CLR stands for Common Language Runtime. These triggers can reference the data in inserted and deleted tables. They can also determine which table has been modified using the UPDATE operation.
  • Logon Triggers: These triggers are fired in response to a log-on event, i.e., these are raised when a new user session is established with an instance of SQL server.

20. What is the difference between DELETE and TRUNCATE in SQL?

DELETE TRUNCATE
It is a DML command It is a DDL command
It is used to delete a particular row It is used to remove all the rows
This can be rolled back Truncate cannot be rolled back

 

21. What do you mean by “Joins”  in a database?

Join clause in SQL combines columns from two or more tables in a database depending on the condition or the type of join used.

There are four types of joins in SQL.

  • Inner Join
  • Left Outer Join
  • Right Outer Join
  • Full Outer Join

22. Differentiate between inner and outer joins with an example.

An Inner join on table A and B will return the set of only those records which are present in both A and B based on the joining condition (also called predicate). If no match is present, then the inner join returns a NULL set.

Example:  SELECT dept.name DEPARTMENT, emp.name EMPLOYEE
        FROM DEPT dept, EMPLOYEE emp
        WHERE emp.dept_id = dept.id

In the above query, we are taking an inner join on dept and emp tables. It’ll return all the rows where emp.dept_id matches with the dept.id, as specified in the predicate.

Outer Joins, on the other hand, will return the matching data from both the tables and also the unmatched data, depending on if the join is full outer or single outer.

23. What is a self-join and why is it required?

As the name suggests, a self-join is the result of taking the join of a table with itself. It is useful in converting a hierarchical structure into a flat structure.

24. What is the difference between join and union in SQL?

Join combines two tables depending on the matching condition and returns the set of matched rows (with, or without extra rows). On the other hand, UNION is used to combine the result of two or more queries into a single result set that includes all the rows. For union operation, the two tables should have columns with the same datatype.

25. What is a subquery in SQL?

A subquery, or a nested query, is a query inside a query. The parent query is called the main query, while the inner is called subquery.

The inner query is always executed first, and then its result is passed on to the parent query for computation.

26. Which operator is used in a query for pattern matching?

The LIKE operator is used for pattern matching, in two ways:

  • % : matches 0 or more characters
  • _ (underscore) : Matches exactly 1 character

27. Write a query to fetch alternate rows from a table.

Consider a table named emp (employee) with rowno (Row Number), and empId (Employee ID) as its two columns.

For odd rows:

SELECT empId from(

SELECT rowno, empId from(

emp)where mod(rowno,2)=1

))

For even rows:

SELECT empId from(

SELECT rowno, empId from(

emp)where mod(rowno,2)=0

))

28. Write an SQL query to fetch duplicates record from a table.

Consider the table given in the image above. As you can see, there are multiple records present in this table. The question is to write a query that’ll display these duplicate rows as the result.

For this, we can make use of both GROUP BY, as well as HAVING clause.

SELECT *
FROM table WHERE id in (SELECT id FROM table GROUP BY id HAVING count(id)>1)

Apart from this approach, we can also use the join approach. Take a look at this query, for instance:

SELECT t1.*

FROM `table` t1

JOIN `table` t2 ON t1.keyword = t2.keyword

WHERE t1.id != t2.id

29. Write an SQL query to create a new table with the same data and structure as an old table.

We can use the SELECT INTO command for this.

SELECT * INTO newTable FROM oldTable;

30. Write a query to get the position of ‘a’ in the name ‘Sam’ present in student table

SELECT CHARINDEX(‘a’, FIRST_NAME, 0) from student where first_name=’Sam’

31. What is a ‘constraint’ in SQL? What are its types?

Constraints are used to maintain data integrity by specifying a limit on the data type. These can be specified while creating or altering a table.

There are multiple types of constraints:

  • NOT NULL
  • CHECK
  • DEFAULT
  • UNIQUE
  • PRIMARY KEY
  • FOREIGN KEY

32. What is a cursor in SQL?

A cursor is a tool that makes traversal across rows in a table quite easy by enabling the manipulation of whole result sets at once. It can be thought of as a pointer to one row in a set of rows. For operations such as retrieval, addition, or removal of database records, cursor comes in very handy.

The following types of cursors are present in SQL:

  • Static Cursors
  • Dynamic Cursors
  • Forward Only Cursors
  • Keyset Driven Cursors

33. What is normalisation?

When we have large datasets, there will be some redundancy. The process of removing that redundancy to reduce the database overheads is known as normalisation.

The main aim of normalisation is to remove insertion, updation, and deletion anomalies.

For example, you have a single table that stores employee and department details. Now, if you are inserting the detail of an employee, you’ll also need to enter the department details for each employee. Because of this, there’ll be a lot of duplicate data. This is called an insertion anomaly.

Now if you need to update the department details, you’ll have to manually update the value of department attribute of  all the employees of that department. And if you miss some employees while updating, it’ll render our database inconsistent! This is know as an updation anomaly.

Lastly, if you want to delete the department detail for any one department, you’ll have to manually delete the same detail in the employee table too. This might lead to a deletion anomaly.

To take care of all these anomalies, we decompose the table into 2 tables: employees, and department. Now, we can link these tables by making use of a foreign key. This will save us from all of these anomalies!

34. What is denormalisation and when is it used?

Denormalisation includes adding redundant data to an already normalised database. This is essentially done to reduce problems associated with database queries that combine data from multiple tables into one table.

It also improves the read performance of a database, at the expense of losing some write performance.

35. What are the different forms of normalisation?

Normalisations are of five forms:

  • First Normal Form (1NF): This removes all the duplicates from the table
  • Second Normal Form (2NF): Should satisfy 1NF, plus create separate tables for data subsets and define a relationship between tables using keys.
  • Third Normal Form (3NF): Should satisfy 2NF, plus remove all the columns that are not dependent on primary key constraints.
  • Fourth Normal Form (4NF): Should satisfy 3NF, plus should not have any multivalued dependencies.

36. What are the case manipulation functions available in SQL?

There are majorly three case manipulation functions present:

  • LOWER: returns the string in lowercase
  • UPPER: returns the string in uppercase
  • INITCAP: Returns the string with the first letter in uppercase and the rest in lowercase.

37.  Correct the following query:

SELECT UserId, AVG(Total) AS AvgOrderTotal

FROM Invoices

HAVING COUNT(OrderId) >= 1 (ref article)

The above query lacks the ‘Group By’ clause. The corrected query is:

SELECT UserId, AVG(Total) AS AvgOrderTotal

FROM Invoices

GROUP BY Userid

HAVING COUNT(OrderId) >= 1

38. What is the difference between CHAR and VARCHAR2 datatype in SQL?

This is a tricky one! While both of these are used for character strings, VARCHAR2 is used for character strings of variable length while CHAR is used for fixed length.

For example, if we define the type as char(2), then only the strings containing 2 characters will be allowed. But, for VARCHAR(2), 1 character will also work.

Conclusion

We promise you, having gone through the above questions, you’ll feel much more comfortable, and confident while sitting for your next interview.

Also, If you’re an aspiring data professional but uncomfortable with SQL, we recommend you to head straight ahead and take a few SQL courses. There’s nothing like getting your hands down and dirty.

Don’t forget to tell us your interview experience in the comments below.

Total
0
Shares
Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Related Posts