Top 50 Tricky SQL Interview Questions
We're given a table of product purchases. Each row in the table represents an individual user product purchase. Write a query to get the number of customers that were upsold by purchasing additional products.
Note: If the customer purchased two things on the same day, that does not count as an upsell, as they were purchased within a similar timeframe.
Given a table of students and their SAT test scores, write a query to return the two students with the closest test scores, along with the score difference. If there are multiple student pairs with the same minimum score difference, select the student name combination that is higher in the alphabet.
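One way to approach the closest-scores question is a self-join that pairs every student with every other student once. The sketch below runs the SQL through Python's built-in sqlite3 module; the table name `scores`, its columns and the sample rows are assumptions made for illustration, and "higher in the alphabet" is read as "earlier in alphabetical order". It also assumes student names are unique.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (id INTEGER PRIMARY KEY, student TEXT, score INTEGER);
INSERT INTO scores (student, score) VALUES
  ('Jack', 1700), ('Alice', 2010), ('Miles', 2200), ('Scott', 2100);
""")

closest = conn.execute("""
SELECT a.student AS one_student,
       b.student AS other_student,
       ABS(a.score - b.score) AS score_diff
FROM scores AS a
JOIN scores AS b
  ON a.student < b.student          -- keep each unordered pair once
ORDER BY score_diff ASC,            -- smallest gap first
         one_student ASC,           -- ties broken alphabetically
         other_student ASC
LIMIT 1
""").fetchone()

print(closest)
```

The join condition `a.student < b.student` avoids comparing a student with themselves and avoids listing the same pair twice.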
Analytics SQL interview questions, which you might hear referred to as SQL case studies, are some of the trickiest interview questions that you will face. This is because they test two concepts.
Analytics SQL interview questions are designed to test how you would think about solving a problem, and are purposely left more ambiguous than other types of problems. For example, an interviewer might ask you to write a SQL query, given a few tables, to understand which AB test variant won. But there might not even be any shared understanding of what winning actually means. Write a query to see which variant "won."
First touch attribution is defined as the channel with which the converted user was associated when they first discovered the website. How do we figure out the beginning path of the Facebook ad and connect it to the end purchasing user?
We need to do two things: (1) subset all of the users that converted to customers and (2) figure out their first session visit to attribute the actual channel. We can do the first by creating a sub-query that only gets the distinct users that have actually converted.
We're looking to understand the effect of a new Uber driver incentive promotion released in the past month on driver behavior. Write a query to figure out if the incentive worked as indicated.
Let's say we want to build a naive recommender. Write a SQL query to create a metric to recommend pages for each user based on recommendations from their friends' liked pages.
Let's solve this problem by visualizing what kind of output we want from the query. The max value of our metric will then be the most recommendable page. The first thing we have to do is write a query to associate users with their friends' liked pages. We can do that easily with an initial join between the two tables.
Amazon released a new recommendation widget on their landing page. Write a query to determine the impact the recommendation widget made on user behavior for one metric.
Write a query to show the number of users, number of transactions placed, and total order amount per month for a given year. Assume that we are only interested in the monthly reports for a single year (January-December).
Given a table of transactions and a table of products, write a query to return the product id and average product price for that id. Only return the products where the average product price is greater than the average price of all transactions.
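The average-price question can be sketched with a GROUP BY plus a HAVING clause that compares each product's average against a scalar subquery. The schema below (`products.price`, `transactions.product_id`) and the sample rows are assumptions for illustration; the original question does not spell out the columns. With one fixed price per product the per-product average is simply that price, but the same query shape would still apply if prices varied per transaction.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, product_id INTEGER);
INSERT INTO products VALUES (1, 10.0), (2, 30.0), (3, 50.0);
INSERT INTO transactions (product_id) VALUES (1), (1), (1), (2), (3);
""")

above_average = conn.execute("""
SELECT t.product_id,
       AVG(p.price) AS avg_price
FROM transactions AS t
JOIN products AS p ON p.id = t.product_id
GROUP BY t.product_id
HAVING AVG(p.price) > (
    -- average price over every transaction row
    SELECT AVG(p2.price)
    FROM transactions AS t2
    JOIN products AS p2 ON p2.id = t2.product_id
)
ORDER BY t.product_id
""").fetchall()

print(above_average)
```

Here the overall average across the five transactions is 22.0, so only products 2 and 3 are returned.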
Given a table of product subscriptions with a subscription start date and end date for each user, write a query that returns true or false for each user, depending on whether that user has a subscription date range that overlaps with any other user's.
Write a query to get the post success rate for each day in the month of January. Hint: Let's see if we can clearly define the metrics we want to calculate before just jumping into the problem.
We want post success rate for each day over the past week. Additionally, since the success rate must be broken down by day, we must make sure that a post that is entered is completed on the same day.
ETL stands for "Extract, Transform, Load" and describes the process by which data flows between different data warehousing systems. Extract is the process of reading data from a database.
Transform converts data into a format that could be appropriate for reporting, analysis, machine learning, etc. In the interview, ETL tools and concepts are important to know for virtually all roles.
The more difficult interview questions, however, will likely come up in data engineering, business intelligence, and related interviews. If we set a query to run daily, it becomes a daily extract, transform, and load (ETL) process.
Let's say you have a table with a billion rows.
How would you add a column inserting data from the original source without affecting the user experience? Before jumping into the question we should remember to clarify a few details that we can potentially get out of the interviewer. It helps to ask questions to understand and show that you can think holistically about the problem.
Rushing too fast into a solution is a red flag for many interviewers.
Due to an ETL error, instead of updating the salaries in the employees table during the yearly compensation adjustments, the job inserted a new row each time. The head of HR still needs the current salary of each employee.
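A possible answer to the salary question, sketched under a stated assumption: the `employees` table has an auto-incrementing `id`, so the most recently inserted row for each employee carries the highest `id`. The column names and sample rows are hypothetical, and the sketch also assumes a first and last name together identify an employee.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
  id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, salary INTEGER
);
INSERT INTO employees VALUES
  (1, 'Ava', 'Stone', 50000),
  (2, 'Ben', 'Reyes', 60000),
  (3, 'Ava', 'Stone', 55000),   -- erroneous insert instead of update
  (4, 'Ava', 'Stone', 58000),
  (5, 'Ben', 'Reyes', 59000);
""")

current_salaries = conn.execute("""
SELECT e.first_name, e.last_name, e.salary
FROM employees AS e
JOIN (
    -- latest row per employee, assuming ids grow over time
    SELECT first_name, last_name, MAX(id) AS max_id
    FROM employees
    GROUP BY first_name, last_name
) AS latest ON e.id = latest.max_id
ORDER BY e.last_name
""").fetchall()

print(current_salaries)
```

Taking the maximum salary instead would be wrong whenever a salary was adjusted downward, which is why the sketch keys on the latest row.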
Let's say that we have two ETL jobs that feed into a single production table each day.
What is a Unique constraint? What is the difference between a unique key and primary key? A table can have more than one unique key column while there can be only one primary key. In SQL Server, by default, a unique key column creates a non-clustered index whereas the primary key creates a clustered index on the column.
What is a clustered index? Answer: Clustered indexes physically sort the rows in the table based on the clustering key (by default, the primary key). A clustered index helps in the fast retrieval of data from the database. There can be only one clustered index in a table.
What is a Not Null constraint?
What is a composite key? Answer: A composite key is a primary key with multiple columns. In some tables a single field might not guarantee unique and not null values, so a combination of multiple fields is taken as the primary key.
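A composite key can be illustrated with a small enrollment table in which neither column is unique on its own but the pair is. This is a minimal sketch using Python's sqlite3 module; the `enrollments` table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (student, course) pair.
conn.execute("""
CREATE TABLE enrollments (
  student_id INTEGER NOT NULL,
  course_id  INTEGER NOT NULL,
  grade      TEXT,
  PRIMARY KEY (student_id, course_id)   -- composite key
)
""")

conn.execute("INSERT INTO enrollments VALUES (1, 101, 'A')")
conn.execute("INSERT INTO enrollments VALUES (1, 102, 'B')")  # same student, new course
conn.execute("INSERT INTO enrollments VALUES (2, 101, 'C')")  # same course, new student

try:
    # Repeats an existing (student_id, course_id) pair.
    conn.execute("INSERT INTO enrollments VALUES (1, 101, 'F')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]
print(duplicate_rejected, row_count)
```

Repeating a student or a course is allowed; only repeating the full pair violates the key.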
What is a Field in a Database? Answer: A field in a database table is the space allocated to store a particular piece of information about each record in the table.
What is a Foreign Key? Answer: A foreign key is used for enforcing referential integrity, in which a field marked as a foreign key in one table is linked with the primary key of another table. With this referential integrity, a foreign key can only hold data that matches the data in the primary key of the other table.
What is a non-clustered index? Answer: Non-clustered indexes have a jump table containing key values, each pointing to the row in the table that corresponds to the key. There can be multiple non-clustered indexes in a table.
What is a NULL value?
What is Database Normalisation? Answer: Database normalization is the process of organizing data to reduce redundancy and anomalies in the database.
Answer: For a table to be in Third Normal Form (3NF), it must be in 2NF, and each non-prime attribute must depend on the primary key directly rather than transitively through another non-prime attribute.
What is the difference between cross join and full outer join? Answer: A cross join returns the Cartesian product of the two tables, so there is no condition or ON clause, as each row of one table is joined with each row of the other. A full outer join joins the two tables on the condition specified in the ON clause, and for the records that do not satisfy the condition, NULL values are placed in the join result.
How can we remove orphan records from a table? Answer: To remove orphan records from the database, we left join the child table to the parent table and then remove the rows from the child table for which the parent's id IS NULL, meaning no matching parent row exists.
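The orphan-removal answer above can be sketched as follows. The `parent` and `child` tables, their columns and the sample rows are assumptions for illustration, and no foreign key constraint is declared so that orphans can exist in the first place.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE child  (id INTEGER PRIMARY KEY, parent_id INTEGER);
INSERT INTO parent VALUES (1), (2);
INSERT INTO child  VALUES (10, 1), (11, 2), (12, 99), (13, 42);
""")

deleted = conn.execute("""
DELETE FROM child
WHERE id IN (
    SELECT c.id
    FROM child AS c
    LEFT JOIN parent AS p ON p.id = c.parent_id
    WHERE p.id IS NULL          -- no matching parent row
)
""").rowcount

remaining = [row[0] for row in conn.execute("SELECT id FROM child ORDER BY id")]
print(deleted, remaining)
```

Note that a child row whose `parent_id` is itself NULL would also be matched by this query, so whether such rows count as orphans should be decided before running it.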
What is the Cartesian product of the table? Answer: The output of Cross Join is called a Cartesian product. It returns rows combining each row from the first table with each row of the second table. What is DBMS? A Database Management System DBMS is a software application that interacts with the user, applications and the database itself to capture and analyze data.
A DBMS allows a user to interact with the database. The data stored in the database can be modified, retrieved and deleted and can be of any type like strings, numbers, images, etc. Relational Database Management System: The data is stored in relations (tables). Example: MySQL. Non-relational Database Management System: The data is not stored in relations. Example: MongoDB.
What do you mean by table and field in SQL? Answer: A table refers to a collection of data organized in the form of rows and columns. A field refers to a column in a table.
What are the different subsets of SQL? DML (Data Manipulation Language) helps you to insert, update, delete and retrieve data from the database. DCL (Data Control Language) controls access to the database. Example: GRANT and REVOKE access permissions.
What is the difference between char and varchar2? Answer: Both of these data types are used for characters, but varchar2 is used for character strings of variable length whereas char is used for character strings of fixed length.
For example, if we specify the type as char(5), every stored value occupies five characters, and shorter strings are padded with trailing spaces. If we specify the type as varchar2(5), we can store strings of variable length up to 5, such as a string of length 2, 3 or 4, without padding.
Write a SQL query to get the second highest salary among all Employees. Write a SQL query to get the nth highest salary among all Employees.
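One common way to answer the second-highest and nth-highest salary questions is to sort the distinct salaries in descending order and skip the first n-1 of them. The sketch below uses `LIMIT ... OFFSET`, which is available in SQLite, MySQL and PostgreSQL; other dialects spell this differently. The `employees` table, its columns and the helper name `nth_highest_salary` are hypothetical.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, salary INTEGER);
INSERT INTO employees VALUES
  ('Ann', 100), ('Bob', 300), ('Cid', 300), ('Dee', 200), ('Eve', 150);
""")

def nth_highest_salary(conn, n):
    """Return the nth highest distinct salary, or None if there are fewer than n."""
    row = conn.execute(
        """
        SELECT DISTINCT salary
        FROM employees
        ORDER BY salary DESC
        LIMIT 1 OFFSET ?
        """,
        (n - 1,),   # skip the n-1 larger distinct salaries
    ).fetchone()
    return row[0] if row else None

second = nth_highest_salary(conn, 2)
fourth = nth_highest_salary(conn, 4)
fifth = nth_highest_salary(conn, 5)
print(second, fourth, fifth)
```

`DISTINCT` matters here: two employees share the top salary of 300, so without it the "second highest" would also come back as 300.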
What are the reasons for de-normalizing the data? What is a Pseudocolumn?
A subquery that does not reference the outer query is executed first, and its result is passed on to the main query. A correlated subquery cannot be considered an independent query, because it refers to a column of a table listed in the FROM list of the main query.
A non-correlated subquery can be considered an independent query, and its output is substituted into the main query.
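The difference between the two kinds of subquery can be seen side by side in a small sketch. The `employees` table and its rows are made up for illustration. The first query compares each salary with a single company-wide average (non-correlated); the second compares it with the average of the employee's own department, so the inner query depends on the outer row (correlated).

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO employees VALUES
  ('Ann', 'eng', 120), ('Bob', 'eng', 100),
  ('Cid', 'ops', 60),  ('Dee', 'ops', 80), ('Eve', 'ops', 70);
""")

# Non-correlated: the inner query can run on its own, once.
above_company_avg = [r[0] for r in conn.execute("""
SELECT name FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees)
ORDER BY name
""")]

# Correlated: the inner query references e.dept from the outer query.
above_dept_avg = [r[0] for r in conn.execute("""
SELECT e.name FROM employees AS e
WHERE e.salary > (
    SELECT AVG(d.salary) FROM employees AS d WHERE d.dept = e.dept
)
ORDER BY e.name
""")]

print(above_company_avg, above_dept_avg)
```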
A stored procedure is a routine consisting of several SQL statements that access the database system. The statements are consolidated into the stored procedure and can be executed whenever and wherever required.
A DB trigger is a piece of code that executes automatically in response to some event on a table or view in a database. Mainly, triggers help to maintain the integrity of the database. Example: When a new student is added to the student database, new records should be created in the related tables like the Exam, Score and Attendance tables.
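The student example above can be sketched with an AFTER INSERT trigger. This uses SQLite's trigger syntax through Python's sqlite3 module; other databases use somewhat different syntax. The table and column names (`student`, `attendance`, `score`, `days_present`, `points`) are hypothetical, and only two of the related tables are shown to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.executescript("""
CREATE TABLE student    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attendance (student_id INTEGER, days_present INTEGER);
CREATE TABLE score      (student_id INTEGER, points INTEGER);

-- Runs automatically after every insert into student.
CREATE TRIGGER student_added AFTER INSERT ON student
BEGIN
  INSERT INTO attendance (student_id, days_present) VALUES (NEW.id, 0);
  INSERT INTO score (student_id, points) VALUES (NEW.id, NULL);
END;
""")

conn.execute("INSERT INTO student (name) VALUES ('Maya')")
conn.execute("INSERT INTO student (name) VALUES ('Omar')")

attendance_rows = conn.execute(
    "SELECT student_id, days_present FROM attendance ORDER BY student_id"
).fetchall()
score_rows = conn.execute(
    "SELECT student_id, points FROM score ORDER BY student_id"
).fetchall()
print(attendance_rows, score_rows)
```

The application only inserts into `student`; the related rows appear because the trigger fires on each insert.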
Commit and Rollback can be performed after a delete statement.
Local variables are variables that exist and can be used only inside a function. They are not known to other functions, which cannot refer to or use them. Local variables are created each time the function is called.
Global variables are variables that exist and can be used throughout the program. A variable declared globally is shared across functions rather than being private to one of them, and it is not created anew each time a function is called.
Constraints specify limits on the data that the columns of a table can hold. Constraints can be specified in the statement that creates or alters the table. Examples of constraints are NOT NULL, UNIQUE, PRIMARY KEY and FOREIGN KEY.
Data integrity defines the accuracy and consistency of data stored in a database.
Integrity constraints can also be defined to enforce business rules on the data when it is entered into the application or database.
The auto increment keyword generates a unique number automatically when a new record is inserted into the table.
A clustered index is used for easy retrieval of data from the database by altering the way that the records are stored. The database sorts rows by the column that is set as the clustered index. A non-clustered index does not alter the way the rows are stored; it creates a completely separate object that points back to the original table rows after searching.
A data warehouse is a central repository of data from multiple sources of information. The data is consolidated, transformed and made available for mining and online processing. Subsets of warehouse data are called data marts.
A self-join is a query that joins a table to itself. It is used to compare values in a column with other values in the same column in the same table.
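As an illustration of a self-join, the sketch below finds employees who earn more than their manager by joining a table to itself under two aliases and comparing the `salary` column with itself. The `employees` table, its `manager_id` column and the sample rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
  id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, manager_id INTEGER
);
INSERT INTO employees VALUES
  (1, 'Rosa',  100, NULL),   -- no manager
  (2, 'Tariq', 120, 1),
  (3, 'Lin',    90, 1);
""")

earns_more_than_manager = conn.execute("""
SELECT e.name AS employee, m.name AS manager
FROM employees AS e
JOIN employees AS m ON m.id = e.manager_id   -- same table, second alias
WHERE e.salary > m.salary
ORDER BY e.name
""").fetchall()

print(earns_more_than_manager)
```

The aliases `e` and `m` are what make the self-join readable: the same table plays the role of "employee" on one side and "manager" on the other.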
A cross join is defined as a Cartesian product, where the number of rows in the result equals the number of rows in the first table multiplied by the number of rows in the second table.
User-defined functions are functions written so that a piece of logic can be reused whenever required, which makes it unnecessary to write the same logic several times.