This is Part Two of a N-Part Series on SQL and its importance as a Data Analyst (Link to Part 1: My Experience)

In my last blog post I introduced the “Every Data Analyst Should Learn SQL” series by outlining my experience with SQL from the perspective of a Data Analyst at The Data School. In this one I’d like to dive deeper into they why; why should you learn SQL as a Data Analyst.

SQL is the language of databases, and being the bread and butter of data analytics learning to speak it is a powerful feather in your bow. Let’s dive into 10 reasons to back that up:

  1. Efficient data manipulation

    • Relational databases are built for a handful of simple tasks; storing data, manipulating data, and serving it to clients when they ask. They are powerful, and efficient.
    • The language SQL is specifically designed to leverage the capabilities of every database that supports it.
    • Provided within SQL is a wide range of advanced querying capabilities bundled with a query optimiser that makes sure that whatever you execute is done as fast as possible. These querying capabilities include functions, expressions, and subqueries to manipulate and analyse data.
    • Indexes can be created with SQL which function like a table of contents making queries much faster. This efficiency is a bonus on top of the tools included to maintain data integrity; such as constraints, ACID transactions, and relationships between tables.
  2. Real-time data access

    • Compared to working with flat files like CSV, JSON or XML files, SQL databases enable real-time access to live data stores.
    • Working with live data means that you don’t have to wait on another stakeholder to send you up to date information just for it to be stale by the time it hits your desk.
    • Tools like Alteryx and Tableau, as well as many other BI tools interface directly with databases using SQL. Any of your workbooks, workflows or reports will have live up-to-date information whenever you need it.
  3. Large dataset support

    • A single SQL Server database has a maximum size limit of 524PB; that’s petabytes if you’re unfamiliar, it’s not a unit of measure we can easily wrap our head around. In contrast, an Excel file can only handle a bit over 1 Million rows before it gives up and says “no more”.
    • Not only can it handle bigger piles of data, it can handle them more efficiently. This makes SQL a good choice for data analysts working with large data sources.
    • SQL also has additional set of tools that can make large datasets more performant; stored procedures, triggers, and views.
      • Stored procedures are pre-defined SQL statements that can be run in a single call.
      • Triggers are a special kind of stored procedure that react to certain events in the database; either instead of or after an INSERT, UPDATE or DELETE statement.
      • Views are virtual tables defined as a result of a stored query on the database. Views can contain aggregations, join multiple tables and contain only subsets of the underlying data.
  4. Cross-platform compatibility

    • SQL is the standardised language designed to interact relational database management systems, as well as being the basis for querying some NoSQL databases (after all the No stands for “not-only”).
    • SQL became a standard of the American National Standards Institute (ANSI) in 1986, and the International Organisation for Standardisation (ISO) in 1987.
    • Each flavour of database may have different approaches to implementing SQL, but the underlying foundation stays mostly the same. Transact-SQL (T-SQL) is an extension of SQL used predominantly in SQL Server databases.
    • SQL databases work on all operating systems, on hardware of any size; from tiny microprocessors capable of running an SQLite database, to distributed databases running on a cluster of machines.
    • All major database management systems have Open Database Connectivity (ODBC) drivers which are APIs to connect to databases regardless of operating system or hardware.
  5. Flexibility

    • SQL allows data analysts to execute a wide variety of tasks on databases.
    • Data analysts with a knowledge of SQL can create databases and tables, as well as alter them.
    • SQL can also insert, update, and delete data, as well as control access to all aspects of the database.
  6. Integration with other tools

    • Because of the ubiquity of SQL databases most tools that interact with data have implementations to access them directly.
    • For tools that don’t have native connections to a particular database will likely offer the ability to use an ODBC driver.
    • Our top tools in The Data School, Alteryx and Tableau, have mature and sophisticated abilities to interact with SQL.
    • Databases, Data Warehouses, and Datalakes all commonly support SQL whether deployed on-premises or on the cloud.
  7. Collaboration

    • Many professionals across different domains use SQL to do their jobs; data analysts, data scientists, data engineers, as well as programmers all commonly use it.
    • By understanding SQL, data analysts are able to shorten the time-to-insight by sidestepping the need for intermediate data formats such as CSV, JSON, or even Excel spreadsheets.
    • One SQL database can serve as a central repository of data and be used by multiple tools in real-time, at the same time.
  8. Ease of use

    • SQL is one of the few languages that is written in plain English and is largely self documenting.
    • SQL is a progressive language with layers of complexity; queries can range from simple select statements, to complex queries with sub-selects, joins and unions.
    • Data analysts don’t have to be programmers to get work done in SQL, and you don’t need to boil the ocean with complexity to start getting value from the language.
  9. Speed

    • SQL is often faster than other tools for analysing and manipulating data, especially when working with large datasets.
    • Database management systems include a Query Optimiser that finds the most efficient and fast method for executing an SQL statement.
    • Queries can be be further optimised by creating indexes, and partitioning tables.
    • When working with large datasets where only a subset of columns or rows are necessary for analysis, SQL can be used to return only the columns and rows that are needed.
  10. Scalability

    • SQL databases can be deployed at any scale, with a wide range of data sources.
    • Cloud data warehouses such as Snowflake and BigQuery are used by some of the largest companies in the world, and they use SQL as the basis for interacting with their data.
    • Regardless of the size of your hardware, or the volume of customers or transactions you have, SQL is a one-tool-fits-all-sizes language.

There you go, Part 2 of Why Every Data Analyst Should Learn SQL. In this post we discussed ten of the many reasons why a SQL is a vital addition to your toolkit. In future blog posts I will take you through SQL databases in more depth looking at different problems it can solve for you.

Until next time, I hope you enjoy the holidays and I look forward to the next time.

Love,

Dan Lawson

Daniel Lawson
Author: Daniel Lawson

Right off the bat I can tell you that I’m not your average data analyst. I’ve spent most of my career running my own business as a photographer and videographer, with a sprinkling of Web Development and SEO work as well. My approach to life and work is very T-shaped, in that I have a small set of specific skills complemented by a very broad range of interests; I like to think of myself as a dedicated non-specialist. Data Analytics, and Programming, started as a hobby that quickly grew into a passion. The more I learned the more I looked for opportunities to pull, manipulate, and join data from disparate sources in my life. I learned to interact with REST APIs for services I used, personal data from services I use like Spotify, and health data captured by my devices. I learned SQL to create and query databases, as well as analyse SQLite files containing my iMessages and Photos data on my Mac. Every technique I learned opened up more possibilities; now I’m hooked and there’s no turning back. Learn More About Me: https://danlsn.com.au