During the month of December I wanted to do something fun while practicing my creativity and video editing skills. I also desperately needed a reason to finally get furniture and equipment for my office. I decided to purchase an advent calendar and participate in the unboxing video trend, which I thought made the most sense with it being Christmas time and all.
Let's start with a little bit of background first. Advent (from the Latin adventus, “coming”) marks the beginning of the liturgical year in Western Christian churches and is observed over the four Sundays before Christmas Day. An Advent calendar is used to count the days of Advent in anticipation of Christmas. The most popular Advent calendars are full of chocolate or small toys, and some even come with adult beverages. The Advent calendar I got was packed full of rubber ducks. Why? Because they are a data professional's best friend!!
Do you purchase advent calendars at this time of year? Which is your favorite theme?
The title of this unboxing video series was December Data Ducks, and I shared all kinds of fun data facts, along with some duck facts to help with the learning process. The facts cover topics such as rubber duck debugging, SQL programming, data quality, and my favorite, data security. Let's dive into each 🦆
Rubber Duck Debugging
I mentioned already that rubber ducks are a data professional's best friend, but did you know that they can be data scientists too? Some data professionals talk to rubber ducks to help them solve problems while they are programming. The technical term is Rubber Duck Debugging, a technique that many programmers use when they are stuck, whether they are hunting for bugs in their code or working out the right way to structure it.
The technique is simple, but the execution can feel a bit silly. Basically, you just gotta talk out loud and explain to the duck what your code is supposed to do (make sure to look the duck in the eyes while doing this). Then, it is critical to go into detail, line by line (at this point, you both should be looking at the screen, and it is even more helpful if you guide the duck around the screen, which gives both of you a closer and more in-depth look at the code). At some point while either talking about or reviewing the code, I promise that you will have your aha moment. Admittedly, you do not have to talk to a rubber duck per se. You can always use a coworker or even your pet. The main requirement is that you talk out loud.
An article from Harvard Business Review talks about an effective learning strategy called self-explaining, which involves asking yourself explanatory questions like 'What am I trying to accomplish?' or 'How should this work?'. One study cited there showed that people who explained ideas to themselves learned almost three times more than those who did not. The article goes into some other great tips, one of which I practice myself: summarizing. Summarizing is a simple way to engage in self-explaining, and the act of putting an idea into our own words can promote learning.
Have you ever used Rubber Duck Debugging to help you solve problems?
SQL Programming
Top 3 Common SQL Mistakes
⚠ Using an Invalid Statement Order
⚠ Misspelling Commands
⚠ Forgetting Brackets and Quotes
Each of these mistakes will throw an error message, but there are other mistakes that do not result in errors. They result in bad analysis!
🤔 Counting NULL Columns
🤔 Using BETWEEN Incorrectly
🤔 Ignoring the integrity of JOIN (provided by David Mullin)
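Here is a minimal sketch of how these three silent mistakes can play out, using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the rows are all made up for illustration, and reading "integrity of JOIN" as "rows whose key has no match quietly disappear from an inner join" is my assumption about what that tip means.

```python
import sqlite3

# Hypothetical tables and rows, made up for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,  -- order 4 below points at a missing customer
        coupon TEXT,          -- NULL when no coupon was used
        ordered_at TEXT       -- timestamp stored as 'YYYY-MM-DD HH:MM:SS'
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (1, 1,  'DUCK10', '2023-12-01 09:00:00'),
        (2, 1,  NULL,     '2023-12-15 12:30:00'),
        (3, 2,  NULL,     '2023-12-31 18:45:00'),
        (4, 99, NULL,     '2023-12-20 08:00:00');
""")

# 1) COUNT(column) skips NULLs, COUNT(*) counts every row.
all_rows, with_coupon = conn.execute(
    "SELECT COUNT(*), COUNT(coupon) FROM orders"
).fetchone()

# 2) BETWEEN is inclusive, but the upper bound '2023-12-31' sorts before
#    '2023-12-31 18:45:00', so the last day of the month is cut off.
between_count = conn.execute(
    "SELECT COUNT(*) FROM orders "
    "WHERE ordered_at BETWEEN '2023-12-01' AND '2023-12-31'"
).fetchone()[0]
half_open_count = conn.execute(
    "SELECT COUNT(*) FROM orders "
    "WHERE ordered_at >= '2023-12-01' AND ordered_at < '2024-01-01'"
).fetchone()[0]

# 3) An inner join silently drops orders whose customer does not exist.
joined_count = conn.execute(
    "SELECT COUNT(*) FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchone()[0]
orphan_count = conn.execute(
    "SELECT COUNT(*) FROM orders o "
    "LEFT JOIN customers c ON c.id = o.customer_id "
    "WHERE c.id IS NULL"
).fetchone()[0]

print(all_rows, with_coupon)           # total rows vs. rows with a coupon
print(between_count, half_open_count)  # BETWEEN vs. half-open date range
print(joined_count, orphan_count)      # joined rows vs. orphaned orders
```

None of these queries raises an error, which is exactly why they are easy to miss. Comparing the paired counts side by side is a quick sanity check you can run before trusting a result.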
To help you figure out what is wrong, grab a Rubber Duck friend and start Rubber Duck Debugging.
Talk out loud and explain the problem
Include what the code is supposed to do
Walk through the code (slowly and out loud) line by line
If you both have had your coffee and have your reading glasses on, there is a strong chance that you will soon find your answer 🤓
Following a few best practices can also help you when you are building queries.
🙅🏻♀️ Don't Use SELECT *
🙅🏻♀️ Don't Use DISTINCT
🙌🏻 ALWAYS Comment Your Code
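As a small illustration of the SELECT * tip, here is a sketch (again with sqlite3, and with a made-up ducks table) of how code that depends on SELECT * can break when someone adds a column, while a query with named columns keeps working. The function names are hypothetical.

```python
import sqlite3

# Hypothetical table, made up for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ducks (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO ducks VALUES (1, 'Quackers')")


def first_duck_star():
    # Relies on SELECT * returning exactly two columns, in table order.
    duck_id, name = conn.execute("SELECT * FROM ducks").fetchone()
    return duck_id, name


def first_duck_explicit():
    # Naming the columns pins both how many there are and their order.
    duck_id, name = conn.execute("SELECT id, name FROM ducks").fetchone()
    return duck_id, name


before = first_duck_star()  # works while the table has two columns

# Someone adds a column later on.
conn.execute("ALTER TABLE ducks ADD COLUMN color TEXT")

try:
    first_duck_star()
    star_still_works = True
except ValueError:
    # Three values came back, but the code unpacks only two.
    star_still_works = False

after = first_duck_explicit()  # unaffected by the new column
print(before, star_still_works, after)
```

The comments in the two functions also show the third tip in action: a one-line note about what a query assumes makes the failure much easier to explain to your duck later.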
If you are curious to learn more about these tips, tricks, and best practices, I provide walkthrough explanations on my YouTube channel.
If you are looking to learn SQL yourself, consider checking out this platform, which provides great tutorials for learning all kinds of programming languages!
What other common SQL mistakes, best practices, and/or tips can you share?
Data Quality
Data quality is critical for any analysis if you want to produce meaningful insights. As they say: garbage in, garbage out.
6 Dimensions of Data Quality
- Completeness
- Accuracy
- Consistency
- Validity
- Uniqueness
- Integrity
Completeness refers to the degree to which all data in a data set is available. It also measures if the data is sufficient to deliver meaningful inferences and decisions. Make sure that you include all the data points in order to make the best decision.
🦆 If you omit any features of a duck, you may mistake it for a goose.
Accuracy is the degree to which data represents a real-world scenario and can be confirmed against a verifiable source.
🦆 Even though you might have fed ducks as a child and the ducks ate that bread,
this does not create an accurate data point that bread is good for ducks.
Consistency refers to whether the same data stored in different places matches, and it is also critical when making comparisons or joins.
🦆 Ducks are not consistent with the way that they quack,
they actually have regional accents.
Validity is about data that conforms to the syntax (format, type, range) of its definition. Using business rules to set data types can help ensure you are capturing valid data.
🦆 Similar to business rules, ducks can understand commands.
Uniqueness is the most critical dimension for ensuring no duplication or overlapping values across all data sets. For your data to be the most meaningful, you want to ensure that there are no duplicates.
🦆 There is a vast unique set of terms that you can use when referring to ducks,
including: duckling, drake, hen, raft, paddling, chick, or flock.
Integrity relates to the maintenance and assurance of the overall accuracy, completeness, and consistency of data and requires periodic monitoring.
🦆 Ducks carefully monitor their surroundings
so they know what is going on at all times.
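To make a few of these dimensions more concrete, here is a rough sketch of how completeness, uniqueness, and validity could be checked in plain Python. The duck records, the field names, and the weight range of 1 to 10,000 grams are all hypothetical; in practice the valid range would come from your own business rules.

```python
from collections import Counter

# Hypothetical records, made up for this illustration.
records = [
    {"id": 1, "species": "mallard", "weight_g": 1100},
    {"id": 2, "species": "pekin", "weight_g": None},   # missing weight
    {"id": 2, "species": "pekin", "weight_g": 3600},   # duplicate id
    {"id": 3, "species": "teal", "weight_g": -50},     # impossible weight
]


def completeness(rows, field):
    """Share of rows where the field is present and not None."""
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def duplicate_keys(rows, key):
    """Key values that appear more than once (a uniqueness check)."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def invalid_rows(rows, field, low, high):
    """Rows whose value is present but is not a number within [low, high]."""
    bad = []
    for row in rows:
        value = row.get(field)
        if value is None:
            continue  # missing values are a completeness issue, not validity
        if not isinstance(value, (int, float)) or not low <= value <= high:
            bad.append(row)
    return bad


print(completeness(records, "weight_g"))
print(duplicate_keys(records, "id"))
print(invalid_rows(records, "weight_g", 1, 10_000))
```

Accuracy, consistency, and integrity are harder to check from a single table, since they usually need a trusted source or a second copy of the data to compare against.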
Do you have any fun data quality stories?
Data Security
Data Security is the practice of protecting information from unauthorized access, corruption, or theft. It is the responsibility of every employee in an organization (and you personally) to keep data safe.
4 Elements of Data Security
Physical
Digital
Operational
Administrative
Physical Security refers to protecting information and equipment on premises. Keep the physical space in which you work free from anything that could put your data at risk of being stolen, and keep those filing cabinets locked!
🦆 Ducks are meticulously clean animals.
Digital Security refers to protecting information on systems and networks. A company can keep digital data safe by implementing measures like encryption, firewalls, and antivirus software. You can even help keep your personal data safe by using a virtual private network (VPN).
🦆 Ducks protect themselves by either flying or swimming away
and female ducks are usually brown in color
so it makes them harder to be seen by predators.
Operational Security is protecting information from risks inside of an organization. Insider risks are actually more of a threat to a company's data security than outside risks. Oftentimes, insiders simply make mistakes, which can lead to data being compromised. Make sure that you are offering solid security training so everyone inside the organization knows how to keep the data safe.
🦆 Ducklings communicate with each other while they are inside the egg
which allows them to know the perfect time to hatch.
Administrative Security is protecting information from risks outside of an organization. While you cannot change the environment around you, you can prepare and protect yourself. Implementing things such as audit controls and getting cybersecurity insurance can help protect against risks coming from the outside.
🦆 Ducks have the ability to stay safe from the outside environments
by having waterproof feathers.
Levels of Data Security
Public: available to everyone (the public)
Internal: available to employees internal to an organization
Confidential: available only to those that have been authorized, generally restricted to smaller teams in an organization
Restricted: available only to those that have been authorized and any unauthorized access can lead to fines, criminal charges, or irreparable damage to the company
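One way to think about these four levels is as an ordered scale, where a higher level means tighter access. The sketch below is only an illustration of that ordering: the Classification enum and the can_access function are hypothetical names, and treating access as a simple "clearance is at least as high as the label" comparison is a simplifying assumption. Real access control usually also considers who actually needs the data.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels above, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def can_access(clearance, label):
    """Simplified rule: allow access when clearance is at least the label."""
    return clearance >= label


# A hypothetical employee cleared for internal data only.
employee = Classification.INTERNAL

print(can_access(employee, Classification.PUBLIC))        # public data
print(can_access(employee, Classification.CONFIDENTIAL))  # confidential data
```

Using an ordered enum keeps the levels in one place, so a typo such as "Confidental" fails loudly and never silently creates a fifth level.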
Do you practice Data Security?
Learning with Rubber Ducks!
Finally, rubber ducks can be used as fidget toys. Why is this so cool?
Fun Fact: Fidget spinners were reportedly first patented in 1997 and were said to have been designed to stop young boys from throwing rocks at police officers. When fidget spinners came back into the spotlight in 2017, many claimed that they were a distraction, while others found that they helped with stress.
Cognitive research suggests that the act of fidgeting may be a self-regulation mechanism to help us boost or lower our attention levels (which can be either calming or energizing).
Fidget toys have even been viewed as a way to improve learning, as they allow the brain to filter out extra sensory information. Also, fidgeting could provide physiological stimulation to bring our attention and energy to a level that allows our minds to better focus on the task at hand.
The most popular fidget toys you have probably seen on the market are fidget spinners, fidget cubes, and fidget rings. I would like to argue that anything can be a fidget toy... when I was a kid I used to (and still do actually) fidget with the strings on my sweaters. With that, you can totally use our Rubber Duck Data Friends as fidget toys as well 🦆
In Conclusion
You may have noticed the lack of duck facts here (yes, there are more!). There were too many to include, and I wanted to keep this post simple. If you are interested in learning more about how ducks are a data professional's best friend and how closely they are related to the world of data, check out my YouTube channel and the December Data Ducks series!