What is Data Mining?

Interested in the intricate world of Data Mining? In Episode 2 of ‘Talk Tech with Data Dave’, the expert Dave, alongside Alexis Keller-Carrell, offers a comprehensive look into this profound subject. They address the burning question: “What exactly is Data Mining?” Dive deep into transformative insights that illuminate the profound role of data in shaping our day-to-day experiences.
With new episodes launching on the second Tuesday of every month, our podcast consistently shines a spotlight on the awe-inspiring facets of technology and data. Ensure you’re at the forefront of tech innovation by subscribing to ‘Talk Tech with Data Dave’. Click on the provided link now, and immerse yourself in an unparalleled voyage of knowledge, innovation, and technological exploration.

Ready to be on the cutting edge of tech knowledge? Subscribe to ‘Talk Tech with Data Dave’ today and don’t miss our monthly deep dives into the world of technology and data. Don’t forget to send your questions to talktech@d3clarity.com and stay tuned for more enlightening discussions.

HAVE A QUESTION?
Ask Data Dave about all things data, cloud, or technology.
We'll be happy to answer your question on the podcast.

or send us an email to: techtalk@d3clarity.com

Published:

September 22, 2023

Duration:

00:14:42

Transcript

Alexis
Hi everyone, welcome to Talk Tech with Data Dave! I’m Alexis, the operations associate at D3Clarity, and I’m here with Data Dave to answer all of my data questions.

Data Dave
Good morning, Alexis. I’m Dave, Dave Wilkinson from D3Clarity, and I’m the Chief Technology Officer. I’m one of the founders of D3Clarity, and we’re a small organization focused on providing data engineering, high-quality data, data management, and next-generation infrastructure for organizations that are looking to transform for a digital age.

Alexis
I’m a super non-technical member of this staff, so my favorite part is to sit here and ask Dave all of my burning data questions and then hopefully soon ask him some of your burning data questions.

So, today we’re going to talk about data mining because Dave brought that up last week when we were just talking about data in general. So, Dave, here’s the question, what is data mining?

Data Dave
Good question, Alexis. Data mining is a term that was coined in the 1960s to describe the exploration of data for interesting information and knowledge. People have also called it data fishing, data snooping, experimentation, et cetera, and it comes from a basic idea: think of a mountain, a mass of ore.
That is your data. A large mountain of data. There are some interesting nuggets of information in it, right? There’s precious metals in them there hills, right? So, we’re mining for those precious metals in this mountain of data.

So, in the transformation, you’re thinking about how data leads to information that leads to knowledge. From that perspective, what you’re really looking for, and as we look into an analytic and digital age, we’re looking for answers to interesting questions in the masses of data that we collect.
What are the nuggets of precious metals that are hidden somewhere in them there hills of data that we can extract by mining for them, digging for them, exploring, et cetera? It’s a metaphor, right? We’re using the mining analogy, from digging in the mountains and prospecting for gold, et cetera, to start to look at data and analytics for interesting knowledge.

Alexis
That’s crazy. The 1960s. Like in my mind, data mining is something that happened with the Internet age, so that’s baffling to me that it was a thing in the 60s.

Data Dave
Well, it goes back to data warehousing and databases, and to the point of collecting data. I mean, you can do this on any data set. Data mining is really just exploring data and looking for answers in it. So, while our volume of data and advanced analytics and AI and machine learning and so on play into this, make it easier, and give us more techniques for doing it, the concept of looking for answers in data is actually pretty old. It goes back, like I said, to the 1960s and the early inventions of computers and doing, you know, business intelligence and just reporting.

Alexis
Yeah, that makes sense. That’s ignorance on my part. I was just, like, fascinated when you said that. Okay, so, one of the things that you said was the idea of, like, extracting knowledge or evaluating knowledge to understand data more. Can you talk to me a little bit more about that?

Data Dave
So, yes. Data in and of itself is just a raw commodity, right? You can collect sales data, weather data, all kinds of data, right? But that doesn’t give you knowledge. Knowledge is essentially information with purpose, right? So, when you have knowledge, you know something, not because you’ve got a collection of data, but because you’ve inferred something more from that data. I can collect the miles on my car, right? My car collects data to say how many miles I’ve driven and that sort of thing. But that doesn’t give me the knowledge to know that I’ve got to change the tires.

I say that ’cause I’ve got an appointment to change my tires later today, but there you go. The car collects the data, but the fact that my tires are getting worn is knowledge that can be inferred from that data. Does that make sense? So, you collect the data, you use that to have information on the tires, and then you end up with knowledge that you can actually make a decision on.

Alexis
Yeah, that tracks.

Data Dave
And take action on. You don’t necessarily take action on data. You take action on the knowledge that’s inferred from the data.
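
The tire example can be sketched in a few lines of Python. This is only an illustration of the data, information, knowledge progression Dave describes: the 40,000-mile tire life, the 90% replacement threshold, the odometer readings, and the names `infer_tire_knowledge` and `TireKnowledge` are all assumptions made up for this sketch, not anything from the episode.

```python
from dataclasses import dataclass

# Hypothetical assumption: a set of tires lasts about 40,000 miles.
ASSUMED_TIRE_LIFE_MILES = 40_000


@dataclass
class TireKnowledge:
    miles_on_tires: int    # information derived from the raw readings
    fraction_used: float   # share of the assumed tire life already driven
    should_replace: bool   # knowledge you can act on


def infer_tire_knowledge(odometer_miles, odometer_at_last_tire_change,
                         replace_at_fraction=0.9):
    """Turn raw odometer data into a decision about the tires."""
    # Data -> information: how far have these tires been driven?
    miles_on_tires = odometer_miles - odometer_at_last_tire_change
    # Information -> knowledge: compare against the assumed tire life.
    fraction_used = miles_on_tires / ASSUMED_TIRE_LIFE_MILES
    return TireKnowledge(
        miles_on_tires=miles_on_tires,
        fraction_used=fraction_used,
        should_replace=fraction_used >= replace_at_fraction,
    )


# Invented readings: the car reports 78,500 miles, tires were changed at 41,000.
knowledge = infer_tire_knowledge(78_500, 41_000)
if knowledge.should_replace:
    print(f"{knowledge.miles_on_tires} miles on these tires: book a tire change.")
```

The odometer reading on its own is just data. The decision only appears once it is combined with a second fact (when the tires were last changed) and a purpose (knowing when to replace them).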

Alexis
So, my thinking that data mining is really kind of a big-business, big-Internet thing is true, but data mining is also something that we kind of do every day. We just do it on our own.

Data Dave
Yeah, exactly. I mean, if you think about it, when you’re shopping, just going shopping on Amazon, you’re exploring a massive cadre of data, the whole Amazon catalog of products that’s available, to find the one that you want. And then you make a decision based on it and say, “Okay, I’m gonna buy this.”

Alexis
Yeah. Okay, that all tracks for just, like, a personal use, but what about a business? Like, what’s a business implication of data mining, or a business use case for data mining?

Data Dave
A business use case? It’s anything that you don’t know. And there’s sort of standard data mining: data mining is usually more the exploration of new information and new knowledge than necessarily just predictive analytics or straight analytics. So, you could say things like sales data or sales patterns, right, could be considered just analytics. What are the patterns in the Southeast United States for sales of warm coats in months, right? That’s usually straightforward analytics.

But what if we were to start to say, can I correlate that? This is now an exploration statement. Can I correlate that with weather patterns? Can I correlate that with hurricane season? That’s an interesting question. So, say we had an insurance pattern for risk. How does weather play into the insurance plans for homeowner insurance? Can we correlate that with risk statistics? Or take proximity to high schools for girls’ fashion. What are the sales patterns based on proximity to high schools for Target stores? Does this correlate? Does this give us some interesting information that we could make some interesting decisions based on: to restock differently, treat our inventory differently, add extra promotions, or make some other decisions? So, it’s the idea of presenting new and interesting questions.

And then mining for that answer. So, think of it as a question and answer. I’ve got a large cadre of data, and I’ve got this hypothesis, this interesting question that you’re posing that says, “How could I sell? What are the dimensions that might allow me to sell more, do something differently, or whatever it might be?” And can I pose that question and mine data for those nuggets, those precious metals, that will give us that answer? That will say, “Yes, in California, proximity to high school is relevant. In West Virginia, it is less relevant….” I don’t know. I don’t know the answer, but you can see where I can go in terms of asking some of these questions to start to change the patterns and be more responsive and more directive, if you like, in the way that you take something to market.
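
The high-school question can be posed to data roughly like this. The sketch below computes a Pearson correlation between a store’s distance to the nearest high school and its sales, separately for each state. Every row in `stores` is invented purely for illustration, and the names `pearson` and `correlation_by_state` are hypothetical; just as Dave says he does not know the real answer, these numbers say nothing about real stores.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_state(rows):
    """Group (state, miles_to_high_school, weekly_sales) rows by state
    and correlate distance with sales within each state."""
    by_state = defaultdict(lambda: ([], []))
    for state, miles, sales in rows:
        distances, sales_values = by_state[state]
        distances.append(miles)
        sales_values.append(sales)
    return {state: pearson(d, s) for state, (d, s) in by_state.items()}


# Invented example rows: (state, miles to nearest high school, weekly sales).
stores = [
    ("CA", 1, 520), ("CA", 2, 410), ("CA", 3, 330), ("CA", 4, 180), ("CA", 5, 120),
    ("WV", 1, 300), ("WV", 2, 310), ("WV", 3, 290), ("WV", 4, 305), ("WV", 5, 295),
]

result = correlation_by_state(stores)
for state, r in sorted(result.items()):
    print(f"{state}: correlation between distance and sales = {r:+.2f}")
```

A coefficient near -1 or +1 suggests the dimension is worth a closer look in that state, while one near 0 suggests it is not. Correlation alone does not establish a cause, so a result like this is a prompt for further questions rather than a final answer.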

Alexis
At D3C, we use the words data governance and master data management a lot. Do those things kind of correlate with data mining?

Data Dave
Yes, they do. I mean, where we spend a lot of focus is having confidence in your data, right? So, we’ve talked about the idea of having a large mountain of data and extracting those precious metals, those precious nuggets from that mountain, the mining concept, that’s what we talked about. But how do you have confidence that that is the right metal that you found? How do you know that you can have confidence in that answer? Then you get into things like master data and start to say, “Okay, well, my customer base, the individuals who have purchased from me, is actually a very, very relevant set of data within my mountain, right?” So, whenever I’m talking about customers, I want customers always to be represented the same.

Always, I want to have confidence in that customer so that whenever I’m joining or working my data around a customer, I can have confidence that when I’m talking about Alexis, I’m always talking about the same Alexis, right? So that I can then talk about the way that you’ve purchased things, or other things, in the same way. And that’s where we spend a lot of time on master data, and getting that data correct so that when you mine for that answer, you can have confidence in that answer. Because if you can’t have confidence in your answer, then your question is largely irrelevant. You can’t act on it. It’s going back to that knowledge. Do I know or do I just think?

And then we get on to data governance, and data governance is, again, that confidence in the data. As I’m collecting data, how am I governing the way my business is operating, the way my data is operating and flowing, so that for the data that arrives in my data mart or in my data warehouse or in my data lake or in my data environment, I’ve got confidence that it truly describes what it says it describes?

Alexis
So, if it was a flow chart, if you will, it would be like: data governance makes sure that you’re inputting the right information, master data management makes sure that the information is correct, and data mining is extracting that information for, like, awesome stuff. Is that the non-technical kind of verbiage of that?

Data Dave
Yeah, exactly, exactly right. So, with data governance, all the data in my organization is flowing into my data environment, into my data landscape if you like, and does it describe what I say it describes? Is it always similar? Is it always correct? Is it populated enough? Is it the right level of quality? How do I govern that and make sure it describes what it says? And then master data management covers those key dimensions, those key areas of data: customers, people, places, and things. So, all my people are always described the same way. All my places are always described the same way. Do they all have valid addresses on them? Are my things, products, services, whatever they might be, described in the same way and described accurately? And they form the basis of your master data structures, right? People, places, things. Customers are people, products are things, and then places are deliveries or stores or whatever it might be.
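
Two of those checks can be sketched in Python: “is it populated enough?” as a completeness ratio, and “am I always talking about the same Alexis?” as grouping records under a normalized key. This is a deliberately minimal illustration, not how any particular master data management product works. The records are invented, using a normalized email address as the matching key is an assumption made for the example, and the names `master_key`, `group_by_master_key`, and `completeness` are hypothetical.

```python
from collections import defaultdict

# Invented customer records, as they might arrive from two source systems.
records = [
    {"id": 1, "name": "Alexis Example", "email": "Alexis@Example.com ", "address": "12 Main St"},
    {"id": 2, "name": "alexis example", "email": "alexis@example.com", "address": ""},
    {"id": 3, "name": "Dave Example", "email": "dave@example.com", "address": "9 Oak Ave"},
]


def master_key(record):
    """Assumed matching rule: same email, ignoring case and stray whitespace."""
    return record["email"].strip().lower()


def group_by_master_key(rows):
    """Master data check: which record ids describe the same customer?"""
    groups = defaultdict(list)
    for row in rows:
        groups[master_key(row)].append(row["id"])
    return dict(groups)


def completeness(rows, field):
    """Governance check: what fraction of records have this field populated?"""
    populated = sum(1 for row in rows if row.get(field, "").strip())
    return populated / len(rows)


groups = group_by_master_key(records)
duplicates = {key: ids for key, ids in groups.items() if len(ids) > 1}
print("Records that describe the same customer:", duplicates)
print(f"Address completeness: {completeness(records, 'address'):.0%}")
```

Real matching usually needs more than one attribute and some tolerance for typos, so treat the single-key rule here as a placeholder for whatever matching logic an organization actually adopts.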

Alexis
Yeah, so it definitely sounds like we are here to support people in their efforts to make sure that their information is correct and that it’s able to be used, if data mining is what they’re hoping to do. Would you say that’s correct?

Data Dave
That’s right. Or, I mean, the term data mining is less used now; it’s more analytics, right? So, if you think about analytics and the exploration of data and machine learning, that goes down the same path. So yes, we spend a lot of time in the data, understanding that data and understanding the questions that are being asked, and understanding how you refine an answer from the question and from the data that you have.

Alexis
I love that. How do you define an answer from the data that you have? That sounds like a perfect place to end this podcast, Data Dave. I love it. Feels like we have a lot of potential topics for next month’s recording here. I’ve got a lot more questions, but we are running out of recording time. So, thank you so much for talking with me today and answering my questions about data mining.

Data Dave
Thanks. Thank you. Thank you always.

Alexis
Thanks everyone, have a good day.

Data Dave
Thank you. Bye bye.
