Big Data Analytics and Data Mining

KMUTNB

About This Course

This course comprehensively explores the principles, technologies, and applications essential to mastering big data and machine learning. Designed for students and professionals aiming to excel in data-driven industries, the curriculum spans foundational topics in big data infrastructure and ecosystems, including Hadoop, HDFS, MapReduce, Hive, and Spark, with a focus on programming in PySpark for distributed data processing. The course integrates database concepts with practical applications, emphasizing SQL and NoSQL for managing structured and unstructured data.

Participants will delve into data mining techniques, exploring the end-to-end process of extracting valuable insights from vast datasets. The course emphasizes critical concepts like data normalization, dimensionality reduction, and data visualization, equipping learners with tools to prepare and interpret complex datasets effectively. Machine learning is central to the curriculum, covering supervised and unsupervised learning techniques. Key algorithms, such as decision trees, artificial neural networks, and K-means clustering, will be studied and applied to solve practical problems.

The course also covers advanced methods for model evaluation, so that participants can assess and improve the performance of machine learning models. Through hands-on programming exercises, learners will develop expertise in Python and other tools for data exploration and analysis.

A distinctive feature of the course is its emphasis on real-world applications through case studies. Participants will apply their knowledge to domains such as business analytics, healthcare, and industrial processes, gaining valuable experience in crafting data-driven solutions. By the end of the course, learners will have a solid grasp of the technologies, programming skills, and analytical techniques necessary for a successful career in big data analytics and machine learning.

This course is ideal for students, data enthusiasts, and professionals seeking to enhance their understanding of big data ecosystems and leverage machine learning to address complex challenges in today’s data-centric world.

Requirements

Familiarity with Python or other programming languages is highly recommended, as the course involves significant hands-on coding exercises, especially with Spark, PySpark, and data mining algorithms.

Course Staff

Phayung Meesad

Associate Professor Dr. Phayung Meesad is a distinguished academic and researcher in computer science and engineering. He teaches at the Faculty of Information Technology and Digital Innovation, King Mongkut's University of Technology North Bangkok (KMUTNB).

Academic Background

Dr. Meesad completed his doctoral studies in Electrical Engineering at Oklahoma State University, USA, in 2002. Before this, he obtained his Master of Science in Electrical Engineering from the same institution in 1998. His foundational education includes a Bachelor's in Technical Education (Teacher Training in Electrical Engineering) from King Mongkut's Institute of Technology North Bangkok.

Research Interests

His research focuses on several key areas:

  • Artificial Intelligence
  • Business Intelligence
  • Computational Intelligence
  • Data Analytics
  • Data Mining
  • Fuzzy Systems
  • Image Processing
  • Machine Learning
  • Natural Language Processing
  • Stock Price Prediction
  • Time Series Prediction

Professional Activities

At KMUTNB, he actively participates in:

  • Teaching graduate and undergraduate courses
  • Supervising research students
  • Contributing to academic publications
  • Leading research projects

Other FAQ

1. Who is this course designed for?
This course is ideal for students, data enthusiasts, and professionals seeking to deepen their knowledge of big data infrastructure, machine learning, and data analytics. It caters to those interested in pursuing careers in data science, business analytics, or related fields.

2. What are the prerequisites for enrolling in this course?
Basic programming skills (preferably in Python), a foundational understanding of mathematics and statistics, and familiarity with SQL are recommended. Prior exposure to data analytics or machine learning concepts is helpful but not mandatory.

3. Do I need prior experience with big data technologies like Hadoop or Spark?
No prior experience with big data tools is required. The course begins with foundational concepts and progresses to more advanced topics, ensuring all learners can follow along.

4. What programming languages and tools will be used in this course?
The course primarily uses Python and PySpark for programming and data analysis. It also covers tools for managing and analyzing data, such as Hadoop, Hive, SQL, and NoSQL databases.

5. Will there be hands-on projects or assignments?
Yes, the course includes practical assignments and real-world case studies to help you apply the concepts and tools covered in the curriculum. These projects span various domains, such as business analytics, healthcare, and industrial applications.