Big Data Analytics and Data Mining

KMUTNB

About This Course

This course explores the principles, technologies, and applications essential to big data and machine learning. Designed for students and professionals aiming to excel in data-driven industries, the curriculum covers foundational topics in big data infrastructure and ecosystems, including Hadoop, HDFS, MapReduce, Hive, and Spark, with a focus on programming in PySpark for distributed data processing. The course integrates database concepts with practical applications, emphasizing SQL and NoSQL for managing structured and unstructured data.

Participants will delve into data mining techniques, exploring the end-to-end process of extracting valuable insights from vast datasets. The course emphasizes critical concepts like data normalization, dimensionality reduction, and data visualization, equipping learners with tools to prepare and interpret complex datasets effectively. Machine learning is central to the curriculum, covering supervised and unsupervised learning techniques. Key algorithms, such as decision trees, artificial neural networks, and K-means clustering, will be studied and applied to solve practical problems.

The course also highlights advanced methods for model evaluation, ensuring participants can assess and improve the performance of machine learning models. Through hands-on programming exercises, learners will develop expertise in Python and other tools for data exploration and analysis.

A distinctive feature of the course is its emphasis on real-world applications through case studies. Participants will apply their knowledge to domains such as business analytics, healthcare, and industrial processes, gaining valuable experience in crafting data-driven solutions. By the end of the course, learners will have a solid grasp of the technologies, programming skills, and analytical techniques necessary for a successful career in big data analytics and machine learning.

This course is ideal for students, data enthusiasts, and professionals seeking to enhance their understanding of big data ecosystems and leverage machine learning to address complex challenges in today’s data-centric world.

Requirements

Familiarity with Python or other programming languages is highly recommended, as the course involves significant hands-on coding exercises, especially with Spark, PySpark, and data mining algorithms.

Course Staff

Phayung Meesad

Associate Professor Dr. Phayung Meesad is a distinguished academic and researcher in Computer Science and Engineering. He currently teaches at the Faculty of Information Technology and Digital Innovation, King Mongkut's University of Technology North Bangkok (KMUTNB).

Academic Background

Dr. Meesad completed his doctoral studies in Electrical Engineering at Oklahoma State University, USA, in 2002. Before this, he obtained his Master of Science in Electrical Engineering from the same institution in 1998. His foundational education includes a Bachelor's degree in Technical Education from King Mongkut's Institute of Technology North Bangkok.

Research Interests

His research focuses on several key areas:

  • Artificial Intelligence
  • Business Intelligence
  • Computational Intelligence
  • Data Analytics
  • Data Mining
  • Fuzzy Systems
  • Image Processing
  • Machine Learning
  • Natural Language Processing
  • Stock Price Prediction
  • Time Series Prediction

Professional Activities

At KMUTNB, he actively participates in:

  • Teaching graduate and undergraduate courses
  • Supervising research students
  • Contributing to academic publications
  • Leading research projects
  • Collaborating with international researchers

His work continues to influence computer science and engineering, particularly in Thailand and the broader Asian academic community.

Other FAQ

1. Who is this course designed for?
This course is ideal for students, data enthusiasts, and professionals seeking to deepen their knowledge of big data infrastructure, machine learning, and data analytics. It caters to those interested in pursuing careers in data science, business analytics, or related fields.

2. What are the prerequisites for enrolling in this course?
Basic programming skills (preferably in Python), a foundational understanding of mathematics and statistics, and familiarity with SQL are recommended. Prior exposure to data analytics or machine learning concepts is helpful but not mandatory.

3. Do I need prior experience with big data technologies like Hadoop or Spark?
No prior experience with big data tools is required. The course begins with foundational concepts and progresses to more advanced topics, ensuring all learners can follow along.

4. What programming languages and tools will be used in this course?
The course primarily uses Python and PySpark for programming and data analysis. It also covers tools for managing and analyzing data, such as Hadoop, Hive, and SQL and NoSQL databases.

5. Will there be hands-on projects or assignments?
Yes, the course includes practical assignments and real-world case studies to help you apply the concepts and tools covered in the curriculum. These projects span various domains, such as business analytics, healthcare, and industrial applications.

6. How is the course delivered?
Depending on the institution or platform, the course can be delivered online, in person, or as a hybrid program. It includes a mix of lectures, hands-on labs, and project work.

7. How long does the course take to complete?
The duration of the course depends on the institution or platform offering it. Typically, it ranges from 8 to 16 weeks, depending on the depth and intensity of the material.

8. What kind of certification will I receive upon completion?
Depending on the course provider, participants will receive a certificate of completion or achievement. Certifications may highlight proficiency in big data tools, programming, and machine learning.

9. Will this course prepare me for a career in data analytics or data science?
Yes, this course equips you with in-demand skills like data mining, big data processing, machine learning, and data visualization, which are critical for roles in data analytics, data science, and related fields.

10. What support is available for participants during the course?
Participants will have access to course materials, discussion forums, instructor support, and additional resources for troubleshooting and further learning.

11. What types of datasets will we work with?
The course uses a variety of datasets, including structured, unstructured, and semi-structured data, to provide hands-on experience with real-world data challenges.

12. Are there any assessments or exams in the course?
The course includes quizzes, assignments, and project evaluations to assess your understanding and application of the concepts.

13. Can I take this course while working full-time?
Yes, the course is designed to accommodate working professionals. The flexible schedule allows participants to learn at their own pace, especially in online or part-time formats.

14. How does this course differ from other data analytics or machine learning courses?
This course integrates big data infrastructure, programming, and machine learning with hands-on projects, focusing on both theoretical understanding and practical applications across diverse industries.

15. Can I access course materials after completing the course?
Access policies depend on the course provider. Many institutions offer lifetime or extended access to course materials for reference after completion.