How Does AI Play Games? Unlocking the Secrets of Game Trees!

Hello there! How's everything going? Today, we are going to dive into a topic in Artificial Intelligence (AI) that instantly brings video game screens to mind! Have you ever wondered—how exactly does AI play games? Whether it's Chess or Ludo, how does it figure out its next move? Grab a notebook and let's unravel this mystery together like a fun story!

Why Do We Even Study Games in AI?

You might wonder why scientists spend so much time on games instead of 'serious' stuff. The reasons are actually super interesting!

Symbol of Intelligence: Since ancient times, playing games has been considered a huge sign of intelligence. If you're good at games, you're smart!
Simple Formulas: Real-world problems are messy and complicated. But game rules are strictly defined, making them much easier to formalize.
Models of Real Life: Many real-life competitive or cooperative tasks—like business negotiations, auctions, or even military strategies—function almost exactly like games!

The Origins of Game AI

Game AI isn't just a modern trend; it has a history spanning over a century:

1912: Ernst Zermelo first started working on the Minimax algorithm.
1949: Claude Shannon wrote a famous paper on evaluation functions and search techniques for Chess.
1956: John McCarthy introduced Alpha-beta search. That same year, Arthur Samuel built a checkers program that learned from its own mistakes by playing against itself! (The dawn of Machine Learning!)

What Do Game Environments Look Like?

Not all game worlds are the same. We can divide them into a few key categories:

Deterministic vs. Stochastic: In games like Chess, there's no room for luck; everything depends on your specific moves (Deterministic). But in Ludo or Monopoly? The roll of the dice brings luck into play (Stochastic)!
Perfect vs. Imperfect Information: In Chess or Tic-Tac-Toe, both players can see the entire board (Perfect Information). But in Poker or Bridge, you can't see your opponent's cards—there are hidden secrets (Imperfect Information).

Alternating Two-Player Zero-Sum Games

We are mainly talking about games where two players take turns. And the fun catch? One player's gain is exactly equal to the other player's loss! This is what we call a Zero-sum game.

How it works: The end of the game is called a Terminal State, with a score attached (+1 for a win, -1 for a loss, 0 for a draw).
Why it matters: The absolute best outcome for me is the absolute worst for my opponent! So, the AI must play in a way that makes it impossible for the opponent to win.

Game Trees: The Map of Possibilities!

What is it? A Game Tree is like a map or a directed graph that visualizes every possible state and move in a game.
How it works: Each node represents a specific state of the board, and the connections (edges) show the legal moves. The very bottom nodes are Terminal nodes—where you find out if you won, lost, or drew! Think of Tic-Tac-Toe. The board starts empty. You (MAX) make a move, then your friend (MIN) moves. Layer by layer, the tree grows!

Minimax Search Algorithm (Finding the Best in the Worst Case!)

What is it? It's a decision-making algorithm where you are MAX (trying to get the highest score) and the opponent is MIN (trying to give you the lowest score).
How it works: The algorithm goes all the way down to the bottom of the tree, looks at the final scores, and backtracks to make a decision. MAX always picks the highest value from its children, while MIN always picks the lowest.
The Magic: It assumes your opponent is a genius who will play perfectly. So, Minimax finds the move that guarantees the 'best worst-case scenario' for you!

Alpha-Beta Pruning (Trimming the Useless Branches!)

What is it? Minimax is great, but checking every single node takes forever. Enter its optimized version: Alpha-Beta Pruning.
How it works: Instead of exploring the whole tree, it simply cuts off (prunes) branches that won't affect the final decision. It uses two parameters: Alpha (MAX's best choice so far) and Beta (MIN's best choice so far).
The Trick: If it realizes exploring a node is pointless (when Alpha $\ge$ Beta), it skips all its children. The result is exactly as perfect as Minimax, but way faster!

Monte Carlo Tree Search (MCTS) - The Brain Behind AlphaGo!

When a game tree is unimaginably huge (like in the game of Go), Minimax and Alpha-beta tap out. That's when MCTS enters like a superhero! It works in 4 repeating phases:

Selection: Uses a formula called UCB (Upper Confidence Bound) to pick the most promising child node.
Expansion: Goes one level deeper from the selected node to create new nodes.
Simulation: Plays the game out entirely at random from the new node to see if it leads to a win or loss!
Backpropagation: Takes the result of that simulation (like +1 for a win) and passes it all the way back up the tree to the root.

Pros: No complex formulas needed—just knowing the game rules and doing random simulations works wonders! It also focuses only on the most promising paths, saving time and memory. Cons: If there is a highly specific, subtle trap move, the random simulations might just miss it!

Mind-blowing, right? Knowing the massive math and algorithms running behind the scenes makes playing games even more exciting! I hope the magical world of Game AI feels much clearer to you now. Let me know if this sparked your curiosity to learn more!

হ্যালো! কী অবস্থা সবার? আজকে আমরা আর্টিফিশিয়াল ইন্টেলিজেন্স বা AI-এর এমন একটা টপিক নিয়ে আড্ডা দেবো, যেটা শুনলেই চোখের সামনে ভিডিও গেমের স্ক্রিন ভেসে ওঠে! আচ্ছা, কখনো কি ভেবে দেখেছ—AI কীভাবে গেম খেলে? দাবা বা লুডু খেলার সময় সে কীভাবে বোঝে যে এরপর কোন চালটা দিতে হবে? চলো, খাতা-কলম নিয়ে রেডি হয়ে যাও, একদম গল্পের ছলে আজ এই রহস্যটা ফাঁস করব!

কেন আমরা AI-তে গেম নিয়ে মাথা ঘামাই?

ভাবতে পারো, বিজ্ঞানীরা এত সিরিয়াস কাজ বাদ দিয়ে গেম নিয়ে কেন পড়ে আছেন? কারণগুলো কিন্তু বেশ ইন্টারেস্টিং!

বুদ্ধিমত্তার প্রতীক: সেই প্রাচীনকাল থেকেই গেম খেলাকে বুদ্ধির একটা বড় লক্ষণ ধরা হয়। যে ভালো গেম খেলে, সে বুদ্ধিমান—এমন একটা ধারণা তো আছেই!
সহজ ফর্মুলা: বাস্তব দুনিয়ার সমস্যাগুলো অনেক জট পাকানো। কিন্তু গেমের নিয়মকানুনগুলো একদম ছকে বাঁধা, তাই এগুলোকে ফর্মুলায় ফেলা সহজ।
বাস্তব জীবনের মডেল: আমাদের রিয়েল-লাইফের অনেক কাজ—যেমন বিজনেস নেগোসিয়েশন, নিলাম বা ধরো মিলিটারি যুদ্ধ—সবগুলোই কিন্তু অনেকটা গেমের মতোই কাজ করে!

গেম AI-এর ইতিকথা

গেম AI কিন্তু আজকের কোনো নতুন ট্রেন্ড নয়! এর পেছনে আছে এক শতাব্দীরও পুরোনো ইতিহাস:

১৯১২ সাল: আর্নস্ট জারমেলো প্রথম Minimax অ্যালগরিদম নিয়ে কাজ শুরু করেন।
১৯৪৯ সাল: ক্লড শ্যানন দাবার জন্য ইভ্যালুয়েশন ফাংশন ও সার্চ টেকনিক নিয়ে এক দারুণ পেপার লেখেন।
১৯৫৬ সাল: জন ম্যাককার্থি নিয়ে আসেন Alpha-beta search। আর একই বছর আর্থার স্যামুয়েল এমন একটা চেকার্স প্রোগ্রাম বানান, যেটা নিজে নিজেই খেলে নিজের ভুল থেকে শিখতে পারত! (ভাবা যায়, মেশিন লার্নিংয়ের একদম আদি রূপ!)

গেমের দুনিয়াটা কেমন হয়?

সব গেমের পরিবেশ কিন্তু এক রকম নয়। গেমের দুনিয়াকে মূলত দুটি ভাগে ভাগ করা যায়:

Deterministic (সবকিছু নির্দিষ্ট) vs Stochastic (ভাগ্য বা অনিশ্চয়তা): দাবার মতো খেলায় ভাগ্যের কোনো জায়গা নেই, তোমার চালের ওপরই সব নির্ভর করে (Deterministic)। কিন্তু লুডু বা মনোপলিতে? ওই যে ছক্কার চাল—ভাগ্যের ওপর অনেক কিছু নির্ভর করে (Stochastic)!
Perfect vs Imperfect information: দাবা বা টিক-ট্যাক-টো খেলায় বোর্ডের পুরো অবস্থা দুজন খেলোয়াড়ই দেখতে পায় (Perfect information)। কিন্তু পোকার বা ব্রিজ খেলায় প্রতিপক্ষের কার্ড কি দেখা যায়? মোটেই না! লুকানো তথ্য থাকে (Imperfect information)।

Alternating Two-Player Zero-Sum Games (একের লাভ, অন্যের ক্ষতি!)

আমরা মূলত এমন গেম নিয়ে কথা বলব যেখানে দুজন খেলোয়াড় পর্যায়ক্রমে চাল দেয়। আর মজার ব্যাপার হলো—একজনের লাভ মানেই অন্যজনের ঠিক সমপরিমাণ ক্ষতি! একেই বলে Zero-sum game।

কীভাবে কাজ করে? গেমের শেষ অবস্থাকে বলে Terminal state। জিতলে +1, হারলে -1, ড্র হলে 0—এরকম একটা স্কোর থাকে।
কেন এটা গুরুত্বপূর্ণ? আমার জন্য যেটা সবচেয়ে বেশি লাভজনক, আমার প্রতিপক্ষের জন্য সেটাই সবচেয়ে বড় লস! তাই AI-কে এমনভাবে চাল দিতে হয় যেন প্রতিপক্ষ তাকে কোনোভাবেই হারাতে না পারে।

Game Tree: গেমের ম্যাপ!

এটা আসলে কী? গেম ট্রি হলো একটা ম্যাপ বা ডিরেক্টেড গ্রাফের মতো, যা একটা গেমের সম্ভাব্য সব অবস্থা এবং চালের প্রবাহকে দেখায়।
কীভাবে কাজ করে? এর প্রতিটি নোড (Node) গেমের একটা নির্দিষ্ট অবস্থাকে বোঝায়, আর সংযোগগুলো (Edge) দেখায় পরবর্তী বৈধ চালগুলোকে। ট্রির একদম শেষ নোডগুলোকে বলে Terminal nodes—যেখানে লেখা থাকে তুমি জিতলে, হারলে নাকি ড্র করলে! কাটাকুটি খেলার (Tic-tac-toe) কথা ভাবো। শুরুতে বোর্ড ফাঁকা। এরপর তুমি (MAX) একটা চাল দিলে, তারপর তোমার বন্ধু (MIN) চাল দিল। এভাবে ট্রি-টা বড় হতে থাকে!

Minimax Search Algorithm (সবচেয়ে খারাপ পরিস্থিতিতেও সেরা চাল!)

এটা কী? এটা হলো ডিসিশন মেকিংয়ের একটা অ্যালগরিদম। এখানে তুমি হলে MAX (যে নিজের স্কোর বাড়াতে চায়) আর প্রতিপক্ষ হলো MIN (যে তোমার স্কোর কমাতে চায়)।
কীভাবে কাজ করে? এই অ্যালগরিদমটা পুরো গেম ট্রির একদম শেষ পর্যন্ত নেমে যায়, আর নিচের ভ্যালুগুলো দেখে ওপরে উঠতে উঠতে সিদ্ধান্ত নেয়। MAX সবসময় বড় মানটা খোঁজে, আর MIN খোঁজে ছোট মানটা।
মজাটা কোথায়? এটা ধরে নেয় যে তোমার প্রতিপক্ষও একজন সাংঘাতিক চালাক খেলোয়াড় এবং সে একদম নিখুঁত চাল দেবে। তাই Minimax তোমাকে এমন একটা চাল খুঁজে দেয়, যেটা 'সবচেয়ে খারাপ পরিস্থিতিতেও সবচেয়ে ভালো ফলাফল' নিশ্চিত করে!

Alpha-Beta Pruning (অপ্রয়োজনীয় ডালপালা ছেঁটে ফেলা!)

এটা কী? Minimax তো ভালো, কিন্তু সব নোড চেক করতে গেলে অনেক সময় লাগে। তাই এর একটা অপ্টিমাইজড ভার্সন হলো Alpha-Beta Pruning।
কীভাবে কাজ করে? ট্রির সব নোড চেক না করে, যে নোডগুলো ফাইনাল সিদ্ধান্তে কোনো প্রভাব ফেলবে না, সেগুলোকে ছেঁটে (Pruning) ফেলা হয়। এর জন্য Alpha (MAX-এর সেরা চয়েস) এবং Beta (MIN-এর সেরা চয়েস) নামের দুটো প্যারামিটার ব্যবহার করা হয়।
ম্যাজিকটা কোথায়? যখনই দেখা যায় একটা নোড চেক করা অর্থহীন (শর্ত: Alpha >= Beta), তখনই তার নিচের সব ডালপালা ছেঁটে দেওয়া হয়। রেজাল্ট আসে একদম Minimax-এর মতোই নিখুঁত, কিন্তু অনেক দ্রুত!

Monte Carlo Tree Search (MCTS) - AlphaGo-এর পেছনের কারিগর!

যখন গেম ট্রি বিশাল আকার ধারণ করে (যেমন Go গেম), তখন Minimax বা Alpha-beta দিয়েও কুলানো যায় না। তখন সুপারহিরোর মতো এন্ট্রি নেয় MCTS! এটা মূলত ৪টি ধাপে কাজ করে:

Selection (নির্বাচন): UCB (Upper Confidence Bound) ফর্মুলা ব্যবহার করে সবচেয়ে সম্ভাবনাময় (Promising) চাইল্ড নোড সিলেক্ট করা হয়।
Expansion (সম্প্রসারণ): সিলেক্টেড নোড থেকে এক লেভেল গভীরে গিয়ে নতুন নোড তৈরি করা হয়।
Simulation (সিমুলেশন): নতুন নোড থেকে একদম র‍্যান্ডমলি (র‍্যান্ডম চাল দিয়ে) গেমটি শেষ পর্যন্ত খেলা হয়। দেখা হয় জয় এল নাকি পরাজয়!
Backpropagation (পশ্চাদগামিতা): সিমুলেশনের যে রেজাল্ট পাওয়া গেল (যেমন জয় পেলে +1), সেটা ওপরের সব নোড হয়ে একদম রুট পর্যন্ত আপডেট করে দেওয়া হয়।

সুবিধা: এর জন্য কোনো জটিল ফর্মুলার দরকার নেই, গেমের নিয়ম জানলেই র‍্যান্ডম সিমুলেশন করে দারুণ কাজ করতে পারে। আর শুধু ভালো পথগুলোতেই বেশি ফোকাস করে বলে সময় বাঁচে। অসুবিধা: যদি কোনো গুরুত্বপূর্ণ চাল খুব সূক্ষ্ম হয়, র‍্যান্ডম সিমুলেশনের কারণে MCTS হয়তো সেই 'ট্র্যাপ' বা ফাঁদটা মিস করে যেতে পারে!

কী দারুণ, তাই না? গেম খেলার পেছনে যে এত অঙ্ক আর অ্যালগরিদম লুকিয়ে আছে, তা জানলে গেম খেলার মজাই তো বেড়ে যায় বহুগুণ! আশা করি গেম AI-এর এই দারুণ দুনিয়াটা তোমার কাছে এখন অনেক পরিষ্কার। আরও কিছু জানার কিউরিওসিটি জাগলে কিন্তু বলতেই পারো!

How Does AI Play Games? Unlocking the Secrets of Game Trees! AI কীভাবে গেম খেলে? চলো গেম ট্রি আর ডিসিশন মেকিংয়ের গোপন রহস্য ভেদ করি!