Introduction to data structure and algorithm (Part 1)

Introduction

Data Structure and Algorithm(DSA) ဟာ computer science ရဲ့ foundation ပါပဲ။ သင်ဟာ ဘယ် language ပဲရွေးချယ်ရွေးချယ် ဒီ data structures တွေ၊ ဒီ algorithms တွေကနေပြေးမလွတ်ပါဘူး။ ဒါကြောင့်လဲ ဒီဘာသာရပ်ဟာ programmer or developer တယောက်အနေနဲ့ career စတော့မယ်ဆိုရင် မဖြစ်မနေလေ့လာသင့်တဲ့ ဘာသာရပ်တွေထဲကတခုဖြစ်နေတာပါ။ CS student ဆိုရင်ပိုတောင်ဆိုးပါသေးတယ် မဖြစ်မနေသိထားရမယ့် ဘာသာရပ်ပါ။

‍

What is data structure?

Data Structure ဆိုတာ data ကို efficient ဖြစ်ဖြစ် အသုံးချဖို့ ဘယ်လို သိမ်းဆည်း မလဲဆိုတဲ့ နည်းတမျိုးပါပဲ။ အလွယ်ပြောရရင်တော့ program တခုထဲမှာ data တွေကို ဘယ်လို store and organize လုပ်ပြီး၊ အဲ့ data တွေကို ဘယ်လို manipulate လုပ်မယ်ဆိုပြီး သတ်မှတ်ထားတဲ့ နည်းပေါ့။ Data Structures တွေကအများကြီးရှိပါတယ်။ တခုနဲ့တခု လုပ်ဆောင်ရတဲ့ task မတူတဲ့အလျောက်၊ တခုချင်းစီမှာ unique ဖြစ်တဲ့ characteristics တွေရှိပါတယ်။

‍

Common data structures

အောက်မှာတော့အသုံးများတဲ့ data structures တွေကို အကျဉ်းချုပ်ဖော်ပြပေးထားပါတယ်။ Programmer တယောက်လုပ်တော့မယ်ဆို အနည်းဆုံး ဒါတွေနဲ့ familiar အောင်လုပ်ထားသင့်ပါတယ်။

Array

Array ကဝောာ့ programmer တွေ တော်တော်များများနဲ့ရင်းနှီးမယ်ထင်ပါတယ်။ Fundamental အကျဆုံး data structure ဆိုလဲမမှားပါဘူး။ Array ကို data type တူတဲ့ data တွေကို တနေရာထဲမှာ sequence အလိုက် store လုပ်ဖို့အတွက် သုံးပါတယ်။ ရထားတွဲကြီးလို့သတ်မှတ်လိုက်ပေါ့။ အဲ့ရထားတွဲကြီးမှာ နဂိုကသတ်မှတ်ထားတဲ့ အခန်းအရေအတွက်ပါတယ်။ အဲ့အခန်းတွေမှာ data တွေကို store လုပ်မယ်။ အခန်းနံပါတ် တွေကို programming term နဲ့ဆိုရင်တော့ index လို့ခေါ်ပါတယ်။ တခုသတိထားရမှာက index က 0 ကနေစပါတယ်။ အခန်းတခုစီမှာပါ၀င်တဲ့ data တွေကို index နဲ့ access လုပ်လို့ရပြီး၊ array function တွေနဲ့ manipulate လုပ်လို့ရပါတယ်။ Array function တွေကလဲအများကြီးပါပဲ။ Push, pop, reverse, length, forEach စသဖြင့် အဲ့ function တွေကအသုံးများပါတယ်။ Array တွေဟာ ပုံမှန် one dimensional array အပြင် multidimensional arrays တွေလဲရှိပါတယ်။ ဒါပေမယ့် two dimensional လောက်ပဲသုံးတာများပါတယ်။

String

String ကိုဝောာ့ မသိတဲ့ programmer မရှိလောက်ပါဘူး။ String ဟာ သူရဲ့ characters တွေကို sequence အလိုက်သိမ်းဆည်းတဲ့ structure အရ data structure တွေထဲကတခုအနေနဲ့သတ်မှတ်ပါတယ်။ အလွယ်ပြောရရင်တော့ string က one dimensional array ပါပဲ။

Linked list

Linked list ဟာ array လို linear data structure တခုပါပဲ။ ဒါပေမယ့် array နဲ့မတူတာက linked list ဟာ dynamic ဖြစ်ပါတယ်။ နောက်ပြီး linked list မှာ nodes တွေလဲပါပါတယ်။ node တခုချင်းစီထဲမှာ data နဲ့ pointer ပါပါတယ်။ အဲ့ pointer နဲ့ သူ့အနောက်က node ကို link လုပ်ပါတယ်။Singly Linked ListSingly linked list တွေမှာဆိုရင် node တခုစီမှာ pointer တခုပါ၀င်ပြီး နောက်က node ကို link လုပ်ထားပါတယ်။ အဲ့လို link လုပ်ခြင်းအားဖြင့် uni-directional chain ဖြစ်လာစေပါတယ်။ ဒါကြောင့်မို့ singly linked list တွေဟာ direction တခုပဲ traverse လုပ်နိုင်ပါတယ်။Doubly Linked ListDoubly linked list တွေမှာဆိုရင် node တခုစီမှာ pointers နှစ်ခုပါပါတယ်။ Pointer တခုက အရှေ့က node ကို link လုပ်ပြီး၊ ကျန် pointer က အနောက်က node ကို link လုပ်ခြင်းအားဖြင့် bi-directional chain ကိုဖြစ်ပေါ်စေပါတယ်။ ဆိုလိုချင်တာက doubly linked list မှာ nodes တွေက အရှေ့ကော၊အနောက်ကောချိတ်ဆက်ထားတဲ့အတွက် direction နှစ်ခုလုံး traverse လုပ်လို့ရပါတယ်။Circular Linked ListCircular linked list ဟာဆိုရင် doubly linked list လိုပါပဲ။ မတူတာကတော့ circular linked list မှာ နောက်ဆုံး node(tail) ဟာ ရှေ့ဆုံး node(head) နဲ့ link လုပ်ထားခြင်းဖြင့် closed loop တခုဖြစ်လာပါတယ်။Linked list တွေဟာ dynamic data structure တွေဖြစ်တဲ့အတွက် array ထက် efficient ပိုဖြစ်တယ်ဆိုပေမယ့် random access time မှာဆိုရင်တော့ head to tail traverse လုပ်ရတဲ့အတွက် array ထက်ပိုနှေးပါတယ်။

Queue

Queue က First-In-First-Out(𝗙𝗜𝗙𝗢) principle ကိုအသုံးပြုထားတဲ့ linear data structure တခုဖြစ်ပါတယ်။ ဆိုလိုချင်တာက head ကနေ ပထမဆုံး စထည့်လိုက်တဲ့ data က၊ tail ကနေ ပထမဆုံး ထွက်လာပါလိမ့်မယ်။ ရုပ်ရှင်ရုံမှာ ticket တန်းစီသလိုပေါ့။ အရင်ဆုံးတန်းစီတဲ့လူက အရင်ဆုံး ticket ၀ယ်ပြီး queue ကနေအရင်ဆုံးထွက်သွားမှာပါ။Queue မှာ enqueue နဲ့ dequeue ဆိုပြီး operations နှစ်ခုရှိပါတယ်။ Tail ကနေ data ထည့်တာကို enqueue လို့ခေါ်ပြီး head ကနေ data ထုတ်တာကို dequeue လို့ခေါ်ပါတယ်။ Web development အပိုင်းမှာဆိုရင်တော့ queue ကို server ထဲမှာ requests တွေ handling လုပ်တဲ့နေရာမှာသုံးပါတယ်။

Stack

Stack ကတော့ Last-In-First-Out(𝗟𝗜𝗙𝗢) principle ကိုသုံးတဲ့ linear data structure ပါ။ ဆိုလိုချင်တာက top ကနေ နောက်ဆုံးထည့်လိုက်တဲ့ data က အရင်ပြန်ထွက်လာမှာပါ။ တဖက်ပိတ်စလင်ဒါ တခုထဲမှာ ပန်းကန်တွေဆင့်ပြီး သိမ်းသလိုပါပဲ။ နောက်ဆုံးမှထည့်လိုက်တဲ့ကောင်က ထိပ်ဆုံးမှာရှိနေမှာဖြစ်တဲ့အတွက် ပြန်ထုတ်ရင် အဲ့ကောင်ကနေပဲ ပြန်ထုတ်ရမှာပါ။Stack မှာလဲ push နဲ့ pop ဆိုပြီး operations နှစ်ခုရှိပါတယ်။ Top ကနေ data ထည့်လိုက်တာကို push လို့ခေါ်ပြီး top ကနေ data ပြန်ထုတ်တာကို pop လို့ခေါ်ပါတယ်။Note: Queue ကော၊ stack ကော array or linked list သုံးပြီး implement လုပ်လို့ရပါတယ်။

Tree

Tree ဟာ hierarchical data structure အသုံးပြုထားတဲ့ data structure တခုပါ။ Tree data structure ကို nodes တွေ တခုနဲ့တခု ချိတ်ဆက်ပြီး တည်ဆောက်ထားတာဖြစ်ပါတယ်။ သစ်ကိုင်းတွေကိုပြေးမြင်လိုက်ပါ။ ပထမဆုံး root node ကနေ nodes နှစ်ခုထွက်လာမယ်၊ အဲ့ nodes နှစ်ခုကနေတဆင့် သစ်ကိုင်းတွေဖြာထွက်သလိုပုံစံနဲ့ တခုနဲ့ တခု ချိတ်ဆက်ပြီး tree structure ဖြစ်လာတာပါ။ Hierarchical data structure ဖြစ်တဲ့အတွက် root node ကြီးကလွဲရင် ကျန်တဲ့ node တွေမှာ parent-child relationship ရှိပါတယ်။Tree data structure မှာတွေ့ရတတ်တဲ့ node တွေ‌ကတော့

- Root Node

- Parent Node

- Child Node

- Sibling Node

- Leaf Node

Internal Nodeစတာတွေပဲဖြစ်ပါတယ်။

Heap

Heap က specialized လုပ်ထားတဲ့ tree-based data structure တခုဖြစ်ပါတယ်။ Binary heap အနေနဲ့အတွေ့များပါတယ်။ Heap မှာ min-heap နဲ့ max-heap ဆိုပြီးနှစ်မျိုးရှိပါတယ်။Min-heap မှာဆိုရင် parent node ဟာ child nodes တွေထက် value ပိုနည်းပြီး၊ max-heap မှာဆိုရင်တော့ parent node ဟာ child nodes တွေထက် value ပိုများပါတယ်။ ရှင်းအောင်ပြောရရင်တော့ tree data structure လိုပါပဲ၊ ဒါပေမယ့် min-heap or max-heap ပေါ်မူတည်ပြီး root ကနေ branches တွေအထိ values တွေဟာ ငယ်စဉ်ကြီးလိုက် or ကြီးစဉ်ငယ်လိုက်ကွဲသွားပါလိမ့်မယ်။

Hash table

Hash table(Hash map) ဆိုတာ key-value pairing ကိုသုံးပြီး store လုပ်တဲ့ data structure တခုဖြစ်ပါတယ်။ မျက်လုံးထဲမြင်အောင်ပြောရရင်တော့ keys တွေရယ်၊ hash table ရယ်ရှိမယ်၊ hash table ထဲမှာ values တွေနဲ့ အဲ့ values တွေရဲ့ index တွေပါမယ်။ Values တွေကို access လုပ်ဖို့ အဲ့ values တွေရဲ့ index တွေနဲ့၊ သက်ဆိုင်တဲ့ key တခုချင်းစီကို hash function သုံးပြီး map လုပ်ထားတယ်။ အဲ့တာကြောင့်လဲ key-value pairing လို့ခေါ်တာပါ။

Graph

Graph data structure ဟာ nodes တွေ edges တွေနဲ့ တည်ဆောက်ထားတဲ့ non-linear data structure တခုဖြစ်ပါတယ်။ Nodes တွေကို vertices တွေလို့လဲခေါ်ပြီး edges တွေကတော့ အဲ့ nodes တွေကို တခုခုနဲ့ချိတ်ဆက်ပေးထားတဲ့ line တွေပဲဖြစ်ပါတယ်။ Graph တွေကို data points တွေကြားထဲက connections နဲ့ relationships တွေကိုဖော်ပြဖို့သုံးပါတယ်။Graph မှာလဲ directed graph, indirected graph, weighted graph, acyclic graph စသည်ဖြင့် အမျိုးအစားအများကြီးရှိပါသေးတယ်။ Graph တွေဟာ complex ဖြစ်တဲ့ problems တွေကို solve လုပ်ဖို့အလွန်အရေးပါပါတယ်။ Social networks(e.g. Facebook)၊ recommendation systems၊ နောက်ဆုံး search engines တွေကအစ graph data structure ကိုအသုံးပြုထားတာပါ။

‍

Summary

Part 1 အနေနဲ့ကတော့ ဒီလောက်ပါပဲ။ ဒီလောက်ဆိုရင် data structures တွေအကြောင်း အပေါ်ယံ cover ဖြစ်ပါပြီ။ အပေါ်က ဖော်ပြခဲ့တဲ့ data structures တွေဟာ အသုံးအများဆုံး နဲ့ အရေးအပါဆုံး data structures တွေပါပဲ။ ဒီ data structures တွေဟာ ကျွန်တော်တို့ day-to-day အသုံးပြုနေတဲ့ software တွေ၊ application တွေထဲမှာပါ၀င်တဲ့ algorithms တွေကိုဖန်တီးဖို့ အဓိကကျတဲ့အခန်းကဏ္ဍ ကနေပါ၀င်ပါတယ်။ ဒီ data structures တွေကိုနားလည်ထားတာက ကိုယ့် career မှာ တစိတ်တပိုင်းက‌နေအရေးပါလာမှာဖြစ်တဲ့အတွက် programmer တယောက်အနေနဲ့ ရင်းနှီးထားသင့်ပါတယ်။