ML 0. Statistical Analysis | Statistical Inference and Statistical Testing
·
Data Science/ML
Statistical Analysis
1. Statistics and descriptive statistics
2. Probability distributions
3. Statistical inference and statistical testing ← today's topic!
Want to read parts 1 and 2 first? https://heesleisure.tistory.com/28 (ML 0. Statistical Analysis)
3. Statistical inference and statistical testing
1) Descriptive statistics and inferential statistics
- Descriptive statistics: methods and techniques for identifying the characteristics of data by organizing, presenting, summarizing, and interpreting statistical data collected through measurement or experiment
- Inferential statistics: a population..
CNN(Convolutional Neural Network)
·
Data Science/DL
CNN (Convolutional Neural Network)
Repeats filtering via convolution and max pooling to distill the features that really matter, then classifies the result.
Process: Feature extraction + Classification
- Feature extraction: convolutional layer + max pooling; extracts the features
- Classification: fully-connected layer; decides what the object is
Before neural networks?
Adaptive boosting (AdaBoost): takes filters predefined by a person, varies their size and shape and rotates them, and runs the computation over the red bounding-box region, sliding from the top-left..
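The feature extraction + classification pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a trained model: the 6x6 input, the vertical-edge kernel, the ReLU step, and the random fully-connected weights are all assumptions made for the example, not values from the post.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Hypothetical input: a 6x6 "image" whose values grow from left to right.
image = np.arange(36, dtype=float).reshape(6, 6)
# Hypothetical hand-written filter that responds to left-to-right increases.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

# Feature extraction: convolution -> ReLU (assumed activation) -> max pooling.
features = max_pool2d(np.maximum(conv2d(image, kernel), 0.0))

# Classification: one fully-connected layer with random (untrained) weights,
# followed by softmax over 3 made-up classes.
rng = np.random.default_rng(0)
weights = rng.normal(size=(features.size, 3))
logits = features.ravel() @ weights
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(features.shape, probs)
```

In a real CNN the kernel values and the fully-connected weights are learned from data; here they are fixed by hand only to show how the shapes flow from image to class scores.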
ML 0. Statistical Analysis
·
Data Science/ML
Statistical Analysis
1. Statistics and descriptive statistics
2. Probability distributions
3. Statistical inference and statistical testing
1. Descriptive statistics
Methods for organizing, summarizing, and describing the properties of data using specific statistics.
- Measures of central tendency: the main descriptive statistics that show where the data is centered: arithmetic mean, median, mode
- Measures of dispersion: descriptive statistics that describe how spread out the data is: range, variance, standard deviation, interquartile range (IQR)
- Shape of distribution: describes the form in which the data is spread: frequency distribution, skewness (degree of asymmetry), kurtosis (how peaked or flat)
1) Measures of dispersion
- Range: the difference between the maximum and minimum values
- Sum of squares: deviations from the mean always add up to 0, which is a dilemma → MSE (square, then sum) / MAE (sum of absolute values). Well-known examples are the standard deviation, ..
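The dispersion measures listed above can be checked on a small sample. The sketch below uses only Python's standard library; the data values are made up for illustration and the variable names are not from the post.

```python
import statistics

# Illustrative sample (not from the original post).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
deviations = [x - mean for x in data]

# Raw deviations cancel out, so they cannot measure spread by themselves.
sum_of_deviations = sum(deviations)

# Range: maximum minus minimum.
value_range = max(data) - min(data)

# Squaring (or taking absolute values) removes the sign before summing.
sum_of_squares = sum(d ** 2 for d in deviations)
variance = sum_of_squares / len(data)          # population variance
std_dev = variance ** 0.5                      # population standard deviation
mean_abs_dev = sum(abs(d) for d in deviations) / len(data)

# Interquartile range: distance between the first and third quartiles.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(sum_of_deviations, value_range, sum_of_squares, variance, std_dev, mean_abs_dev, iqr)
```

Note that `statistics.quantiles` supports more than one interpolation method, so the IQR of a small sample can differ slightly between tools.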
Algorithm Study 0. Time Complexity and Big O
·
Data Science/coding practice
1. What is time complexity?
Running time is the number of steps needed to execute a function/algorithm.
Assume the number of steps needed to execute each line is a constant.
T(N) = c1 + c2*(N+1) + c3*N + 1 = (c2+c3)*N + c1 + c2 + 1 = a*N + b
The running time for small N is not meaningful; what matters is the behavior as N goes to infinity.
As N grows, drop the terms that matter less (drop b).
Only the highest-order term stays meaningful (here, N), and its coefficient (a) is not meaningful either (drop a). → only N remains
T(N) = Θ(N) (Big Theta) → asymptotic notation obtained from asymptotic analysis
Time complexity, then, expresses a function's running time in asymptotic notation, derived through asymptotic analysis.
Time..
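The step count T(N) = c1 + c2*(N+1) + c3*N + 1 can be made concrete by instrumenting a simple loop. The function below is a hypothetical example (the post's original code is not shown in the excerpt) and assumes every constant c1, c2, c3 equals one step, which gives T(N) = 2N + 3.

```python
def sum_with_step_count(values):
    """Sum a list while counting steps, assuming each line costs one step."""
    steps = 0
    total = 0
    steps += 1              # c1: initialization runs once
    i = 0
    while True:
        steps += 1          # c2: the loop condition is checked N + 1 times
        if i >= len(values):
            break
        total += values[i]
        steps += 1          # c3: the loop body runs N times
        i += 1
    steps += 1              # the trailing "+ 1": the return
    return total, steps

# As N grows, steps / N settles toward the leading coefficient a = c2 + c3,
# and the constant b = c1 + c2 + 1 stops mattering.
for n in (1, 10, 100, 1000):
    _, steps = sum_with_step_count(list(range(n)))
    print(n, steps, steps / n)
```

Dropping both b and a leaves only the growth rate in N, which is what the Θ(N) notation records.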
ML 2. Logistic Regression
·
Data Science/ML
1. Logistic Regression: concept
2. Confusion Matrix and AUROC
3. Multiclass Classification
1. Logistic Regression: concept
Takes continuous data as input and, via the sigmoid function, produces discrete (categorical) output, e.g. binary classification.
Linear Regression + Logistic Function
In other words, it is a regression model used when the target is categorical.
Like ordinary linear/nonlinear regression models, it takes continuous data as input.
https://www.youtube.com/watch?v=14eTDPJLkis
2. Con..
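The "linear regression + logistic function" idea can be sketched as follows: compute a linear score, pass it through the sigmoid to get a probability, then threshold it into a class. The weights, bias, and 0.5 threshold below are hypothetical values chosen for illustration; a real model would learn the weights from data.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(x, weights, bias):
    """Linear part (w . x + b) followed by the sigmoid."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def predict(x, weights, bias, threshold=0.5):
    """Turn the continuous probability into a binary class label."""
    return 1 if predict_proba(x, weights, bias) >= threshold else 0

# Hypothetical, untrained parameters.
weights = [1.0, -0.5]
bias = -0.5

for x in ([2.0, 1.0], [0.0, 1.0]):
    print(x, predict_proba(x, weights, bias), predict(x, weights, bias))
```

The input stays continuous, as in ordinary regression; only the final thresholding step makes the output categorical.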
Python numpy: what does -1 mean in reshape(-1, n)?
·
Data Science/Python Basics
While watching an online lecture on regression, I came across reshape(-1, 1) and wondered why -1 is there, so I looked it up.
Conclusion first: -1 is a placeholder, much like a variable n, for a dimension whose size is not fixed. Given the number of columns that follows, numpy arranges the matrix so that no element is left out.
ex) a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
a.shape >> (3, 4)
a.reshape(-1, 12) >> 1 row, so that there are 12 columns
a.reshape(-1, 6): there must be 6 columns, so a 2x6 matrix
a.reshape(-1, 2): there must be 2 columns, so a 6x2 matrix
a.reshape(-1, 12): there must be 12 columns, so a 1x12 matrix
1 3 5 7 9 11 ..
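The cases above can be run directly. The sketch below reuses the array `a` from the post; the `reshape(-1, 5)` failure case is an added illustration of what happens when the element count is not divisible by the requested column count.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(a.shape)                   # (3, 4)

# -1 asks numpy to infer that dimension from the total element count (12).
print(a.reshape(-1, 12).shape)   # (1, 12)
print(a.reshape(-1, 6).shape)    # (2, 6)
print(a.reshape(-1, 2).shape)    # (6, 2)
print(a.reshape(-1, 1).shape)    # (12, 1), the column-vector form seen in the lecture

# If 12 elements cannot be split evenly into the requested columns, numpy refuses.
try:
    a.reshape(-1, 5)
except ValueError as err:
    print("reshape failed:", err)
```

Only one dimension may be given as -1 in a single reshape call, since numpy needs the remaining sizes to work out the missing one.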
GitHub
·
Uncategorized
https://zzsza.github.io/category/data/ (Data: a space for posts related to data)
A GitHub blog that has helped me a lot. Let's work hard... really...
2022.11.21 I created my own GitHub. Still a complete beginner, but I'll keep at it!
https://github.com/Keemjaehee
https://www.youtube.com/watch?v=lelVripbt2M The best video on initial GitHub setup, Korean version
Data Scientist
·
Idea/Field Exploration
https://github.com/Team-Neighborhood/I-want-to-study-Data-Science/wiki/%EB%8D%B0%EC%9D%B4%ED%84%B0-%EC%82%AC%EC%9D%B4%EC%96%B8%ED%8B%B0%EC%8A%A4%ED%8A%B8
GitHub - Team-Neighborhood/I-want-to-study-Data-Science: a guide for people who want to study data science
What is a data scientist..