
๋ฒ ์ด์ง€์•ˆ ํ†ต๊ณ„๋กœ ๋ณด๋Š” 2022 ๋Œ€์„  ๊ฒฐ๊ณผ ์˜ˆ์ธก (Bayesian Hierarchical Modeling)

guava 2023. 1. 28. 22:57

This post covers

(1) The concept of Bayesian hierarchical modeling

(2) A use case of Bayesian hierarchical modeling: predicting the result of the 2022 presidential election

 


Chapter #1 - The Concept of Bayesian Hierarchical Modeling

What is Bayesian hierarchical modeling, and why use it?


Hierarchical modeling is a methodology that estimates the parameters of interest by organizing them into several levels.

When it is combined with the Bayesian approach, which uses Bayes' theorem to assume a prior distribution and estimate a posterior distribution, it is called Bayesian hierarchical modeling. It is typically used to combine the results of experiments or polls that were run several times on different groups. Combining several surveys should give more reliable data than relying on a single one.

 

For example, suppose a lab has developed a new drug that may lower the probability of developing cancer.

Because cancer can occur whether or not the drug is given, we first need to know the probability of developing cancer when no treatment is applied at all.

So, to find out how likely cancer is without any treatment, several experiments were run on rats.

* Data source: Bayesian Data Analysis - Rat Tumor

 

์ด 71๋ฒˆ์˜ ์„œ๋กœ ๋…๋ฆฝ์ ์ธ ์‹คํ—˜์„ ์ง„ํ–‰ ํ•˜์˜€๊ณ , ์‹คํ—˜ ๊ฒฐ๊ณผ๋Š” ์•„๋ž˜์™€ ๊ฐ™์ด ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค.

 

  • $ y_j $ : j๋ฒˆ์งธ ์‹คํ—˜์—์„œ ์•”์ด ๋ฐœ๋ณ‘ํ•œ ์ฅ์˜ ์ˆ˜
  • $ n_j $ : j๋ฒˆ์งธ ์‹คํ—˜์—์„œ ์ „์ฒด ์ฅ์˜ ์ˆ˜
  • $ j = 1,2, ..., 71 $
  • ๋ฐ์ดํ„ฐ ํ˜•ํƒœ : $(y_j, n_j)$
    • (0, 20), (0, 18), (4,20), ...

๊ฐ ์‹คํ—˜์—์„œ ์ฅ์˜ ์ˆ˜๋Š” $n_j$ ๋งˆ๋ฆฌ์ด๊ณ , ์•”์ด ๋ฐœ๋ณ‘ํ•œ ์ฅ์˜ ์ˆ˜๋Š” $y_j$ ๋งˆ๋ฆฌ์ž…๋‹ˆ๋‹ค.

๊ฐ ์‹คํ—˜์—์„œ ์•”์ด ๋ฐœ๋ณ‘ํ•  ํ™•๋ฅ ์„ $\theta_j$ ๋ผ๊ณ  ํ•˜๋ฉด, $y_j$ ๋Š” ์ดํ•ญ๋ถ„ํฌ๋ฅผ ๋”ฐ๋ฅด๊ธฐ ๋•Œ๋ฌธ์— ์•„๋ž˜์™€ ๊ฐ™์ด ์ž‘์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

* ์ดํ•ญ๋ถ„ํฌ : ์—ฐ์†๋œ n๋ฒˆ์˜ ๋…๋ฆฝ์ ์ธ ์‹œํ–‰์—์„œ ๊ฐ ์‹œํ–‰์ด ํ™•๋ฅ  p๋ฅผ ๊ฐ€์งˆ ๋•Œ์˜ ์ด์‚ฐ ํ™•๋ฅ  ๋ถ„ํฌ

 

$$ y_j | \theta_j \sim Bin(n_j, \theta_j) $$

 

So how can we combine the results of several independent experiments?

 

Option #1: Assume all experiments share a common parameter

The first approach assumes that the probability of developing a tumor is the same in every experiment. That is, we assume $\theta_j = \theta$.

In this case, all the experimental results can be pooled and written as below.

 

$$ y_1 + y_2 + ... + y_{71}  | \theta \sim Bin(n_1 + n_2 + ... + n_{71}, \theta) $$

 

Of course, such a simple assumption has clear limitations.

  • In the actual data the tumor rate differs from experiment to experiment, so the probability model does not match the data
  • The variability between the different experiments cannot be handled
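As a rough illustration of Option #1, the sketch below pools only the three $(y_j, n_j)$ pairs quoted earlier (the full data set has 71) and compares the single shared estimate with the per-experiment rates. The variable names are illustrative and not from the original post.

```r
# Three (y_j, n_j) pairs quoted in the post; the real data set has 71 experiments.
y <- c(0, 0, 4)
n <- c(20, 18, 20)

# Option #1: one shared theta, estimated from the pooled counts
theta_pooled <- sum(y) / sum(n)

# Observed tumor rate in each experiment, for comparison
theta_each <- y / n

print(theta_pooled)
print(theta_each)
```

The pooled estimate sits near 0.07, while the individual rates range from 0 to 0.2, which is the mismatch between model and data described above.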

 

Option #2: Assume each experiment has its own independent parameter

The second approach assumes that each experiment has its own tumor probability, independent of the others.

This can explain data in which the tumor rate differs between experiments, but it also has limitations.

  • Each experiment has a small sample size -> the variance of the posterior becomes large
  • It cannot say anything about a new experiment (because every experiment is different)
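To see why small samples inflate the posterior variance under Option #2, here is a minimal sketch. It assumes an independent uniform Beta(1, 1) prior on each $\theta_j$ (an assumption made only for this illustration, not the post's model), so each posterior is Beta(1 + y, 1 + n - y). `beta_sd` is a helper defined here.

```r
# Standard deviation of a Beta(a, b) distribution
beta_sd <- function(a, b) sqrt(a * b / ((a + b)^2 * (a + b + 1)))

# Three (y_j, n_j) pairs quoted in the post
y <- c(0, 0, 4)
n <- c(20, 18, 20)

# Option #2: separate posterior per experiment (assumed Beta(1, 1) prior)
sd_each <- beta_sd(1 + y, 1 + n - y)

# For comparison: posterior from the pooled counts (Option #1, same prior)
sd_pooled <- beta_sd(1 + sum(y), 1 + sum(n) - sum(y))

print(sd_each)
print(sd_pooled)
```

Each single-experiment posterior is wider than the pooled one, which is the price of treating the experiments as unrelated.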

 

Option #3: Assume parameters with a hierarchical structure (Hierarchical modeling)

The last approach is a compromise between Option #1 and Option #2.

The idea is that $\theta_1$ through $\theta_{71}$ differ from one another but still have something in common.

Here we assume that $\theta_j$ follows some distribution, and that the parameter $\theta_j$ governing each experiment was drawn from that distribution.

 

Assuming hierarchically structured parameters makes this kind of modeling possible,

and it brings advantages that Option #1 and Option #2 did not have.

  • It generalizes to new experiments
  • The variability between the different experiments can be handled
  • Results from the other experiments can be used to improve the prediction accuracy for each $y_j$

So which distribution should we assume for $\theta_j$? The beta distribution.

There are constraints on $\theta_j$, and the beta distribution satisfies them.

  • $\theta_j$ is the probability parameter of a binomial distribution, so it must lie between 0 and 1
    • The beta distribution takes values between 0 and 1
  • It should combine with the binomial distribution so that the posterior is easy to work out
    • The beta distribution is the conjugate prior of the binomial distribution

* conjugate prior: a prior distribution for which the prior and the posterior belong to the same family of probability distributions
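The conjugacy can be checked in one line. Treating $\alpha$ and $\beta$ as fixed for the moment (in the full hierarchical model they are also unknown), multiplying the binomial likelihood by the beta prior gives

$$ p(\theta_j | y_j, \alpha, \beta) \propto \theta_j^{y_j} (1-\theta_j)^{n_j - y_j} \cdot \theta_j^{\alpha - 1} (1-\theta_j)^{\beta - 1} = \theta_j^{\alpha + y_j - 1} (1-\theta_j)^{\beta + n_j - y_j - 1} $$

which is again the kernel of a beta distribution:

$$ \theta_j | y_j, \alpha, \beta \sim Beta(\alpha + y_j, \; \beta + n_j - y_j) $$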

 

๋”ฐ๋ผ์„œ ์•„๋ž˜์™€ ๊ฐ™์€ ๋ฌธ์ œ ์„ธํŒ…์ด ๊ฐ€๋Šฅํ•˜๋ฉฐ,

์ด์™€ ๊ฐ™์€ ๋ฌธ์ œ ์„ธํŒ…์„ Hierarchical (=multi-level) model ์ด๋ผ ํ•ฉ๋‹ˆ๋‹ค.

 

  • Lower level : $y_j | \theta_j \sim Bin(n_j, \theta_j) $
  • Higher level : $\theta_j \sim Beta(\alpha, \beta)$
  • $\alpha, \beta \sim iid \; Exp(\lambda)$

DAG Model
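One way to read the three levels is as a recipe for generating data from the top down. The sketch below simulates one such data set. The rate `lambda = 0.1` and the group size of 20 rats are hypothetical values chosen only for illustration, and `alpha_h` / `beta_h` are names introduced here.

```r
set.seed(1)

J      <- 71          # number of experiments, as in the post
lambda <- 0.1         # hypothetical rate of the exponential hyperprior
n_sim  <- rep(20, J)  # hypothetical number of rats per experiment

# Hyperprior level: alpha, beta ~ iid Exp(lambda)
alpha_h <- rexp(1, rate = lambda)
beta_h  <- rexp(1, rate = lambda)

# Higher level: theta_j ~ Beta(alpha, beta)
theta <- rbeta(J, alpha_h, beta_h)

# Lower level: y_j | theta_j ~ Bin(n_j, theta_j)
y_sim <- rbinom(J, size = n_sim, prob = theta)

print(head(cbind(y = y_sim, n = n_sim)))
```

Inference runs this recipe in reverse: given the observed $(y_j, n_j)$, it asks which values of $\theta_j$, $\alpha$ and $\beta$ are plausible.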

 

Here the parameters $\alpha$ and $\beta$ that make up the higher level are called hyperparameters,

and $\alpha$ and $\beta$ have distributions too. Their prior distribution is called a hyperprior.

 

In the example above, $\alpha$ and $\beta$ were each assumed to follow an exponential distribution with parameter $\lambda$.

Since we have no information about $\alpha$ and $\beta$, any distribution that is not too informative can be used.

An improper distribution can also be used in some cases, provided the posterior turns out to be a proper distribution. Because the example above assumes an exponential distribution, the closer $\lambda$ is to 0, the flatter (less informative) the distribution becomes.
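The flattening can be checked numerically. This sketch compares the exponential density at two arbitrary points (1 and 50, chosen only for illustration) for a few rates. A ratio close to 1 means the density is nearly flat over that range.

```r
rates <- c(1, 0.1, 0.01)

# Density at x = 1 relative to x = 50 for each rate; equals exp(49 * rate)
ratio <- dexp(1, rate = rates) / dexp(50, rate = rates)

# The prior mean 1 / lambda grows as lambda shrinks
prior_mean <- 1 / rates

print(data.frame(lambda = rates, ratio = ratio, prior_mean = prior_mean))
```

As $\lambda$ shrinks, the ratio approaches 1 and the prior spreads its mass over a wider range of $\alpha$ and $\beta$.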

 


Chapter #2 - Predicting the 2022 Presidential Election Result

 

Campaign posters for the 20th presidential election (source: Namuwiki)

 

Now let's use the Bayesian hierarchical modeling introduced above to predict the result of the 20th presidential election, held on March 9, 2022. The race stayed close to the very end, and the top two candidates finished 0.73%p apart, the smallest vote-share margin on record.

* Data source: 20th presidential election

 

 

1. Analysis overview

  • Poll results from each polling organization for the last week of February 2022
    • Where no result for the last week of February 2022 exists, a March result is used instead
    • Only polls covering at least the top four candidates by support
  • Quantity to predict (y): the vote-share difference between Lee Jae-myung and Yoon Suk-yeol, the top two candidates in the polls at the time
  • Programming tool: the rjags package for R
    • (Note) In Python, the pyjags library provides similar functionality.

 

The poll data used in the analysis are shown below.

| Lee Jae-myung support (%) | Yoon Suk-yeol support (%) | Margin of error (%p) | Polling organization | Survey period |
|---|---|---|---|---|
| 39.8 | 39.8 | 2.2 | KBS | Feb 2022, 2nd survey |
| 39.6 | 41.9 | 3.1 | MBC | Feb 2022, 2nd survey |
| 34.1 | 42.4 | 3.1 | JTBC | Feb 2022, 2nd survey |
| 34.9 | 36.5 | 3.1 | TV Chosun | Feb 2022, 3rd survey |
| 36.4 | 43.3 | 3.1 | Channel A | Feb 2022, 2nd survey |
| 31.6 | 36.1 | 3.1 | Next Research | Feb 2022 |
| 37 | 39 | 3.1 | NBS | Feb 2022, week 4 |
| 38 | 37 | 3.1 | Gallup Korea | Feb 2022, week 4 |
| 38.3 | 39 | 3.1 | Money Today | Feb 2022, week 4 |
| 40.5 | 41.9 | 2.2 | Realmeter | Feb 2022, week 4 (midweek) |
| 43.8 | 36.1 | 3.1 | KSOI | Feb 2022, week 4 |
| 39.4 | 40.2 | 3.1 | Embrain (JoongAng Ilbo) | Feb 2022, 2nd survey |
| 40.2 | 42.4 | 3.1 | Embrain (news1) | Feb 2022, week 4 |
| 42.5 | 46.5 | 1.5 | PNR (New Daily) | Mar 2022 |
| 41 | 43.8 | 3.1 | PNR (Prime Economy) | Feb 2022, week 4 |
| 42.3 | 45.4 | 1.8 | Yeoronjosa Gongjeong | Feb 2022, week 4 |
| 42 | 44.2 | 2.6 | Media Tomato | Feb 2022, week 4 |
| 41 | 46 | 3.1 | Research View | Feb 2022, week 4 |
| 40.9 | 43.6 | 3.1 | Hangil Research | Mar 2022 |
| 42.1 | 43.6 | 3.1 | Jowon C&I | Feb 2022, week 3 |
| 42.2 | 43.2 | 3.1 | Media Research | Feb 2022, week 4 |
| 40 | 40.4 | 3.1 | Southern Post | Feb 2022, week 4 |
| 39.5 | 44 | 3.1 | Korea Information Research | Feb 2022, week 4 |
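The code later in the post refers to a data frame `d` that is never shown being built. A minimal sketch of one possible construction follows. The column names `organization`, `y`, `ME` and `sigma` match the post's R and JAGS code, while `lee`, `yoon` and the English organization labels are assumptions introduced here.

```r
# Poll figures transcribed from the table in the post (23 polls).
d <- data.frame(
  organization = c("KBS", "MBC", "JTBC", "TV Chosun", "Channel A", "Next Research",
                   "NBS", "Gallup Korea", "Money Today", "Realmeter", "KSOI",
                   "Embrain (JoongAng Ilbo)", "Embrain (news1)", "PNR (New Daily)",
                   "PNR (Prime Economy)", "Yeoronjosa Gongjeong", "Media Tomato",
                   "Research View", "Hangil Research", "Jowon C&I", "Media Research",
                   "Southern Post", "Korea Information Research"),
  lee  = c(39.8, 39.6, 34.1, 34.9, 36.4, 31.6, 37, 38, 38.3, 40.5, 43.8, 39.4,
           40.2, 42.5, 41, 42.3, 42, 41, 40.9, 42.1, 42.2, 40, 39.5),
  yoon = c(39.8, 41.9, 42.4, 36.5, 43.3, 36.1, 39, 37, 39, 41.9, 36.1, 40.2,
           42.4, 46.5, 43.8, 45.4, 44.2, 46, 43.6, 43.6, 43.2, 40.4, 44),
  ME   = c(2.2, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1, 2.2, 3.1, 3.1,
           3.1, 1.5, 3.1, 1.8, 2.6, 3.1, 3.1, 3.1, 3.1, 3.1, 3.1)
)

d$y     <- d$lee - d$yoon   # Lee's lead over Yoon, in percentage points
d$sigma <- d$ME / 2         # half the margin of error, as in the problem setup
n       <- nrow(d)          # number of polls

print(round(mean(d$y), 3))  # average lead across the 23 polls
```

If only the model inputs are wanted, one option is to pass `list(y = d$y, sigma = d$sigma)` to `jags.model` instead of the whole data frame.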

 

2. ์‹œ๊ฐํ™”

์—ฌ๋ก ์กฐ์‚ฌ ๊ธฐ๊ด€๋“ค์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๊ทธ๋ž˜ํ”„๋กœ ํ‘œํ˜„ํ•˜๋ฉด ์•„๋ž˜์™€ ๊ฐ™์Šต๋‹ˆ๋‹ค.

 

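# n: number of polls (assumed here to be the row count of d; not defined in the original snippet)
n <- nrow(d)

# Plot each poll's difference y, leaving room for the margin-of-error bars
with(d[, c('organization','y', 'ME')],
     plot(1:n, y, ylim=c(min(y-ME), max(y+ME)), xaxt="n",
          main="Lee Jae-myung support (%) - Yoon Suk-yeol support (%)",
          xlab="", ylab="Difference"))
# Vertical bars: y +/- margin of error for each poll
with(d[, c('organization','y', 'ME')], segments(1:n, y-ME, 1:n, y+ME))
# Label the x-axis with the polling organizations (rotated)
axis(1, at=1:n, labels=d$organization, las=2)
# Dashed reference line at zero difference
abline(h=0, lty=2, col="blue")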

 

The 20th presidential election - poll results

 

  • KSOI: Lee Jae-myung ahead, outside the margin of error
  • JTBC, Channel A, Research View, Korea Information Research: Yoon Suk-yeol ahead, outside the margin of error
  • In the remaining polls Yoon Suk-yeol was ahead more often than not, but within the margin of error

The mean of y = "Lee Jae-myung support - Yoon Suk-yeol support" is -2.139.

 

 

3. Problem setup

  • $y_j$ : how many %p Lee Jae-myung leads Yoon Suk-yeol by in poll j
    • "Lee Jae-myung support - Yoon Suk-yeol support"
  • j = 1,2, ... , 23
  • Different polls are assumed to be independent of one another.
  • $\sigma_j$ : half of the margin of error (roughly a standard error, since a 95% margin of error is about two standard errors)
    • $\sigma_j$ is assumed to be fixed and known.
    • A prior distribution could of course be placed on $\sigma_j$ as well.
  • $y_j$ is assumed to follow a normal distribution.
  • $\theta_j$, the mean of $y_j$, is also assumed to follow a normal distribution.
  • We have no information about $\mu$ and $\tau$, which define the distribution of $\theta_j$, so priors as close to non-informative as possible are used (uniform distributions over a wide range).
  • Lower level : $y_j | \theta_j \sim N(\theta_j, \sigma_j^2) $
  • Higher level : $\theta_j | \mu, \tau \sim N(\mu, \tau^2)$
  • $\mu \sim flat \; on \; (-\infty, \infty) \doteq U(-1000, 1000)$
  • $\tau \sim flat \; on \; (0, \infty) \doteq U(0, 1000)$
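Two standard consequences of this normal-normal setup are worth writing out. Integrating out $\theta_j$ shows that each poll is centered on $\mu$ with both sources of variance added together:

$$ y_j | \mu, \tau \sim N(\mu, \; \sigma_j^2 + \tau^2) $$

Conditional on $\mu$ and $\tau$, the posterior of each $\theta_j$ is a precision-weighted compromise between that poll's own result $y_j$ and the overall mean $\mu$:

$$ \theta_j | y_j, \mu, \tau \sim N\left( \frac{y_j / \sigma_j^2 + \mu / \tau^2}{1 / \sigma_j^2 + 1 / \tau^2}, \; \frac{1}{1 / \sigma_j^2 + 1 / \tau^2} \right) $$

A small $\tau$ pulls every $\theta_j$ toward $\mu$ (close to Option #1 in Chapter #1), while a large $\tau$ leaves each $\theta_j$ near its own $y_j$ (close to Option #2).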

 

DAG Model

4. Solving the problem

Let's use a Gibbs sampler to generate samples from the posterior distribution.

First, the problem setup above is written as model code that the rjags package can use. Note that JAGS parameterizes `dnorm` by precision (1/variance), which is why `1/sigma[j]^2` and `1/tau^2` appear.

model {

  for (j in 1:length(y)) {
    y[j] ~ dnorm(theta[j], 1/sigma[j]^2)
    theta[j] ~ dnorm(mu, 1/tau^2)
  }

  mu ~ dunif(-1000,1000)
  tau ~ dunif(0,1000)

}

Save the code above as the file polls2022_korean.bug.

 

library(rjags)

initial.vals <- list(
  list(mu=100, tau=0.01),
  list(mu=-100, tau=0.01),
  list(mu=100, tau=100),
  list(mu=-100, tau=100)
)

m1 <- jags.model("polls2022_korean.bug", d, initial.vals, n.chains=4)

# burn-in for 2,500
update(m1, 2500)
x1 <- coda.samples(m1, c("mu", "tau"), n.iter=5000)

We load the rjags package and set initial values for $\mu$ and $\tau$.

Four chains are used for sampling.

The first 2,500 samples are discarded as burn-in, so that the retained samples come from chains that have had time to converge and mix.

Because the mean $\mu$ and the standard deviation $\tau$, which define the distribution across polls, are the most important quantities,

we store 5,000 × 4 = 20,000 samples each of $\mu$ and $\tau$ drawn according to the model above.

 

The traceplots and the distributions of $\mu$ and $\tau$ look as follows.

 

 

 

summary(x1)

Iterations = 3501:8500
Thinning interval = 1 
Number of chains = 4 
Sample size per chain = 5000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

      Mean     SD Naive SE Time-series SE
mu  -2.153 0.6695 0.004734       0.007539
tau  2.780 0.5847 0.004134       0.008289

2. Quantiles for each variable:

      2.5%    25%    50%    75%   97.5%
mu  -3.497 -2.582 -2.147 -1.717 -0.8376
tau  1.793  2.366  2.726  3.126  4.1035

Incidentally, the reason Iterations starts at 3501 is that,

on top of the 2,500 burn-in iterations, `jags.model` runs an adaptation phase by default (`n.adapt = 1000`) in which the samplers are tuned to work more efficiently.

Those 1,000 adaptation iterations are discarded as well.
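The iteration range in the summary can be reproduced with simple arithmetic. The variable names below are introduced only for this check.

```r
n_adapt <- 1000  # default n.adapt in rjags::jags.model
n_burn  <- 2500  # update(m1, 2500)
n_keep  <- 5000  # coda.samples(..., n.iter = 5000)

first_kept <- n_adapt + n_burn + 1       # first stored iteration
last_kept  <- n_adapt + n_burn + n_keep  # last stored iteration

print(c(first_kept, last_kept))
```

These match the `Iterations = 3501:8500` line in the output of `summary(x1)`.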

 

$\mu$ ๊ฐ’์˜ ํ‰๊ท ์€ -2.153, ์ค‘๊ฐ„๊ฐ’์€ -2.147 ์ž…๋‹ˆ๋‹ค.

์ด๋Š” ์ตœ์ดˆ ์—ฌ๋ก ์กฐ์‚ฌ ๊ธฐ๊ด€๋“ค์˜ "์ด์žฌ๋ช… ์ง€์ง€์œจ - ์œค์„์—ด ์ง€์ง€์œจ" ์˜ ํ‰๊ท ์ด -2.139 ์˜€๋˜ ๊ฒƒ๊ณผ ์ผ๋งฅ์ƒํ†ตํ•˜๋Š” ๊ฒฐ๊ณผ์ž…๋‹ˆ๋‹ค.

ํ‰๊ท ์„ ๋ณด๋ฉด ์œค์„์—ด ํ›„๋ณด๊ฐ€ ์šฐ์„ธํ•  ํ™•๋ฅ ์ด ๊ฐ€๋Šฅ์„ฑ์ด ๋†’์ง€๋งŒ, $\tau$ ๊ฐ’์ด ๊ฝค ํฌ๊ธฐ ๋•Œ๋ฌธ์— ์‹ค์ œ ๊ฒฐ๊ณผ๋Š” ๋ณ€๋™์„ฑ์ด ํด ๊ฒƒ์œผ๋กœ ์˜ˆ์ธก์ด ๋ฉ๋‹ˆ๋‹ค.

 

$\mu$์˜ 95% ์‚ฌํ›„ ํ™•๋ฅ  ๊ตฌ๊ฐ„์€ -3.497 ~ -0.8376 ์ž…๋‹ˆ๋‹ค. 

์ด์ œ ๋ฝ‘์€ 2๋งŒ๊ฐœ $\mu$, $\tau$ ๊ฐ’๋“ค์„ ๊ฐ€์ง€๊ณ  ๋Œ€์„  ๊ฒฐ๊ณผ๋ฅผ ์˜ˆ์ธกํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

 

post <- as.matrix(x1)   # stack the 4 chains: one row per draw, columns "mu" and "tau"
Nsim <- dim(post)[1]
ys <- numeric(Nsim)     # simulated differences, one per posterior draw

for (s in 1:Nsim){
  now_mu <- post[s, "mu"]
  now_tau <- post[s, "tau"]
  now_theta <- rnorm(1, now_mu, now_tau)   # underlying lead for a new poll
  now_y <- rnorm(1, now_theta, 1/2*(3.1)) # assume the new poll's margin of error is 3.1%
  ys[s] <- now_y
}

# Probability that Lee Jae-myung wins
> print(sum(ys > 0) / length(ys))
[1] 0.25025
# Probability that Yoon Suk-yeol wins
> print(sum(ys < 0) / length(ys))
[1] 0.74975


์ด์žฌ๋ช… ํ›„๋ณด์˜ ๋‹น์„  ํ™•๋ฅ ์€ ์•ฝ 0.25 (25%), ์œค์„์—ด ํ›„๋ณด์˜ ๋‹น์„  ํ™•๋ฅ ์€ ์•ฝ 0.75 (75%) ์ž…๋‹ˆ๋‹ค.

 


Summary

In this post we looked at Bayesian hierarchical modeling and, as an example, used it to predict the result of the 20th presidential election.

 

Bayesian hierarchical modeling is a useful tool for combining results, such as experiments or polls, that were collected several times from different groups. It gives a much richer analysis than simply averaging the surveys, so it may be worth gathering data from a field you care about and trying the analysis yourself.

 

Please feel free to leave corrections or questions in the comments! Thanks for reading.

 


References

[1] Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., & Rubin, D.B. (2013). Bayesian Data Analysis (3rd ed.). Chapman and Hall/CRC. https://doi.org/10.1201/b16018

 


[2] Park, Trevor H. “Hierarchical modeling fundamentals”, Advanced Bayesian Modeling, 2022 spring, University of Illinois Urbana-Champaign, Lecture.

 

[3] 20th presidential election (Namuwiki)