gpt-4

    What You Need to Consider to Make Use of Big Models (fine-tuning, knowledge distillation)

    ํ•ด๋‹น ๊ฒŒ์‹œ๊ธ€์€ ํ”ํžˆ ๋งํ•˜๋Š” Big model(GPT-3, BERT)๋“ค์„ ํ™œ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„œ ๋ฌด์—‡์„ ๊ณ ๋ คํ•ด์•ผ ํ• ์ง€, ํŠนํžˆ (1) Fine-tuning (2) Knowledge distillation ์— ๋Œ€ํ•œ ๋‚ด์šฉ์„ ๋‹ด๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. GPT-3์˜ ์—…๊ทธ๋ ˆ์ด๋“œ ๋ฒ„์ „์ธ GPT-4๊ฐ€ ์ตœ๊ทผ์— ๋ฐœํ‘œ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ChatGPT๊ฐ€ ์„ฑ๊ณตํ•œ ์›์ธ๋„ GPT-3๋ผ๋Š” Big model์„ ํšจ๊ณผ์ ์œผ๋กœ ํ™œ์šฉํ–ˆ๊ธฐ ๋•Œ๋ฌธ์ด๋ผ๋Š” ์ƒ๊ฐ์ด ๋“œ๋Š”๋ฐ์š”. ์ด๋ ‡๋“ฏ ๊ธฐํ•˜๊ธ‰์ˆ˜์ ์ธ ์†๋„๋กœ ๋ฐœ์ „ํ•˜๊ณ  ์žˆ๋Š” pretrained big model๋“ค์„ ๋ฐ”๋กœ ์‚ฌ์šฉํ•  ์ˆ˜๋„ ์žˆ๊ฒ ์ง€๋งŒ, ์‹ค์ œ ์„œ๋น„์Šค๋‚˜ ์ ์šฉ ๋ถ„์•ผ์— ์ž˜ ํ™œ์šฉํ•  ์ค„ ์•„๋Š” ๊ฒƒ์ด ์ค‘์š”ํ•ด์ง„๋‹ค๊ณ  ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ๋•Œ ์—ฌ๋Ÿฌ๋ถ„๋“ค์ด ์•Œ์•„์•ผ ํ•  ๊ฒƒ๋“ค, ํŠนํžˆ fine-tuning๊ณผ knowledge distillation์— ๋Œ€ํ•ด์„œ ..