[๋ ผ๋ฌธ๋ฆฌ๋ทฐ] Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
Robert Kirk์ด [arXiv]์ ๊ฒ์ํ โDeep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMsโ ๋ ผ๋ฌธ์ ๋ํ ์์ธํ ๋ฆฌ๋ทฐ์ ๋๋ค.