INDEX
Explanations
references to cannibalism and survival themes
Negative Logits
Token      Logit
ations     -0.18
Kirk       -0.17
ergus      -0.16
arus       -0.16
vie        -0.16
avin       -0.16
asis       -0.16
inus       -0.16
atively    -0.16
McCl       -0.15
Positive Logits
Token      Logit
ená        0.19
estado     0.18
olars      0.18
earn       0.18
ñ          0.18
idata      0.18
eg         0.17
iná        0.17
uada       0.17
dados      0.17
Activations Density 0.040%