Explanations
references to survival and danger in high-stress scenarios
Negative Logits
كومونز    -0.41
FFIX    -0.35
اللا    -0.33
Optimis    -0.31
Optimis    -0.31
нах    -0.31
']='    -0.30
Opti    -0.29
alapa    -0.29
้ง    -0.28
Positive Logits
saved    0.82
averted    0.75
Saved    0.75
Saved    0.74
saved    0.67
lucky    0.65
rescued    0.64
richTextPanel    0.64
luckily    0.64
ToSave    0.64
Activation Density: 0.335%