INDEX
Explanations
significant emotional investment or personal connection in written narratives
Negative Logits
daq       -0.16
lat       -0.15
ivy       -0.15
buch      -0.15
enders    -0.15
_RAM      -0.15
ahoo      -0.15
вар       -0.14
endo      -0.14
omat      -0.14
Positive Logits
kin        0.17
оба        0.16
auf        0.15
operand    0.15
rub        0.14
jev        0.14
Hart       0.14
Kin        0.14
fort       0.14
MB         0.14
Activations Density 0.038%