Explanations
empathy and interpersonal connections in narratives
Negative Logits
Token         Logit
742           -0.17
層            -0.15
ictor         -0.14
TestCase      -0.14
ley           -0.14
ekim          -0.13
_UNDEFINED    -0.13
SHA           -0.13
tip           -0.13
Broadway      -0.13
Positive Logits
Token         Logit
him           0.41
lui           0.30
them          0.28
him           0.25
onun          0.25
ihn           0.24
ihm           0.24
него          0.23
他            0.22
them          0.22
Activations Density 0.406%