Explanations
elements related to character development and interactions in narratives
Negative Logits
çĽĹ         -0.15
eny         -0.14
ACC         -0.14
butt        -0.14
iren        -0.14
ODY         -0.14
nghiá»ĩp    -0.14
éŀ          -0.14
ferred      -0.13
ardo        -0.13
Positive Logits
strength    0.18
strength    0.17
Strength    0.16
/inet       0.15
åłĤ         0.15
against     0.15
Westbrook   0.14
Cust        0.14
strengths   0.14
ckett       0.14
Activations Density: 0.304%