Explanations
references to specific individuals in a relational context
Negative Logits
HING      -0.18
231       -0.16
Äįer      -0.15
resher    -0.15
orent     -0.14
eyer      -0.14
olle      -0.14
rij       -0.14
phia      -0.14
lla       -0.14
Positive Logits
holm      0.14
à¹ģรà¸ģ    0.14
.Zero     0.13
.ef       0.13
forCell   0.13
á»Ŀ       0.13
ormap     0.13
رÙĪØ·    0.13
iners     0.13
WF        0.13
Activations Density 0.267%