INDEX
Explanations
names of people or entities as subjects of actions
connections and references to specific entities or names
Negative Logits
Token   Logit
CE      -0.99
CE      -0.90
PE      -0.90
EU      -0.87
eely    -0.79
        -0.78
ly      -0.78
325     -0.78
LE      -0.78
ELY     -0.76
Positive Logits
Token   Logit
Mull    1.27
Mall    1.14
Mug     1.10
Barb    1.10
Port    1.05
Loft    1.02
Jack    0.99
Skull   0.96
mug     0.93
Bung    0.93
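The two logit lists above read as the ten lowest and ten highest entries of a per-token score. The page does not say how the scores are produced, so the sketch below only illustrates the ranking step. The function name `top_and_bottom_tokens` and the input format (a list of token/value pairs) are assumptions made for illustration, and the sample values are a subset copied from the lists.

```python
def top_and_bottom_tokens(logit_effects, k=10):
    """Return the k most negative and the k most positive (token, value) pairs.

    `logit_effects` is a hypothetical list of (token, value) pairs; a list is
    used rather than a dict because the same token text can appear twice.
    """
    ranked = sorted(logit_effects, key=lambda pair: pair[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


# A few values copied from the lists on this page.
effects = [
    ("Mull", 1.27), ("Mall", 1.14), ("Mug", 1.10),
    ("CE", -0.99), ("PE", -0.90), ("EU", -0.87),
]
negative, positive = top_and_bottom_tokens(effects, k=3)
print(negative)
print(positive)
```

Sorting once and slicing both ends keeps the two lists consistent with each other; for a full vocabulary a partial selection would be cheaper, but that is a design choice outside what the page shows.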
Activations Density 0.556%
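The density figure is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below computes that fraction; `activation_density`, its `threshold` parameter, and the sample data are hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all; the page itself does not state the definition.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Made-up sample: 1000 token positions, 5 of them with a nonzero activation.
sample = [0.0] * 995 + [0.7, 1.3, 0.2, 2.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.500%
```

Under this reading, the 0.556% shown above would correspond to the feature firing on roughly 5 or 6 tokens per thousand.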