Explanations
proper names or mentions of the name "Stuart"
references to the name "Stuart"
Negative Logits
Token      Logit
Solo       -0.74
chains     -0.74
USC        -0.74
snowball   -0.73
SCP        -0.73
ka         -0.71
ork        -0.70
jams       -0.69
Da         -0.69
jer        -0.68
Positive Logits
Token      Logit
Stuart     2.92
uart       2.08
Rupert     1.85
Nigel      1.73
Geoffrey   1.29
igel       1.28
Hugh       1.17
Louise     1.15
Ferdinand  1.14
ergus      1.09
Activations Density: 0.034%