Explanations
mentions of specific people, places, and events
Negative Logits
Token            Logit
воÑĢ             -0.18
ÙĴÙĩ             -0.15
ainer            -0.15
aks              -0.15
brains           -0.14
BuilderFactory   -0.14
LES              -0.14
quo              -0.14
Ỽ                -0.14
aukee            -0.13
Positive Logits
Token            Logit
Sunder           0.16
etta             0.15
752              0.15
<Scalar          0.14
948              0.14
elle             0.14
&                0.14
proceeding       0.14
whip             0.14
Morton           0.14
Activations Density: 0.114%