INDEX
Explanations
phrases indicating opinion and evaluation of sports teams or performances
Negative Logits
inson         -0.15
isten         -0.15
rompt         -0.14
surprisingly  -0.14
Effects       -0.14
à¸            -0.14
etc           -0.13
ÏĤ            -0.13
jours         -0.13
éĸ¢éĢ£        -0.13
Positive Logits
whole         0.25
thing         0.24
entire        0.19
fact          0.19
whole         0.18
Whole         0.17
stuff         0.17
thing         0.17
other         0.16
beginning     0.16
Activations Density 0.424%