Explanations
expressions of hope or optimism about future outcomes
Negative Logits
almost       -0.17
anson        -0.17
rike         -0.16
ür           -0.15
neredeyse    -0.15
Almost       -0.15
almost       -0.15
Forget       -0.14
onomy        -0.14
Almost       -0.14
Positive Logits
somehow      0.17
($)          0.16
ipay         0.16
足           0.15
enough       0.14
trav         0.14
oux          0.14
esi          0.14
ond          0.13
체           0.13
Activations Density: 0.144%