INDEX
Explanations
expressions of anticipation and future intentions
Negative Logits
ÅĦst    -0.17
ondon   -0.15
fucked  -0.14
_MSB    -0.14
ibold   -0.14
æ§      -0.14
BOVE    -0.14
ÑģÑĤÑİ  -0.14
chine   -0.14
kie     -0.13
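Several tokens in this list (`ÅĦst`, `æ§`, `ÑģÑĤÑİ`) look like byte-level BPE display strings rather than readable text. The sketch below shows one way to map them back to text. It assumes the underlying tokenizer uses the GPT-2 style byte-to-unicode table, which this page does not state; `decode_display_token` is a hypothetical helper name.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte (0-255) to a printable character.

    Assumption: the tokenizer behind this page uses this table.
    """
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points starting at 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Convert a display-form token back to text (hypothetical helper).

    Tokens containing characters outside the table are returned unchanged.
    Incomplete UTF-8 sequences decode to the replacement character U+FFFD.
    """
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ÅĦst", "ÑģÑĤÑİ", "æ§", "ondon"]:
        print(repr(tok), "->", repr(decode_display_token(tok)))
```

Under that assumption, `ÅĦst` corresponds to `ńst` and `ÑģÑĤÑİ` to `стю`, while `æ§` is only a fragment of a multi-byte character and cannot be decoded on its own.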
Positive Logits
again   0.25
future  0.25
future  0.22
Future  0.18
next    0.18
uhl     0.17
Again   0.16
Future  0.16
again   0.16
afx     0.16
Activations Density 0.106%
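As a rough illustration of the density figure above, the sketch below computes the fraction of tokens on which a feature is active. It assumes that "density" means the share of tokens with activation above zero, which this page does not define; the function name and the toy activations are hypothetical and are not data from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Toy data (hypothetical): the feature fires on 1 token out of 1000.
    toy = [0.0] * 999 + [2.5]
    print(f"{activation_density(toy):.3%}")
```

A density near 0.1% would mean the feature fires on roughly one token in a thousand, in line with a sparse feature tied to a narrow set of contexts.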