INDEX
Explanations
expressions of boredom or repetitiveness
Negative Logits
erdale     -0.17
zzle       -0.16
Ãłi        -0.15
ameleon    -0.14
IMITIVE    -0.14
째         -0.14
912        -0.14
ix         -0.13
eut        -0.13
ado        -0.13
Positive Logits
boring     0.18
bored      0.16
émon       0.15
_rsp       0.15
sville     0.15
Rob        0.14
Rolling    0.14
ÙħÙĦØ©     0.14
енно       0.13
acker      0.13
Activations Density: 0.030%