Explanations
conditional phrases and hypothetical situations
Negative Logits
Token      Logit
ba         -0.17
aths       -0.17
athe       -0.16
Guy        -0.16
ano        -0.15
ouser      -0.15
Guy        -0.14
outu       -0.14
enstein    -0.14
berger     -0.14
Positive Logits
Token      Logit
.fore      0.14
odash      0.14
.bel       0.14
اÙĦاست     0.13
.habbo     0.13
_FC        0.13
à¹Ħà¸Ķ     0.13
åºŃ        0.13
ear        0.13
fresh      0.13
Activations Density: 0.037%