INDEX
Explanations
occurrences of the word "ou" and its variations
Negative Logits
nt           -0.18
ingu         -0.17
AKER         -0.15
thes         -0.15
bruary       -0.15
_unused      -0.15
ilig         -0.15
UnderTest    -0.15
ãĥ¶          -0.14
URES         -0.14
Positive Logits
wel          0.15
ivalent      0.15
.leave       0.14
تÙĦ          0.14
uhan         0.14
umont        0.14
Foley        0.13
else         0.13
eras         0.13
dal          0.13
Activations Density 0.017%