INDEX
Explanations
comparative language indicating degree or quantity
Negative Logits
allon              -0.16
ziel               -0.16
ibox               -0.14
createUrl          -0.14
autoreleasepool    -0.13
rende              -0.13
lauf               -0.13
outh               -0.13
Clarence           -0.13
æĦĽ                -0.13
Positive Logits
than               0.28
-than              0.23
than               0.22
Than               0.21
Than               0.20
THAN               0.20
_than              0.20
rather             0.18
Rather             0.18
rather             0.17
Activations Density: 0.124%