INDEX
Explanations
phrases indicating comparisons or approximations of quantities
Negative Logits
Token        Logit
βο           -0.17
illis        -0.17
assed        -0.16
agne         -0.15
urst         -0.15
oom          -0.14
.allocate    -0.14
allet        -0.14
edia         -0.14
ataka        -0.14
Positive Logits
Token        Logit
dozens       0.20
aits         0.17
altogether   0.16
myriad       0.16
many         0.15
眾           0.15
TF           0.15
dozen        0.15
Principle    0.14
riere        0.14
Activations Density: 0.084%