INDEX
Explanations
quantifiers and modifiers indicating degree or extent
Negative Logits
Token        Logit
lessly       -0.16
edBy         -0.15
icated       -0.15
irim         -0.15
нув          -0.14
deş          -0.14
◄            -0.14
duğunu       -0.13
asionally    -0.13
olle         -0.13
Positive Logits
Token        Logit
dozens       0.18
hundreds     0.17
Torch        0.15
iec          0.15
thousands    0.15
illions      0.15
of           0.15
minor        0.14
any          0.14
many         0.14
Activations Density 0.104%
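
The density figure can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 104 active positions out of 100,000 tokens.
acts = [0.0] * 99_896 + [1.5] * 104
density = activation_density(acts)
print(f"{density:.3%}")
```

With this made-up sample the feature is active on 104 of 100,000 positions, which matches a density of about 0.104%.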