Explanations
phrases indicating quantitative changes or comparisons
Negative Logits
erten             -0.17
enge              -0.14
ncy               -0.14
RoundedRectangle  -0.14
sten              -0.14
SPDX              -0.14
ços               -0.14
ulg               -0.14
нія               -0.13
ков               -0.13
Positive Logits
essel             0.16
factors           0.16
factor            0.15
ifa               0.15
964               0.14
olin              0.14
annon             0.14
osa               0.14
amounts           0.14
degrees           0.14
Activations Density 0.023%
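
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 23 activate the feature.
sample = [0.0] * 99_977 + [1.5] * 23
density = activation_density(sample)
print(f"{density:.3%}")  # 0.023%
```

Under this reading, a density of 0.023% corresponds to roughly 23 activating tokens per 100,000, which would make this a sparse feature.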