Explanations
tokens and phrases related to numerical or quantitative information
Negative Logits
nab       -0.16
bal       -0.15
compens   -0.15
æĬľ       -0.15
soc       -0.15
AS        -0.15
edom      -0.14
tar       -0.14
ing       -0.14
as        -0.14
Positive Logits
eca           0.19
ĺ             0.19
agu           0.16
ÑĴ            0.16
ãĢģãĢģ        0.15
openh         0.15
égor          0.15
usat          0.15
zin           0.15
-Identifier   0.14
Activations Density 0.006%