INDEX
Explanations
occurrences of phrases that indicate agreements or conditions
Negative Logits
ule          -0.07
ú            -0.06
op           -0.06
ir           -0.06
cul          -0.06
ullan        -0.06
_PRIORITY    -0.06
YLE          -0.05
sect         -0.05
alam         -0.05
Positive Logits
any          0.12
anything     0.10
任何         0.10
ใด           0.09
everything   0.09
all          0.08
qualquer     0.08
すべて       0.08
any          0.08
.any         0.08
Activations Density 0.004%