INDEX
Explanations
negations and expressions of lack or absence
Negative Logits
zioni    -0.15
yu       -0.14
ntax     -0.14
hof      -0.14
inç      -0.14
рати     -0.14
乃       -0.14
ras      -0.14
adder    -0.13
rani     -0.13
Positive Logits
yet          0.40
Yet          0.31
yet          0.30
Yet          0.30
slightest    0.23
any          0.23
any          0.21
任何         0.20
haven        0.20
ANY          0.19
Activations Density 0.053%