Explanations
comparative phrases that indicate a degree of comparison or superiority
Negative Logits
Fcn     -0.16
лаб     -0.15
261     -0.15
844     -0.15
outu    -0.14
818     -0.14
訳      -0.14
389     -0.14
sgi     -0.14
еÑī     -0.14
Positive Logits
":"'      0.14
DET       0.14
á»ķ       0.14
Deniz     0.14
Crossing  0.13
ToMany    0.13
zbek      0.13
öy        0.13
Rudd      0.13
ech       0.13
Activations Density: 0.027%