INDEX
Explanations
references to rankings or placements in competitive contexts
Negative Logits
Token       Logit
Dere        -0.17
Sesso       -0.16
.Router     -0.15
ãĥ³ãĥ       -0.15
nt          -0.14
ega         -0.14
Hind        -0.14
ãĥ³ãĥĶ      -0.14
iplina      -0.14
nge         -0.14
Positive Logits
Token       Logit
ienes       0.14
omas        0.14
ichen       0.14
ibox        0.14
veyor       0.13
itsu        0.13
erre        0.13
î¡          0.13
pron        0.13
yang        0.13
Activations Density: 0.061%