Explanations
phrases that express definition or assertion of identity
Negative Logits
thon    -0.18
bose    -0.17
undy    -0.17
ully    -0.17
ött     -0.16
ical    -0.15
th      -0.15
ndl     -0.15
etas    -0.15
age     -0.15
Positive Logits
ally           0.18
phá»ij         0.16
inea           0.14
utive          0.14
Ưá»            0.14
.gameserver    0.13
ialect         0.13
atsu           0.13
amt            0.13
crown          0.13
Activations Density 0.020%