Explanations
words related to identity and affiliation
Negative Logits
aurus       -0.16
berra       -0.16
adays       -0.16
opoulos     -0.15
οι          -0.15
-vous       -0.15
odore       -0.14
anale       -0.14
away        -0.14
AILABLE     -0.14
Positive Logits
umes        0.17
ures        0.17
DN          0.15
ité         0.14
isc         0.14
_tooltip    0.14
Åį          0.14
BCM         0.14
tbl         0.14
adi         0.13
Activations Density: 0.493%