INDEX
Explanations
names and proper nouns
names with non-Western origins
Negative Logits
anún          -0.69
berdayakan    -0.68
mejores       -0.66
pantalón      -0.65
gafas         -0.62
pouvoit       -0.62
llorando      -0.61
ainfi         -0.60
prohibido     -0.60
dangereux     -0.59
Positive Logits
Leary          0.57
AxisAlignment  0.55
expandindo     0.55
躇             0.54
Donnell        0.52
asanjo         0.50
Lynx           0.48
aarrggbb       0.46
TSM            0.45
umn            0.45
Activations Density 0.084%