INDEX
Explanations
references to living beings or entities
Negative Logits
ainfi            -0.79
miniaturka       -0.68
Grüsse           -0.66
autorytatywna    -0.65
dezelve          -0.65
étoit            -0.61
näky             -0.61
pouvoit          -0.61
aikaa            -0.61
anún             -0.59

Positive Logits
cs               0.61
CS               0.59
за               0.57
nur              0.56
rein             0.55
ves              0.55
cs               0.54
pilot            0.52
soul             0.51
launch           0.51

Activations Density 0.181%
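
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that density means the percentage of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical sample: the feature fires on 181 of 100,000 tokens.
sample = [1.0] * 181 + [0.0] * 99_819
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.181%
```

Under this reading, a density of 0.181% would mean the feature is sparse, firing on fewer than 2 in every 1,000 tokens.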