Explanations
expressions of positivity and approval
Negative Logits
Token      Logit
hadiran    -0.86
caucus     -0.84
osoba      -0.84
myſelf     -0.83
Divina     -0.80
Serif      -0.80
Caucus     -0.80
lern       -0.78
Heuer      -0.78
kään       -0.78
Positive Logits
Token      Logit
good       1.74
good       1.69
Good       1.68
GOOD       1.67
Good       1.62
GOOD       1.60
Goodwin    1.26
Goodman    1.19
buena      1.05
Bad        0.98
Activations Density: 0.063%