INDEX
Explanations
expressions of uncertainty or mild judgment
Negative Logits
singola     -0.60
nicio       -0.54
poitrine    -0.53
charité     -0.52
'\\         -0.52
pauvreté    -0.52
gloire      -0.51
Atoi        -0.50
Endless     -0.49
ouvriers    -0.49
Positive Logits
Somewhat        0.94
المعيارى        0.88
extAlignment    0.83
有点            0.83
Kinda           0.82
demografica     0.81
tagext          0.80
Somewhat        0.79
abit            0.79
bit             0.78
Activations Density 0.103%