Explanations
phrases that express simplicity, familiarity, or common experiences
Negative Logits
tagHelper    -0.68
zwar         -0.65
httphttps    -0.63
ailleurs     -0.60
also         -0.60
igens        -0.59
Nonnull      -0.58
certainly    -0.58
également    -0.58
esm          -0.58
Positive Logits
Simplemente   0.84
simplesmente  0.83
simply        0.76
einfach       0.76
Просто        0.76
Просто        0.76
prostu        0.75
simplemente   0.74
Simply        0.71
zwy           0.69
Activations Density: 0.244%