INDEX
Explanations
expressions of strong opinions or emotional reactions
Expressing opinions or reactions
exclamations and sentiments
Negative Logits
laſſen            -0.58
queſta            -0.57
addPreferredGap   -0.56
centralwidget     -0.56
iNdEx             -0.55
LabelTagHelper    -0.54
ujednoznacz       -0.54
⟬                 -0.52
arşivlendi        -0.52
ロウィン          -0.52
Positive Logits
!                 0.44
Signalez          0.42
?                 0.40
!!!               0.38
indeed            0.35
?!                0.34
alın              0.33
Cyfeiriadau       0.32
idea              0.32
relâche           0.32
Activations Density 0.219%
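
The density figure above is assumed here to be the fraction of token positions on which the feature is active. The page does not define it, so this reading is a guess. Under that assumption, a minimal sketch of the calculation follows. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires;
    the dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 22 of which are active.
sample = [0.0] * 9978 + [1.5] * 22
density = activation_density(sample)
print(f"{density:.3%}")
```

A value this low would mean the feature fires on only a small share of tokens, which fits a feature tied to exclamations and emphatic punctuation.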