Explanations
words and phrases that convey certainty or emphasis
Negative Logits
Token            Logit
Prostit          -0.14
_tF              -0.13
ustry            -0.13
γρά              -0.13
itol             -0.12
nt               -0.12
uchos            -0.12
профессиональ    -0.12
anca             -0.12
oppable          -0.12
Positive Logits
Token            Logit
uma              0.17
has              0.17
celik            0.17
LY               0.15
nger             0.15
ANNOT            0.14
theless          0.14
-таки            0.14
had              0.14
have             0.14
Activations Density 0.324%
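
The density figure is usually read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard, which may compute the number differently.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with a non-zero
    (above-threshold) activation. This is an illustrative definition,
    not the dashboard's confirmed method.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 1000 token positions:
# the feature is silent on 997 positions and fires on 3.
sample = [0.0] * 997 + [0.8, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.324% would mean the feature is active on roughly 3 of every 1000 tokens in the sampled text.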