Explanations
verbs related to praise or criticism
Negative Logits
yll        -0.14
Angebot    -0.14
ean        -0.14
emento     -0.13
enha       -0.13
ty         -0.13
atk        -0.13
ponde      -0.13
reife      -0.13
incinn     -0.13
Positive Logits
kinson     0.17
ÛĮدÙĩ      0.15
·          0.15
Gas        0.15
egin       0.14
Consort    0.14
net        0.14
[]}        0.14
dden       0.14
conven     0.13
Activations Density: 0.028%