INDEX
Explanations
intensifiers that express strong emotions or sentiments
NEGATIVE LOGITS
ighth      -0.15
orio       -0.15
lop        -0.14
pery       -0.14
sure       -0.14
aos        -0.14
|[         -0.14
unky       -0.13
itta       -0.13
&action    -0.13
POSITIVE LOGITS
oper       0.20
glad       0.18
so         0.18
-so        0.17
oo         0.16
OMPI       0.16
OPER       0.16
лей        0.15
unds       0.15
aks        0.15
Activations Density: 0.056%