INDEX
Explanations
words that convey negativity or dismissal towards concepts and ideas
Negative Logits
Token           Logit
trib            -0.14
upa             -0.14
ëĦ¤ìĿ´íĬ¸       -0.14
/apt            -0.14
ilty            -0.13
ĮĢ              -0.13
ylko            -0.13
appa            -0.13
resden          -0.13
.tpl            -0.13
Positive Logits
Token           Logit
unless          0.19
unless          0.18
.LookAndFeel    0.16
altogether      0.16
avigation       0.15
æİī             0.15
because         0.15
ROL             0.15
enever          0.14
ALS             0.14
Activations Density: 0.132%