INDEX
Explanations
significant and impactful verbs or actions that indicate intent or emotion
Negative Logits
adx
-0.16
agram
-0.16
vat
-0.16
レット
-0.15
psy
-0.14
----------------------------------------------------------------------------------------------------------------
-0.14
/pm
-0.14
anske
-0.14
arel
-0.13
iges
-0.13
Positive Logits
lero
0.17
esser
0.16
vla
0.16
ắn
0.15
式
0.15
Recommended
0.14
utenberg
0.14
パン
0.14
/ext
0.14
fore
0.14
Activations Density 0.020%