Explanations
expressions of belief and confidence
Negative Logits
dea          -0.17
rak          -0.16
alent        -0.15
ãģ°ãģĭãĤĬ    -0.15
lại          -0.15
ymb          -0.15
ĭ            -0.14
utz          -0.14
Ñıз          -0.14
_DETECT      -0.14
Positive Logits
strongly      0.28
fully         0.21
lessly        0.20
firmly        0.19
passionately  0.18
ably          0.17
fulness       0.17
firm          0.17
whole         0.16
608           0.16
Activations Density 0.055%