INDEX
Explanations
terms and phrases related to certainty and conditionality
Negative Logits
Token     Logit
aret      -0.15
884       -0.15
BT        -0.14
itesse    -0.14
Shade     -0.14
083       -0.14
erus      -0.14
metic     -0.13
cone      -0.13
rew       -0.13
Positive Logits
Token     Logit
ymm       0.18
art       0.17
enth      0.16
hev       0.16
@api      0.15
баз       0.15
Cas       0.14
bas       0.14
GD        0.14
edl       0.14
Activations Density: 0.009%