Explanations
expressions of excitement or enthusiasm
Negative Logits
ffect     -0.18
eson      -0.16
casts     -0.16
cast      -0.16
aison     -0.16
eren      -0.15
esian     -0.15
ings      -0.15
oris      -0.15
egan      -0.15
Positive Logits
ly              0.20
anticipation    0.20
/import         0.19
ante            0.15
ãĥ¥             0.15
pants           0.15
tir             0.15
prospect        0.15
uar             0.15
ally            0.15
Activations Density: 0.019%