Explanations
phrases related to affirmation or agreement
repetitive phrases indicating confirmation or acknowledgment
Negative Logits
Token     Logit
aden      -0.71
ULAR      -0.70
è¦ļéĨĴ    -0.69
odium     -0.65
Mehran    -0.64
ĸļ        -0.63
bloom     -0.61
uated     -0.61
pent      -0.60
cellul    -0.59
Positive Logits
Token     Logit
eous      1.42
mares     0.93
move      0.93
Stuff     0.81
ward      0.76
wing      0.76
lords     0.76
lander    0.74
Right     0.73
hand      0.71
Activations Density: 0.035%