INDEX
Explanations
phrases indicating agreement or confirmation
phrases expressing an opinion or assessment about something
Negative Logits
apsed      -0.76
isin       -0.72
oled       -0.71
uve        -0.70
isner      -0.67
cot        -0.67
keyes      -0.65
Lann       -0.65
jac        -0.64
aredevil   -0.62
Positive Logits
louder     0.88
tracks     0.81
Sounds     0.79
lessly     0.79
\\\\\\\\   0.79
omin       0.78
suspic     0.78
vaguely    0.77
sounding   0.77
bite       0.76
Activations Density: 0.021%