INDEX
Explanations
phrases indicating agreement or acceptance, often accompanied by a pause or reflection
conversational expressions and acknowledgments
Negative Logits
âĵĺ      -0.73
mone     -0.67
phant    -0.66
³        -0.65
stre     -0.64
ļéĨĴ     -0.64
hyde     -0.63
endor    -0.63
hov      -0.62
ache     -0.62
Positive Logits
alright  1.02
okay     0.97
maybe    0.95
Lets     0.92
ok       0.91
lets     0.91
enough   0.91
Alright  0.90
let      0.90
Alright  0.89
Activations Density: 0.099%