Explanations
the word "absolutely" with high activation values
strong affirmations or positive assertions
Negative Logits
Token        Logit
coh          -0.69
Skydragon    -0.66
mole         -0.66
squ          -0.64
é¾           -0.64
creeping     -0.63
quarters     -0.63
æł           -0.62
oresc        -0.62
Ó            -0.61
Positive Logits
Token        Logit
Absolutely   0.86
ogether      0.85
Mine         0.79
True         0.78
Absolutely   0.78
Wrong        0.75
Yes          0.74
Definitely   0.72
Possibly     0.72
leans        0.71
Activations Density 0.023%
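
The density figure above can be read as the share of tokens on which the feature fires at all. The sketch below illustrates that reading only; it assumes density means "percentage of tokens with activation above zero", which the page itself does not state. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Synthetic example, not real dashboard data:
# 23 active tokens out of 100,000 gives a density of 0.023%.
sample = [0.0] * 99_977 + [1.5] * 23
density = activation_density(sample)
print(f"{density:.3f}%")
```

A density this low means the feature is sparse: it is silent on nearly every token and fires only in a narrow set of contexts, which fits the explanations given for it.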