INDEX
Explanations
phrases prompting the reader to check out something, such as articles or videos
phrases that encourage checking out or exploring additional content
Negative Logits
enei          -0.73
dictate       -0.71
awake         -0.66
fabrication   -0.65
ajor          -0.64
rightful      -0.63
ylum          -0.63
conviction    -0.63
grasped       -0.63
Fourth        -0.62
Positive Logits
whats         0.81
nels          0.68
casts         0.67
how           0.65
icles         0.65
posts         0.63
www           0.62
flows         0.62
Braun         0.61
fitted        0.61
Activations Density 0.024%
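
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.024%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (active tokens) / (all sampled tokens).
    """
    if not activations:
        raise ValueError("need at least one token")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 24 of which activate the feature.
sample = [0.0] * 99_976 + [1.5] * 24
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.024%
```

Under that reading, a density of 0.024% corresponds to roughly 24 active tokens per 100,000, which would make this a sparse feature.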