INDEX
Explanations
phrases indicating addition or inclusion of information
imperative verbs suggesting action or instructions
Negative Logits
emale        -0.67
displayText  -0.67
PATH         -0.63
wheelchair   -0.62
REF          -0.60
PRES         -0.60
mith         -0.60
bottleneck   -0.59
SUP          -0.59
specified    -0.58
Positive Logits
ings         0.90
yourselves   0.77
anon         0.72
atio         0.72
antha        0.72
lihood       0.71
Fasc         0.69
aring        0.69
Quote        0.69
Your         0.68
Activations Density 0.173%