INDEX
Explanations
words that introduce actions or instructions
phrases indicating intent or purpose
Negative Logits
Token         Logit
pict          -0.71
Appears       -0.70
traded        -0.60
calling       -0.60
itates        -0.58
AAAAAAAA      -0.57
cards         -0.56
disappeared   -0.56
forg          -0.56
comments      -0.56
Positive Logits
Token         Logit
summarize     1.28
ilet          1.28
complicate    1.27
illustrate    1.23
maximize      1.19
accomplish    1.17
achieve       1.16
facilitate    1.14
compensate    1.14
reiterate     1.12
Activations Density: 0.047%