INDEX
Explanations
phrases or sentences that prompt the reader to take an action
instances of the word "the" and similar contextual phrases
Negative Logits
iatus        -0.78
tsy          -0.71
ufact        -0.67
warts        -0.65
enegger      -0.64
Constantin   -0.63
llor         -0.61
Baghd        -0.61
daughter     -0.60
emetery      -0.60
Positive Logits
levers       0.88
sparing      0.86
hashtag      0.82
borrowed     0.78
pseudonym    0.78
newfound     0.77
tools        0.77
tactic       0.76
metaphor     0.73
analogy      0.72
Activations Density: 0.214%