INDEX
Explanations
phrases that prompt the reader to consider different perspectives or framing of concepts
instances of the phrase "think of."
Negative Logits
pite          -0.70
promoter      -0.61
ollah         -0.60
ante          -0.60
proclaimed    -0.59
Written       -0.59
maintained    -0.59
differed      -0.58
Tweet         -0.58
Found         -0.58
Positive Logits
èĪ            0.69
buck          0.63
agine         0.63
agus          0.62
quitting      0.61
abel          0.61
è£ıè          0.61
agram         0.61
gil           0.60
swer          0.60
Activations Density 0.054%