Explanations
phrases or terms with the structure "think of" followed by a particular concept or idea
repetitive phrases that suggest a pattern of thought or reflection
Negative Logits
Token        Logit
soever       -0.67
pite         -0.65
contended    -0.63
declared     -0.61
cess         -0.60
ante         -0.60
proclaimed   -0.60
Mp           -0.60
contained    -0.60
asserts      -0.60
Positive Logits
Token        Logit
agine        0.66
ways         0.66
Hein         0.63
quitting     0.62
agus         0.61
how          0.61
hetically    0.60
din          0.60
them         0.59
èĪ           0.59
Activations Density: 0.056%
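
The density figure can be read as the share of sampled token positions on which this feature fires. The sketch below shows one way such a number could be computed. It assumes that "fires" means an activation strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 6 of which activate the feature.
sample = [0.0] * 10_000
for position in (12, 480, 2_051, 5_000, 7_777, 9_999):
    sample[position] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

A density this low means the feature is sparse: it is silent on nearly all tokens and activates only on a narrow set of contexts, such as the "think of" phrasing described in the explanations.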