Explanations
phrases where the speaker is explaining or defining something
repeated phrases emphasizing personal experience or perspective
Negative Logits
unsolved      -0.88
gettable      -0.83
Reviewer      -0.77
efeated       -0.73
uries         -0.72
MFT           -0.72
PDATE         -0.65
iscovered     -0.63
itaire        -0.62
awaits        -0.61
Positive Logits
mean          1.63
meant         1.45
referring     1.45
implying      1.34
imply         1.33
means         1.31
refers        1.15
refer         1.12
referencing   1.10
meaning       1.09
Activations Density: 0.376%