INDEX
Explanations
phrases where the speaker expresses personal opinions or experiences
phrases indicating personal experiences or opinions
Negative Logits
substitutes       -0.66
pires             -0.65
Schwar            -0.61
matter            -0.58
unfocusedRange    -0.58
signals           -0.57
surplus           -0.57
resumes           -0.57
redients          -0.56
disadvantages     -0.56
Positive Logits
've               1.57
'm                1.12
'd                1.06
ever              1.04
know              0.98
RL                0.97
saw               0.96
have              0.90
encountered       0.90
HAVE              0.90
Activations Density 0.071%