Explanations
references to the speaker's personal experiences and thoughts
instances of the phrase "I have" followed by various nouns or experiences
Negative Logits
rift        -0.65
icking      -0.64
ipping      -0.63
osion       -0.62
eem         -0.62
allows      -0.60
oppable     -0.58
artney      -0.58
pport       -0.57
weed        -0.56
Positive Logits
been        1.10
heard       1.05
watched     1.01
noticed     1.00
seen        1.00
listened    0.98
personally  0.98
myself      0.97
NEVER       0.94
witnessed   0.94
Activations Density 0.130%