INDEX
Explanations
instances where the pronoun "you" or "I" is followed by a verb
instances of the phrase "the first time" or references to past experiences
Negative Logits
meantime       -0.70
now            -0.64
retaining      -0.61
utical         -0.58
provisional    -0.58
urity          -0.58
nowhere        -0.58
Prepar         -0.57
redients       -0.57
Dhabi          -0.57
Positive Logits
ever            1.08
EVER            1.05
interacted      1.04
ventured        1.00
encountered     1.00
bothered        0.99
touched         0.98
interfered      0.97
encount         0.97
visited         0.97
Activations Density 0.155%
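
The page does not define how the density figure is computed. A minimal sketch, assuming it is the share of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not the actual counts behind the 0.155% above:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 20,000 token positions, 31 of which activate.
toy_activations = [0.0] * 19969 + [1.0] * 31
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density this low means the feature fires on only a small fraction of tokens, which would be consistent with the narrow set of positive-logit tokens listed above.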