Explanations
second-person descriptions of actions or characteristics
instances of the word "you"
Negative Logits
Token           Logit
İĭ              -0.73
ector           -0.69
course          -0.68
SpaceEngineers  -0.67
theirs          -0.66
duty            -0.62
dimension       -0.61
odor            -0.60
OIL             -0.60
Mission         -0.58
Positive Logits
Token           Logit
're             1.49
've             1.21
hear            1.04
arrive          0.97
compare         0.94
realize         0.93
contemplate     0.92
realise         0.91
combine         0.89
consider        0.89
Activations Density: 0.094%