INDEX
Explanations
descriptions of physical sensations and emotions
expressions of empathy and emotional experiences
Negative Logits
condu            -0.73
Preferences      -0.68
accordingly      -0.67
enthal           -0.65
criteria         -0.65
objectionable    -0.65
Austral          -0.65
arten            -0.64
defic            -0.64
Topic            -0.64
Positive Logits
witnessing       0.93
numb             0.90
flashbacks       0.85
remembering      0.84
watching         0.80
waking           0.79
knowing          0.78
hearing          0.76
surrounded       0.76
imagining        0.76
Activations Density 0.590%