Explanations
collective actions and experiences involving groups of people
Negative Logits
Token         Logit
aveug         -0.62
ercizio       -0.59
themſelves    -0.59
validamos     -0.55
wrists        -0.53
felling       -0.53
Psyche        -0.53
perſon        -0.52
dersfield     -0.52
Cæsar         -0.52
Positive Logits
Token         Logit
together      0.69
both          0.69
discussed     0.67
discussion    0.67
I             0.61
he            0.61
juntos        0.60
Both          0.59
discuss       0.58
jointly       0.58
Activations Density: 0.367%