Explanations
Phrases with first-person plural pronouns referring to collective action
Neuron Alignment
Index   Value   % of L₁
1262    +0.11   0.3%
1415    +0.09   0.3%
1919    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1415    +0.11      0.09
1919    +0.09      0.10
1262    +0.09      0.07
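The two columns of the Correlated Neurons table can be illustrated with a short sketch. It assumes that "P. Corr." stands for the Pearson correlation between two activation series and that "Cos Sim." is the cosine similarity between two vectors; the dashboard does not state how its values were computed, so the function names and the toy inputs below are hypothetical.

```python
import math


def pearson_corr(x, y):
    """Pearson correlation between two equal-length sequences (assumed meaning of 'P. Corr.')."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed meaning of 'Cos Sim.')."""
    if len(u) != len(v):
        raise ValueError("inputs must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


# Toy data, not taken from the dashboard.
feature_acts = [1.0, 2.0, 3.0, 4.0]
neuron_acts = [2.0, 4.0, 6.0, 8.0]
corr = pearson_corr(feature_acts, neuron_acts)
cos = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, which is one reason the two columns can rank the same neurons differently.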
Negative Logits
°;            -0.66
lagar         -0.63
timeType      -0.63
serons        -0.60
monaster      -0.60
travaillons   -0.58
saluti        -0.58
€€            -0.57
abbra         -0.56
fosfor        -0.56
Positive Logits
ourselves     0.69
we            0.66
we            0.65
We            0.63
resurre       0.63
We            0.59
socialists    0.57
shenan        0.56
glimp         0.56
hear          0.54
Activations Density: 0.388%