INDEX
Explanations
names of people in political contexts
Neuron Alignment
Index    Value    % of L₁
915      +0.09    0.3%
1150     +0.09    0.3%
1001     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
915      +0.09       0.06
227      +0.09       0.06
1957     +0.09       0.03
Negative Logits
increa       -1.16
affor        -1.15
snoopy       -1.14
lyon         -1.11
reluct       -1.08
jacques      -1.07
inev         -1.06
swarovski    -1.06
hcm          -1.05
fta          -1.04
Positive Logits
UnifiedTopology    0.58
aides              0.55
aide               0.54
presidential       0.52
ourage             0.51
assistant          0.51
للاسماء            0.51
adviser            0.51
advisers           0.50
rativo             0.50
Activations Density: 0.443%