INDEX
Explanations
text related to political figures and events, especially regarding statements and reactions
Neuron Alignment
Index   Value   % of L₁
1150    +0.10   0.3%
678     +0.09   0.3%
261     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
261     +0.10      0.02
678     +0.09      0.04
1799    +0.09      0.02
Negative Logits
principalColumn   -0.60
Ause              -0.57
躇                -0.54
PLWABN            -0.52
kulum             -0.51
ViewInit          -0.49
hasNext           -0.48
Карьера           -0.47
MongoClient       -0.47
NUKAT             -0.47
Positive Logits
intersper   0.95
impra       0.80
muna        0.78
ineffec     0.76
jaya        0.76
unavoid     0.75
disreg      0.74
uninten     0.74
maneu       0.74
Punj        0.74
Activations Density 0.321%