INDEX
Explanations
instances of phrases related to societal issues and political discourse
Neuron Alignment
Index    Value    % of L₁
752      +0.18    0.5%
897      +0.11    0.4%
596      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
752      +0.18       0.06
1334     +0.11       0.05
16       +0.10       0.06
Negative Logits
!...       -0.81
:)))       -0.80
?...       -0.79
ftre       -0.75
maniere    -0.75
»>         -0.74
congr      -0.73
effe       -0.73
ftw        -0.73
ftu        -0.72
Positive Logits
viewDid           0.64
createSlice       0.60
contextLoads      0.57
viewWillAppear    0.56
Въ                0.55
遽                0.53
Какво             0.53
adolescence       0.53
ffilm             0.53
InjectMocks       0.52
Activations Density 0.291%