INDEX
Explanations
phrases related to political and social controversies
Neuron Alignment
Index   Value   % of L₁
1870    +0.10   0.3%
596     +0.10   0.3%
2011    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1778    +0.10      0.05
919     +0.10      0.03
1137    +0.10      0.06
Negative Logits
contex       -0.78
Simult       -0.73
solidar      -0.72
robus        -0.70
récolte      -0.70
rémun        -0.68
metast       -0.68
diffusi      -0.68
logistique   -0.67
hiér         -0.67
Positive Logits
of           0.76
krivelse     0.69
pleaf        0.63
caufe        0.62
perfons      0.59
thereof      0.58
ftill        0.58
firft        0.56
occafion     0.56
venit        0.56
Activations Density: 0.434%