INDEX
Explanations
related to historical, political or philosophical discussions, especially regarding debate, democracy, and decision-making
Neuron Alignment
Index    Value    % of L₁
849      +0.10    0.3%
437      +0.09    0.3%
900      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
849      +0.10       0.03
1262     +0.09       0.02
437      +0.09       0.02
Negative Logits
Juf          -0.66
Bagdad       -0.62
Fulda        -0.62
Hauptmann    -0.58
Amiens       -0.58
Neum         -0.57
Khart        -0.56
Glou         -0.55
Breslau      -0.55
Chartres     -0.55
Positive Logits
">/          0.84
">.          0.82
">...        0.77
">“          0.74
">:          0.69
intitulée    0.67
bonté        0.66
">+          0.65
contribue    0.64
prouve       0.63
Activations Density: 0.187%