Explanations
mentions of specific political and governmental terms and concepts
Neuron Alignment
Index    Value    % of L₁
764      +0.15    0.4%
227      +0.10    0.3%
1253     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
678      +0.15       0.06
1327     +0.10       0.04
1984     +0.10       0.05
Negative Logits
delà              -0.66
ATEGORY           -0.64
paravant          -0.63
fince             -0.63
insuffisamment    -0.61
leaft             -0.59
whofe             -0.59
"..\..\..\        -0.59
esterday          -0.58
follic            -0.58
Positive Logits
successor        0.64
newList          0.56
newName          0.55
new              0.54
unified          0.53
someday          0.53
revamped         0.52
newData          0.52
semblables       0.52
comprehensive    0.52
Activations Density: 0.456%