INDEX
Explanations
references to political events and community initiatives
Neuron Alignment
Index    Value    % of L₁
50       +0.32    1.4%
381      +0.09    0.4%
2019     +0.09    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1937     +0.32       0.08
289      +0.09       0.06
513      +0.09       0.06
Negative Logits
<bos>     -3.05
/**       -0.64
their     -0.64
<?        -0.62
///**     -0.62
these     -0.62
/***      -0.61
public    -0.61
his       -0.60
this      -0.60
Positive Logits
accla        1.49
affor        1.48
lidl         1.42
maneu        1.36
impra        1.33
excru        1.30
stockholm    1.30
increa       1.29
sovere       1.28
unden        1.27
Activations Density: 2.450%