Explanations
terms related to legal or political issues
Neuron Alignment
Index   Value   % of L₁
1081    +0.13   0.4%
2034    +0.10   0.3%
1967    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1081    +0.13      0.05
1372    +0.10      0.03
1473    +0.10      0.02
Negative Logits
préc          -0.76
renfer        -0.71
convenable    -0.68
Cześć         -0.65
tantôt        -0.65
quoique       -0.65
élar          -0.64
muito         -0.63
aussitôt      -0.62
nettement     -0.61
Positive Logits
idr           0.62
estekak       0.62
moiselle      0.61
durs          0.61
embley        0.56
lele          0.56
optik         0.55
alps          0.55
rrggbb        0.54
ensacola      0.53
Activations Density: 0.173%