INDEX
Explanations
mentions of shocking or noteworthy events, possibly involving crime or controversy
Neuron Alignment
Index   Value   % of L₁
1445    +0.16   0.5%
1177    +0.12   0.4%
1741    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1445    +0.16      0.03
1217    +0.12      0.02
1858    +0.11      0.02
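The two columns in the Correlated Neurons table measure different things, which is why they can diverge. Below is a minimal sketch, assuming "P. Corr." denotes the Pearson correlation and "Cos Sim." the cosine similarity between two vectors. The function names and the toy activation values are hypothetical and are not taken from the dashboard.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cov / norm if norm else 0.0

def cosine_sim(xs, ys):
    """Cosine similarity: normalized dot product, with no mean-centering."""
    dot = sum(a * b for a, b in zip(xs, ys))
    norm = math.sqrt(sum(a * a for a in xs)) * math.sqrt(sum(b * b for b in ys))
    return dot / norm if norm else 0.0

# Hypothetical toy activations over five tokens (illustration only).
feature_acts = [0.0, 0.0, 1.0, 0.0, 2.0]
neuron_acts = [1.0, 1.0, 2.0, 1.0, 3.0]  # same pattern, shifted by a constant

r = pearson_corr(feature_acts, neuron_acts)
c = cosine_sim(feature_acts, neuron_acts)
```

In this toy case the constant offset leaves the Pearson correlation at 1 but pulls the cosine similarity below 1, since only the former removes the mean. The dashboard may also compute the two quantities over different vectors (for example activations versus weight directions); that detail is not stated in the source.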
Negative Logits
hairc         -0.90
despotism     -0.88
tupperware    -0.85
ecru          -0.81
newArr        -0.81
cushi         -0.80
newList       -0.79
newVal        -0.79
swarovski     -0.79
philosophic   -0.79
Positive Logits
autorytatywna   0.88
<bos>           0.75
expandindo      0.69
はじめに        0.68
mdash           0.62
ongiorno        0.58
During          0.57
hammad          0.57
When            0.56
<>",            0.55
Activations Density: 0.099%