INDEX
Explanations
references to violent actions or crimes involving physical harm to individuals
Neuron Alignment
Index   Value   % of L₁
1150    +0.12   0.3%
1385    +0.10   0.3%
1984    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1984    +0.12      0.04
946     +0.10      0.04
736     +0.10      0.04
Negative Logits
Sqft           -0.70
meras          -0.69
kemer          -0.66
AnchorStyles   -0.64
bitat          -0.62
ekos           -0.61
postIndex      -0.60
plis           -0.59
quí            -0.59
\{\\           -0.58
Positive Logits
carrefour   0.83
désert      0.83
suivie      0.81
jurassic    0.78
joyeux      0.77
fameux      0.77
mystère     0.75
Yeet        0.75
Wtf         0.74
triomphe    0.74
Activations Density: 0.183%