INDEX
Explanations
warning messages regarding graphic content
Neuron Alignment
Index    Value    % of L₁
468      +0.11    0.3%
906      +0.10    0.3%
497      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
497      +0.11       0.04
468      +0.10       0.05
1285     +0.09       0.05
Negative Logits
shenan      -1.05
snoopy      -0.95
apprehen    -0.93
maneu       -0.89
milf        -0.88
ineffec     -0.87
kasa        -0.84
attemp      -0.84
mischie     -0.84
vagu        -0.83
Positive Logits
graphic      0.59
horror       0.58
gruesome     0.56
gross        0.54
trauma       0.54
traumatic    0.52
腥           0.51
horrific     0.50
shock        0.49
cada         0.49
Activations Density: 0.730%