INDEX
Explanations
mentions of sequels, specifically in relation to films or franchises
Neuron Alignment
Index   Value   % of L₁
50      +0.23   0.9%
1491    +0.11   0.5%
31      +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1491    +0.23      0.02
573     +0.11      0.02
138     +0.10      0.02
Negative Logits
Token          Logit
<bos>          -2.48
ⓧ              -0.79
<?             -0.73
<?             -0.68
#              -0.60
/*             -0.60
/*!            -0.58
脚注の使い方   -0.58
onView         -0.57
/*++           -0.57
Positive Logits
Token      Logit
affor      1.40
maneu      1.37
Czechos    1.30
impra      1.28
increa     1.27
disagre    1.26
accla      1.23
Keny       1.22
Confu      1.22
horrend    1.21
Activations Density: 0.088%