INDEX
Explanations
events or happenings described in a lively manner
Neuron Alignment
Index   Value   % of L₁
577     +0.10   0.3%
1507    +0.10   0.3%
390     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1892    +0.10      0.04
1507    +0.10      0.03
390     +0.10      0.04
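As a reading aid for the two columns above, here is a minimal sketch of how such scores are commonly computed. It assumes "P. Corr." means the Pearson correlation between two activation series and "Cos Sim." means the cosine similarity between two weight vectors; the page itself does not confirm either definition. The function names and the toy inputs are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences (assumed meaning of 'P. Corr.')."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and standard deviations, all without the 1/n factor (it cancels).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed meaning of 'Cos Sim.')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy data, not taken from the page.
feature_acts = [0.0, 0.5, 1.0, 0.2]
neuron_acts = [0.1, 0.4, 0.9, 0.3]
print(pearson(feature_acts, neuron_acts))
print(cosine([1.0, 0.0], [1.0, 1.0]))
```

Under this reading the two columns can diverge: Pearson correlation compares behavior over inputs after mean-centering, while cosine similarity compares directions in weight space.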
Negative Logits
Token   Logit
liev    -0.98
meras   -0.96
antik   -0.93
maksi   -0.91
pank    -0.83
kac     -0.82
bont    -0.81
kasa    -0.81
optik   -0.81
teras   -0.79
Positive Logits
Token      Logit
shenan     0.91
prolly     0.90
cushi      0.84
overcrow   0.80
underval   0.76
indeed     0.73
scrat      0.73
disreg     0.72
horrend    0.72
banish     0.72
Activations Density 0.116%