INDEX
Explanations
time-related information such as specific dates and events
Neuron Alignment
Index   Value   % of L₁
50      +0.23   1.3%
1472    +0.09   0.5%
200     +0.08   0.5%
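The page does not define the "% of L₁" column. A minimal sketch, assuming it denotes each neuron's absolute weight divided by the L₁ norm (sum of absolute values) of the feature's weight vector; the function name `l1_share` and the toy vector are hypothetical and not taken from the page:

```python
def l1_share(weights):
    """Return each entry's share of the vector's L1 norm, as a fraction.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The source page
    does not confirm this definition.
    """
    total = sum(abs(w) for w in weights)
    if total == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / total for w in weights]


# Hypothetical toy vector, chosen only to make the arithmetic easy to follow.
toy = [0.5, -0.25, 0.25]
print([f"{s:.1%}" for s in l1_share(toy)])  # ['50.0%', '25.0%', '25.0%']
```

Under this reading the table is internally consistent: a value of +0.23 at 1.3% implies an L₁ norm of roughly 0.23 / 0.013 ≈ 17.7, and 0.09 / 17.7 ≈ 0.51% and 0.08 / 17.7 ≈ 0.45% both round to the listed 0.5%.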
Correlated Neurons
Index   P. Corr.   Cos Sim.
1472    +0.23      0.06
200     +0.09      0.06
316     +0.08      0.06
Negative Logits
<bos>     -3.20
<?        -1.07
(blank)   -1.02
ⓧ         -1.01
/**       -0.95
<?        -0.74
/*        -0.72
/***      -0.72
},{       -0.61
#         -0.58
Positive Logits
maneu     1.13
affor     1.12
accla     1.10
opport    1.01
maroc     0.98
emphat    0.96
véhic     0.93
fortn     0.91
jawa      0.90
inev      0.90
Activations Density 0.283%