INDEX
Explanations
key nouns and phrases related to significant entities or themes in discussions
Neuron Alignment
Index   Value   % of L₁
50      +0.20   0.7%
1842    +0.11   0.4%
845     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
507     +0.20      0.05
845     +0.11      0.05
227     +0.10      0.07
Negative Logits
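<bos>             -2.60
(token missing)   -0.68
(token missing)   -0.67
(token missing)   -0.66
sizeCache         -0.65
(token missing)   -0.65
(token missing)   -0.65
(token missing)   -0.65
InjectMocks       -0.64
(token missing)   -0.62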
Positive Logits
chrysler      1.99
lamborghini   1.87
increa        1.87
affor         1.86
lidl          1.82
beverly       1.81
isuzu         1.73
fta           1.72
maneu         1.71
stockholm     1.71
Activations Density 0.573%