INDEX
Explanations
phrases related to finance, specifically in the context of pets and wine regions
Neuron Alignment
Index   Value   % of L₁
169     +0.11   0.3%
757     +0.10   0.3%
406     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.11      0.02
1120    +0.10      0.02
406     +0.09      0.01
Negative Logits
intersper    -0.69
Dziękuję     -0.61
/*!          -0.58
sympathize   -0.56
pushd        -0.53
endeavored   -0.53
Óscar        -0.51
imprison     -0.51
César        -0.51
idolat       -0.50
POSITIVE LOGITS
::     2.48
::     2.42
:::    1.53
::$    1.45
::_    1.44
>::    1.42
:::    1.34
(::    1.32
::~    1.25
::::   1.24
Activations Density 0.226%