INDEX
Explanations
phrases related to adding to something or a particular cause
Neuron Alignment
Index   Value   % of L₁
1983    +0.10   0.3%
1023    +0.09   0.3%
1967    +0.09   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1218    +0.10      0.03
1023    +0.09      0.03
1443    +0.09      0.02
Negative Logits
Token       Logit
ecru        -0.58
FetchType   -0.50
shewn       -0.49
confider    -0.49
defire      -0.48
Cringe      -0.47
whofe       -0.47
illance     -0.47
/**         -0.47
ftill       -0.47
Positive Logits
Token        Logit
Ukraina      0.53
karna        0.52
repertoire   0.52
Kanada       0.51
existing     0.51
toz          0.50
arsenal      0.47
febru        0.47
aggiunto     0.47
adds         0.47
Activations Density: 0.170%