INDEX
Explanations
negative and contrasting statements
Neuron Alignment
Index    Value    % of L₁
674      +0.10    0.3%
1974     +0.09    0.3%
1262     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1262     +0.10       0.04
1974     +0.09       0.04
47       +0.09       0.04
Negative Logits
Ibidem              -0.51
Simult              -0.49
BeforeMethod        -0.47
Vys                 -0.47
AppCompatTheme      -0.46
CompoundButton      -0.46
InstrumentedTest    -0.44
Hän                 -0.44
Opis                -0.44
Enlarged            -0.43
POSITIVE LOGITS
necessarily      0.57
necesar          0.54
autorytatywna    0.49
but              0.48
rewsbury         0.48
jajaja           0.47
但不             0.47
ghz              0.46
ongevity         0.46
burberry         0.46
Activations Density 0.130%