INDEX
Explanations
negative statements with emphasis on denial
Neuron Alignment
Index   Value   % of L₁
381     +0.11   0.3%
605     +0.10   0.3%
1974    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1974    +0.11      0.05
1288    +0.10      0.05
392     +0.10      0.03
Negative Logits
Iš                 -0.67
Į                  -0.63
Bár                -0.59
Municipio          -0.59
CommonModule       -0.58
optik              -0.57
WebElementEntity   -0.57
Võ                 -0.56
Ngb                -0.55
Cár                -0.55
Positive Logits
vodi       0.66
letti      0.64
pensieri   0.61
affez      0.60
koš        0.60
najbol     0.59
nemici     0.59
vestiti    0.59
unlaw      0.59
zima       0.59
Activations Density 0.125%