INDEX
Explanations
actions or movements related to physical altercations or confrontations
Neuron Alignment
Index    Value    % of L₁
50       +0.25    1.0%
1842     +0.12    0.5%
690      +0.10    0.4%
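The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it reports each entry's absolute value as a share of the L1 norm of the feature's weight vector. The function name `pct_of_l1` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def pct_of_l1(weights):
    """Return each weight's absolute value as a percentage of the vector's L1 norm.

    Assumption: this mirrors what the "% of L1" column reports; the page
    itself does not state the definition.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; percentages are undefined")
    return [100.0 * abs(w) / l1 for w in weights]


# Hypothetical toy vector for illustration only (not dashboard data).
toy_weights = [0.25, -0.5, 0.25]
toy_pcts = pct_of_l1(toy_weights)
```

If this reading is right, a row such as +0.25 at 1.0% would correspond to a total L1 norm of roughly 25, but that figure is an inference from the table, not a value the page reports.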
Correlated Neurons
Index    P. Corr.    Cos Sim.
736      +0.25       0.11
946      +0.12       0.09
690      +0.10       0.07
Negative Logits
<bos>             -2.43
/***              -0.93
solidar           -0.88
initComponents    -0.86
///**             -0.85
anse              -0.80
dras              -0.76
endwhile          -0.75
endwhile          -0.74
sexu              -0.72
Positive Logits
ecru          1.43
affor         1.43
philanth      1.36
swarovski     1.31
impra         1.29
impractica    1.26
unspeak       1.26
disreg        1.25
unlaw         1.25
luxuriant     1.20
Activations Density: 1.260%