INDEX
Explanations
actions indicating physical abuse
Neuron Alignment
Index   Value   % of L₁
1577    +0.32   1.2%
906     +0.23   0.9%
50      +0.18   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
736     +0.32      0.28
946     +0.23      0.18
906     +0.18      -0.04
Negative Logits
manuten   -1.14
aboli     -0.99
susun     -0.98
meras     -0.95
solidar   -0.91
astu      -0.90
Autó      -0.90
consoli   -0.89
CiNii     -0.89
hunde     -0.89
Positive Logits
tubercle  0.92
lamella   0.92
friable   0.89
sessile   0.87
blackish  0.86
disreg    0.85
whitish   0.82
papilla   0.79
embodi    0.78
ineffec   0.76
Activations Density 7.909%