INDEX
Explanations
phrases related to animal welfare and human rights issues
Neuron Alignment
Index   Value   % of L₁
227     +0.15   0.4%
1150    +0.11   0.3%
1870    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
227     +0.15      0.07
1499    +0.11      0.05
212     +0.10      0.03
Negative Logits
Deine      -0.78
Dlaczego   -0.78
Wię        -0.78
Więcej     -0.78
Genau      -0.73
Einfach    -0.73
Witaj      -0.72
Uwaga      -0.72
źródło     -0.72
Warto      -0.71
Positive Logits
kasa       1.38
kaos       1.23
lele       1.21
makro      1.19
kac        1.18
kase       1.18
traktor    1.13
akut       1.12
mikrofon   1.11
karna      1.08
Activations Density: 0.241%