INDEX
Explanations
phrases related to legal issues and political actions
Neuron Alignment
Index    Value    % of L₁
674      +0.11    0.3%
555      +0.11    0.3%
1108     +0.11    0.3%
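The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is that it gives each neuron's share of the L1 norm of the feature's weight vector. A minimal Python sketch under that assumption, using a toy vector and a hypothetical helper named `l1_share`:

```python
def l1_share(weights):
    """Return each entry's share of the vector's L1 norm, as a percentage.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page does not confirm this.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("a zero vector has no L1 shares")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy vector for illustration only (not the feature's real weights).
toy_weights = [0.5, -0.25, 0.25]
shares = l1_share(toy_weights)
```

Under this reading, a value of +0.11 at 0.3% would imply an L1 norm of roughly 0.11 / 0.003 ≈ 37 for the whole vector, although the rounding in both columns makes that only a rough estimate.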
Correlated Neurons
Index    P. Corr.    Cos Sim.
138      +0.11       0.05
1507     +0.11       0.03
1512     +0.11       0.03
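P. Corr. and Cos Sim. can differ for the same pair because Pearson correlation is the cosine similarity of the mean-centered vectors. The page does not say which vectors it compares, so the sketch below uses toy data and hypothetical helper names purely to illustrate the relationship:

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Toy data for illustration only (not taken from the page).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
cos_xy = cosine_similarity(x, y)
corr_xy = pearson_correlation(x, y)
```

The two measures agree only when both vectors already have zero mean, which is one reason the two columns need not match.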
Negative Logits
emphat       -1.13
wherea       -1.05
psg          -1.00
fuf          -1.00
inconce      -0.99
inev         -0.99
fte          -0.98
increa       -0.98
stockholm    -0.98
uniqu        -0.98
Positive Logits
Nhưng        0.56
bawat        0.55
Obrázky      0.54
panahon      0.49
RITION       0.48
ibang        0.47
digress      0.47
makeText     0.47
buhay        0.47
fml          0.46
Activations Density 0.291%