INDEX
Explanations
Information related to controversial or debated topics, specifically legislation and policy decisions regarding drugs.
Neuron Alignment
Index   Value   % of L₁
1499    +0.11   0.3%
1177    +0.10   0.3%
939     +0.09   0.3%
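The "% of L₁" column presumably reports each neuron's absolute value as a share of the L1 norm of the feature's direction vector. The page does not define the column, so treat that reading as an assumption. A minimal sketch under that assumption, using a made-up vector (the helper name `l1_share` and the toy values are hypothetical and not taken from this page):

```python
def l1_share(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm."""
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1 for w in weights]


# Hypothetical 4-entry direction vector, for illustration only.
toy = [0.5, -0.25, 0.125, 0.125]
shares = l1_share(toy)
for i, s in enumerate(shares):
    print(f"index {i}: {s:.1%}")
```

If this reading is right, a top share of only 0.3% would suggest the feature is spread across many neurons rather than aligned with any single one.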
Correlated Neurons
Index   P. Corr.   Cos Sim.
939     +0.11      0.06
1499    +0.10      0.06
1501    +0.09      0.04
Negative Logits
nutella     -0.70
Quinoa      -0.67
Cześć       -0.64
Noice       -0.63
pentru      -0.63
volon       -0.62
amigurumi   -0.60
Hahah       -0.59
Xoxo        -0.59
Lmfao       -0.58
Positive Logits
ekos        0.86
panik       0.84
praktik     0.82
prakti      0.80
kafe        0.80
optik       0.78
kosme       0.78
kriminal    0.78
silikon     0.77
protokol    0.76
Activations Density: 0.483%
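Activation density is commonly read as the fraction of sampled tokens on which the feature fires at all. The page does not state its exact definition, so the sketch below assumes "fires" means an activation strictly above a threshold of zero. The function name `activation_density` and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above `threshold`."""
    if not activations:
        raise ValueError("no activations given")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical per-token activations, for illustration only.
toy_acts = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.2, 0.0]
density = activation_density(toy_acts)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.483% would correspond to the feature firing on roughly 1 in 200 tokens.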