INDEX
Explanations
phrases related to legal issues and accusations of wrongdoing
Neuron Alignment
Index    Value    % of L₁
1499     +0.09    0.2%
1150     +0.08    0.2%
1097     +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1621     +0.09       0.04
734      +0.08       0.04
1997     +0.08       0.04
Negative Logits
swarovski     -0.96
ecru          -0.83
nutella       -0.72
tupperware    -0.66
paisley       -0.58
glycerin      -0.58
mascarpone    -0.57
getSize       -0.56
burlap        -0.56
oreo          -0.56
Positive Logits
said          0.69
says          0.68
says          0.64
said          0.62
&             0.60
alleges       0.59
dovrebbero    0.58
todav         0.58
abbiano       0.57
affatto       0.57
Activations Density 0.189%