INDEX
Explanations
phrases or names related to Donald Trump
Neuron Alignment
Index   Value   % of L₁
1741    +0.14   0.5%
1178    +0.12   0.4%
101     +0.12   0.4%
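The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it reports each weight's absolute value as a share of the L₁ norm of the whole direction vector (the function name and the toy vector are hypothetical and are not taken from the dashboard):

```python
def l1_share(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, NOT the real feature direction.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_share(toy_weights)
for weight, share in zip(toy_weights, shares):
    print(f"{weight:+.2f}  {share:.1%}")
```

Under this reading, a value of +0.14 at 0.5% would imply an L₁ norm of roughly 28 for the full vector, which is consistent with the other two rows up to rounding.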
Correlated Neurons
Index   P. Corr.   Cos Sim.
1178    +0.14      0.08
101     +0.12      0.06
554     +0.12      0.06
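"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors are compared, so the sketch below only illustrates the two standard definitions on hypothetical toy data. The function names are illustrative, not part of any dashboard API.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors, NOT real activations or weights.
u = [1.0, 0.0]
v = [0.0, 1.0]
cos_uv = cosine_similarity(u, v)
corr_uv = pearson_correlation(u, v)
```

The toy pair shows why the two columns can differ: the vectors are orthogonal, so their cosine similarity is zero, yet after mean-centering they point in opposite directions, so their Pearson correlation is negative.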
Negative Logits
disreg      -0.60
swarovski   -0.57
isolato     -0.50
intersper   -0.50
disgra      -0.49
cowards     -0.49
gaily       -0.49
apprehen    -0.49
migli       -0.49
cushi       -0.49
Positive Logits
Trump     1.45
Trump     1.41
trump     1.07
trump     0.85
Donald    0.82
Donald    0.77
donald    0.74
Tr        0.67
trumpet   0.67
Trumpet   0.66
Activations Density 0.132%