INDEX
Explanations
phrases related to political events and statements
Neuron Alignment
Index   Value   % of L₁
1150    +0.15   0.5%
382     +0.14   0.4%
2019    +0.14   0.4%
Correlated Neurons
Index   Pearson Corr.   Cosine Sim.
1445    +0.15           0.04
613     +0.14           0.04
382     +0.14           0.04
Negative Logits
hastly         -0.71
nera           -0.70
cabrio         -0.65
chod           -0.63
appartement    -0.61
maroc          -0.61
lastonbury     -0.61
appartamento   -0.59
sprocket       -0.58
camere         -0.58
Positive Logits
Præ        0.76
Souha      0.73
Că         0.72
whither    0.71
NUKAT      0.69
intrigu    0.68
gaily      0.67
Ră         0.67
vagu       0.66
%\[        0.65
Activations Density: 0.125%