INDEX
Explanations
phrases related to historical events or reflections on the past
Neuron Alignment
Index   Value   % of L₁
1387    +0.13   0.4%
1678    +0.12   0.4%
757     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1387    +0.13      0.04
938     +0.12      0.04
401     +0.12      0.03
Negative Logits
pymysql          -0.55
philips          -0.50
configureStore   -0.48
smtplib          -0.47
salomon          -0.47
zipfile          -0.47
turkish          -0.46
djang            -0.45
shutil           -0.45
ępnie            -0.45
Positive Logits
Past      1.04
past      1.01
past      1.01
PAST      1.00
Past      0.98
PAST      0.94
Paste     0.63
الحره     0.61
gusted    0.60
astrous   0.53
Activations Density 0.062%