INDEX
Explanations
phrases related to future events
Neuron Alignment
Index   Value   % of L₁
122     +0.13   0.5%
897     +0.13   0.5%
757     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
757     +0.13      0.03
1035    +0.13      0.02
1183    +0.13      0.02
Negative Logits
⌥            -0.46
Boli         -0.45
Schuster     -0.43
Lindley      -0.43
ineffec      -0.42
obstinate    -0.42
hashlib      -0.42
Rivas        -0.42
Schreiber    -0.41
Diener       -0.41
Positive Logits
ahead           1.11
ahead           1.10
AHEAD           1.08
Ahead           1.07
Ahead           0.97
appartamento    0.74
styleType       0.69
proprietario    0.69
ritratto        0.68
compleanno      0.67
Activations Density 0.051%