INDEX
Explanations
descriptions of locations and events, possibly related to news or narratives
Neuron Alignment
Index   Value   % of L₁
856     +0.14   0.4%
752     +0.14   0.4%
1967    +0.12   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
939     +0.14      0.07
752     +0.14      0.06
1967    +0.12      0.06
Negative Logits
<bos>     -0.65
Điều      -0.60
Kết       -0.58
Và        -0.57
)$/,      -0.57
>({       -0.57
==""){    -0.56
ROIT      -0.56
ője       -0.56
\},       -0.56
Positive Logits
Gorb         1.44
secon        1.43
guarante     1.39
depic        1.38
intermitt    1.37
Bartholo     1.36
vété         1.35
reluct       1.34
impra        1.33
inev         1.33
Activations Density: 0.520%