INDEX
Explanations
information related to historical events and biographical details of individuals, potentially with a focus on specific organizations or foundations
Neuron Alignment
Index   Value   % of L₁
394     +0.15   0.5%
1699    +0.13   0.4%
2034    +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1843    +0.15      0.03
764     +0.13      0.04
394     +0.13      0.04
Negative Logits
dichi       -0.85
źródło      -0.82
Ótimo       -0.79
Ży          -0.79
voleva      -0.78
Dziękuję    -0.78
Przyp       -0.77
troppo      -0.75
Muitos      -0.75
Pued        -0.75
Positive Logits
</h2>          0.70
</h3>          0.67
Undoubtedly    0.67
↵↵             0.67
</h1>          0.65
:              0.63
</strong>      0.62
Whilst         0.61
↵              0.60
</h4>          0.60
Activations Density: 0.336%