INDEX
Explanations
references to a specific person's actions and discussions surrounding high-profile events or political situations
Neuron Alignment
Index    Value    % of L₁
405      +0.16    0.7%
1376     +0.15    0.6%
1350     +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
405      +0.16       0.06
1376     +0.15       0.04
1994     +0.13       0.03
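The page does not say how the "P. Corr." and "Cos Sim." columns are computed. Assuming they stand for the Pearson correlation and cosine similarity in their standard textbook forms, the two measures can be sketched as below. The function names and the toy vectors are hypothetical and are not taken from the dashboard.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length sequences (standard definition)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance numerator and the two centered norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (no mean-centering)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy data, for illustration only.
feature_acts = [0.0, 0.5, 1.0, 0.0]
neuron_acts = [0.1, 0.4, 0.9, 0.2]
print(pearson_corr(feature_acts, neuron_acts))
print(cosine_sim(feature_acts, neuron_acts))
```

The only difference between the two is mean-centering: Pearson correlation is the cosine similarity of the centered vectors, so the two columns can diverge when the inputs have a nonzero mean.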
Negative Logits
Queste         -0.62
Altri          -0.61
Infatti        -0.59
Beskrivning    -0.59
Mentre         -0.56
vedno          -0.56
Inoltre        -0.55
pravi          -0.55
Ciò            -0.55
tidaknya       -0.54
Positive Logits
COME     1.18
come     1.16
Come     1.15
Come     1.14
come     1.12
COME     0.99
COMES    0.95
Kome     0.87
comes    0.86
comes    0.84
Activations Density 0.117%