INDEX
Explanations
sentences relating to political statements, frustrations, and disappointments
Neuron Alignment
Index    Value    % of L₁
674      +0.08    0.2%
1013     +0.08    0.2%
1809     +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1809     +0.08       0.03
1919     +0.08       0.04
1038     +0.07       0.03
Negative Logits
relenting       -0.53
mistak          -0.51
walang          -0.50
itong           -0.48
maaaring        -0.48
kasama          -0.47
bagay           -0.47
buhay           -0.46
SneakyThrows    -0.46
NameIn          -0.46
Positive Logits
hoped        0.92
suppos       0.82
naï          0.75
antici       0.75
promising    0.74
promis       0.73
pessi        0.72
supposed     0.72
patrio       0.69
hopeful      0.69
Activations Density: 0.674%