INDEX
Explanations
the word "either" and its variations, indicating a focus on choices or alternatives
Neuron Alignment
Index   Value   % of L₁
50      +0.24   1.0%
2034    +0.11   0.5%
421     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
483     +0.24      0.04
9       +0.11      0.03
1608    +0.10      0.03
Negative Logits
<bos>          -2.94
/***           -0.85
Vegeu          -0.82
żdy            -0.65
Externé        -0.62
Välislingid    -0.60
Источники      -0.60
Atsauces       -0.59
/**            -0.58
Související    -0.58
Positive Logits
maroc          1.37
stockholm      1.14
cioc           1.11
maneu          1.11
lele           1.10
toscana        1.09
lidl           1.08
cartier        1.07
ananas         1.06
lupo           1.04
Activations Density 0.083%