Explanations
capitalization patterns, specifically focused on the term "Col" followed by a number
Negative Logits
longleftrightarrow    -0.16
asurer                -0.15
allas                 -0.15
pecies                -0.14
emes                  -0.14
Ø®ÙĬ                  -0.14
ienia                 -0.14
ete                   -0.14
волÑı                 -0.14
dos                   -0.14
Positive Logits
lected                0.30
col                   0.26
oured                 0.24
LECT                  0.23
Col                   0.23
ored                  0.23
gate                  0.23
ombo                  0.23
onna                  0.21
ours                  0.21
Activations Density: 0.016%