Explanations
occurrences of the letter "y" in various contexts
Negative Logits
Token    Logit
<eos>    -0.64
↵↵       -0.62
.        -0.61
↵        -0.55
__":     -0.51
“        -0.49
,        -0.49
...      -0.49
DDE      -0.48
"        -0.47
Positive Logits
Token      Logit
y          1.97
y          1.45
Y          1.32
𝑦          1.02
𝐲          0.99
Monfieur   0.98
𝙮          0.97
y          0.96
Majefty    0.95
𝚢          0.95
Activations Density 0.207%