INDEX
Explanations
type safe and /p/, /b/ sounds
Negative Logits
sı    0.78
ли    0.75
lymph    0.73
tri    0.72
方針    0.71
сний    0.71
table    0.71
swarm    0.71
्य    0.70
sided    0.70
Positive Logits
ক    0.79
A    0.79
b    0.76
N    0.76
O    0.76
და    0.70
0.70
AB    0.70
as    0.70
𝘼    0.69
Activations Density 0.000%