INDEX
Explanations
references to contests or competitions
Negative Logits
Token      Logit
odb        -0.17
бор        -0.16
bure       -0.15
ehir       -0.15
barrier    -0.15
Barrier    -0.14
borg       -0.14
Kır        -0.14
бора       -0.14
bab        -0.14
Positive Logits
Token      Logit
Ben        1.74
Ben        1.58
ben        1.41
ben        1.34
BEN        1.14
Benjamin   1.10
benz       0.94
Benz       0.90
benign     0.82
بن         0.78
Activations Density 0.058%