INDEX
Explanations
words related to alphabets and symbols
occurrences of a specific character or symbol
Negative Logits
WARD       -0.71
Coliseum   -0.69
waves      -0.69
commute    -0.68
wards      -0.67
ITNESS     -0.66
rush       -0.65
microw     -0.64
iflower    -0.64
Neural     -0.64
Positive Logits
Å          1.41
¼          1.32
½          1.23
ı          1.18
ĭ          1.12
ĵ          1.12
¾          1.12
ł          1.09
ĥ          1.09
»          1.09
Activations Density: 0.007%