Explanations
occurrences of the word "one"
Negative Logits
Token           Logit
betweenstory    -1.13
་་              -1.09
Мексичка        -1.07
myſelf          -1.02
للمعارف         -0.98
TestBed         -0.98
Réponses        -0.96
ſind            -0.95
pleaſure        -0.95
fubject         -0.94
Positive Logits
Token           Logit
one             4.00
One             2.89
one             2.83
One             2.76
ONE             2.69
ONE             2.17
één             2.15
один            2.00
uno             1.93
одного          1.86
Activations Density: 0.161%