Explanations
terms related to quality or ranking comparisons
Negative Logits
Token    Logit
an       -1.03
o        -0.97
er       -0.95
p        -0.90
k        -0.90
u        -0.86
k        -0.85
r        -0.83
a        -0.83
q        -0.82
Positive Logits
Token      Logit
myſelf     1.51
greateſt   1.50
beſt       1.48
་་         1.41
itſelf     1.40
iſt        1.38
leaſt      1.37
Majefty    1.37
firſt      1.37
ſelves     1.33
Activations Density 0.071%