Explanations
phrases starting with "first"
Negative Logits
'            0.51
?            0.39
'।           0.35
wounds       0.33
chuyển       0.32
’            0.32
]            0.31
pedibusque   0.29
=            0.29
يد           0.29
Positive Logits
native       0.41
b            0.40
ink          0.38
essa         0.37
ick          0.36
ن            0.35
add          0.35
uma          0.33
val          0.32
text         0.32
Activations Density 0.332%