Explanations
verbs followed by punctuation
Negative Logits
)。    0.74
)    0.71
με    0.66
)!    0.62
)    0.60
publicar    0.56
prendere    0.54
ແ    0.54
で    0.54
varietà    0.53
Positive Logits
I    0.59
ார்த்த    0.46
gris    0.46
can    0.46
חיל    0.46
sandwiched    0.45
띠    0.44
ílio    0.44
bothers    0.44
tinger    0.43
Activations Density: 0.398%