INDEX
Explanations
modal verbs and auxiliaries
Negative Logits
updating    0.38
ஒரு    0.36
update    0.36
include    0.35
ya    0.35
itself    0.34
underside    0.34
params    0.34
neath    0.34
an    0.34
Positive Logits
themselves    0.47
flock    0.44
وطالبات    0.43
ktorí    0.43
получают    0.42
ಿದ್ದಾರೆ    0.41
této    0.40
kteří    0.40
často    0.40
纷纷    0.38
Activations Density 0.165%