Explanations
basic, limited, or traditional qualifiers
Negative Logits
аксессу    0.89
фирмы    0.82
ﻼ    0.82
еще    0.79
врач    0.79
выступа    0.79
buen    0.78
высо    0.77
лым    0.77
ویژگی    0.77
Positive Logits
lessly    0.84
server    0.81
August    0.81
work    0.78
in    0.75
justice    0.73
d    0.73
sw    0.72
l    0.71
he    0.71
Activations Density 0.003%