INDEX
Explanations
negating conditions or qualities
Negative Logits
失败  0.40
ikke  0.39
exploration  0.38
Few  0.38
मार्गदर्शन  0.37
stanza  0.37
não  0.37
Não  0.36
ที่ไม่  0.36
иной  0.36
Positive Logits
necessarily  0.58
overly  0.56
inherently  0.55
allowed  0.54
hin  0.54
currently  0.53
necessarily  0.51
obstante  0.50
orious  0.50
assolutamente  0.49
Activations Density 0.140%