Explanations
words following punctuation and quotes
Negative Logits
দুর্দান্ত    0.58
excelente    0.53
फटाफट    0.53
Dacă    0.51
we    0.51
отлично    0.50
大好き    0.50
sayesinde    0.50
magari    0.50
धमाकेदार    0.50
Positive Logits
controversy    0.65
attempts    0.64
controversial    0.61
Controversy    0.58
controversies    0.57
disputed    0.54
attempts    0.54
controvers    0.54
近年    0.53
Attempts    0.53
Activations Density: 0.009%