INDEX
Explanations
quality and descriptive adjectives
Negative Logits
biofilms    0.44
ایمان    0.41
συνεχ       0.41
opioids     0.41
mengakses   0.38
subTest     0.37
lagoons     0.36
biofilm     0.36
œuvres      0.36
ノン        0.35
Positive Logits
Can     0.43
al      0.42
ing     0.41
un      0.41
fine    0.40
can     0.40
man     0.39
va      0.38
class   0.38
reply   0.38
Activations Density 0.000%