INDEX
Explanations
adjectives indicating quality or value
positive descriptive qualities
Negative Logits
itself          -0.60
is              -0.56
itself          -0.53
enfans          -0.46
it              -0.45
betreft         -0.44
ularity         -0.43
frontières      -0.43
inconvénients   -0.43
sekal           -0.42
Positive Logits
bezeichneter    0.61
egne            0.56
ulike           0.56
تقاوى           0.54
additional      0.52
__*/            0.52
various         0.51
脚注の使い方    0.51
色んな          0.50
Meilleures      0.50
Activations Density: 0.500%