INDEX
Explanations
adjectives that indicate significance, quality, or intensity
Negative Logits
etro     -0.18
conc     -0.16
anan     -0.16
داد      -0.14
conc     -0.14
Gee      -0.14
Conc     -0.14
utos     -0.13
egan     -0.13
вроп     -0.13
Positive Logits
ế        0.16
応       0.15
esen     0.15
Watt     0.15
than     0.15
gis      0.14
THAN     0.14
ocos     0.14
illery   0.14
bbox     0.14
Activations Density 0.082%