Explanations
Occurrences of the prefix "dis", indicating a focus on negativity or adverse conditions.
Negative Logits
p           -0.16
onga        -0.16
reich       -0.16
vida        -0.15
locked      -0.14
воз         -0.14
alking      -0.14
ailable     -0.14
/downloads  -0.14
able        -0.14
Positive Logits
pir      0.23
heart    0.22
son      0.22
yll      0.21
concert  0.20
ses      0.20
quiet    0.19
array    0.18
гар      0.17
orient   0.16
Activations Density 0.013%