Negative Logits
there        0.78
saddened     0.74
unhappy      0.73
enticing     0.68
وتی          0.66
encouraging  0.65
accessible   0.65
异           0.65
disheart     0.64
ㅓ           0.64
Positive Logits
wir          0.94
diri         0.85
pped         0.76
ulsory       0.75
meningkat    0.75
dirinya      0.72
skyrock      0.71
già          0.71
sustancias   0.71
ؘ            0.71
Activations Density 0.000%