Negative Logits
کر          -1.01
उ           -0.92
beware      -0.88
킹          -0.87
一年        -0.87
🥾          -0.85
unwilling   -0.85
abbr        -0.84
bors        -0.84
зор         -0.84
Positive Logits
everything  1.88
relax       1.79
Everything  1.63
reassured   1.57
Relax       1.57
everything  1.52
Everything  1.45
it          1.45
relax       1.45
Relax       1.43
Activation Density: 0.027%