Negative Logits
你自己        -0.82
兼            -0.75
причем       -0.75
anat         -0.75
интерес      -0.74
нула         -0.72
Naturally    -0.71
AndEndTag    -0.71
selling      -0.71
plut         -0.71
Positive Logits
regretted    1.23
regrets      1.23
glad         1.10
regret       1.09
امل          1.05
ahora        1.02
Turns        0.99
arrep        0.95
because      0.93
ended        0.93
Activations Density: 0.032%