Negative Logits
Assistance    0.25
Which         0.24
shopping      0.23
و             0.23
సహాయ          0.23
Out           0.22
وم            0.21
assistance    0.21
What          0.21
Research      0.20
Positive Logits
ensure          0.36
solidify        0.34
illustrate      0.32
ในการ           0.31
dictate         0.31
overcome        0.31
differentiate   0.30
distinguish     0.30
justify         0.30
exemplify       0.29
Activations Density 0.008%