Negative Logits
Token           Logit
没有什么         0.46
purposefully    0.41
intentionally   0.40
precau          0.39
🤨              0.38
provoqu         0.38
expres          0.37
unmistak        0.37
volut           0.37
deliberately    0.36
Positive Logits
Token           Logit
blindly         1.21
rely            1.11
relying         1.07
Rely            0.99
solely          0.98
reliance        0.90
reliance        0.86
relied          0.85
blind           0.80
relies          0.80
Activations Density 0.118%