INDEX
NEGATIVE LOGITS
Fol         0.38
endorsed    0.37
Accuracy    0.36
buffered    0.36
বক্তব্যের    0.36
fol         0.36
Concern     0.36
Ln          0.36
decouple    0.36
là          0.35
POSITIVE LOGITS
openly      0.61
smack       0.58
ative       0.53
ativeness   0.52
abiert      0.51
turkey      0.50
nonsense    0.50
ATIVE       0.50
Smack       0.48
shop        0.47
Activations Density 0.023%