INDEX
Negative Logits
drafted         0.88
reviewed        0.85
ஏற்ற            0.80
lò              0.80
गे              0.79
Christendom     0.79
reviewed        0.78
Ду              0.77
પણે             0.77
Quiet           0.76
POSITIVE LOGITS
demonstrations  1.05
holdings        1.04
usage           1.00
trajectories    0.97
Usage           0.97
usages          0.94
usages          0.93
__:             0.90
demonstration   0.89
用法            0.89
Activations Density 0.067%