INDEX
NEGATIVE LOGITS
what       0.72
best       0.71
Best       0.63
campsites  0.62
dinner     0.62
autograph  0.61
rendu      0.61
签名       0.61
Best       0.61
justice    0.61
POSITIVE LOGITS
gebruiken   0.92
To          0.86
Do          0.83
To          0.83
Do          0.82
verwenden   0.80
utilizzare  0.78
lat         0.77
사용하는    0.76
Improving   0.76
Activations Density 0.051%