NEGATIVE LOGITS
mandato   0.52
Mam       0.50
malu      0.48
Fitting   0.48
Mam       0.48
malo      0.48
සං        0.48
হাওয়া     0.48
挠        0.47
grief     0.47
POSITIVE LOGITS
improvements  0.65
Improvements  0.64
improvements  0.63
improvement   0.57
Verbesser     0.54
Improvements  0.54
Improve       0.54
改善          0.50
improvement   0.50
Improvement   0.49
Activations Density 0.070%