Negative Logits
নকে  0.78
hiszen  0.72
совершен  0.71
निरंतर  0.71
mlp  0.71
celebration  0.70
无论是  0.70
maß  0.69
harmonious  0.68
осуществления  0.67
Positive Logits
usually  1.06
usually  1.02
meestal  0.97
instructions  0.87
biasanya  0.82
Usually  0.82
াতিক  0.81
把你  0.80
vary  0.80
varies  0.79
Activations Density 0.775%