Negative Logits
distinguishes    0.96
distinguish      0.89
differentiates   0.83
distinctions     0.78
differs          0.74
angan            0.73
nedenle          0.71
が増             0.71
diminishes       0.70
আলাদা            0.69
Positive Logits
1.35
1.16
1.13
1.00
0.98
0.96
0.93
0.92
0.85
0.85
Activations Density: 0.007%