Negative Logits
lerinden          0.44
感じ              0.42
lerim             0.40
alloween          0.38
avven             0.38
consider          0.38
lerden            0.38
lerdir            0.38
label             0.37
晟                0.37

Positive Logits
+(-               0.84
минус             0.71
÷                 0.70
-(-               0.67
Multiplication    0.66
minus             0.62
subtracted        0.62
multiplication    0.61
subtraction       0.61
+(                0.61
Activations Density 0.021%