Negative Logits
reasons       0.42
次は          0.42
raisons       0.38
Reasons       0.37
mur           0.37
tenets        0.37
だったので    0.37
considerada   0.37
ниците        0.37
於是          0.36
Positive Logits
まったく      0.50
clearly       0.50
definitively  0.49
truly         0.48
Clearly       0.46
まさに        0.46
jelas         0.46
accurately    0.45
exactly       0.43
dokładnie     0.43
Activations Density: 0.005%