INDEX
NEGATIVE LOGITS
being      0.74
being      0.70
fiind      0.68
BEING      0.64
Being      0.62
étant      0.60
sendo      0.58
Being      0.56
siendo     0.56
actually   0.46
POSITIVE LOGITS
occasions      0.53
times          0.49
exceptions     0.48
fewer          0.47
differences    0.46
disagreements  0.43
takers         0.43
cases          0.42
able           0.41
room           0.40
Activations Density 0.004%