INDEX
NEGATIVE LOGITS
,          0.52
/          0.44
somewhat   0.44
.          0.43
or         0.43
).         0.43
type       0.42
helpful    0.42
_          0.42
useful     0.42
POSITIVE LOGITS
Każ        0.66
każdej     0.56
Ogni       0.55
Every      0.53
YOU        0.53
Abbiamo    0.53
Think      0.52
NOTHING    0.52
Siamo      0.52
Estamos    0.51
Activations Density 0.003%