NEGATIVE LOGITS
formulations    0.54
formalism       0.52
consequences    0.47
definitions     0.47
procedures      0.46
algorithms      0.46
umož            0.46
reap            0.46
obfusc          0.46
panels          0.46
POSITIVE LOGITS
hobbies         0.83
любит           0.82
Personality     0.75
性格            0.75
좋아하는        0.75
Hobbies         0.74
Interests       0.74
люблю           0.71
loves           0.71
喜欢            0.68
ACTIVATIONS DENSITY
0.141%