Negative Logits
breaking      0.91
prevent       0.75
preventive    0.74
broken        0.74
prevention    0.74
active        0.73
మాట్లాడు      0.72
あ            0.72
对了          0.70
personal      0.70
Positive Logits
obliged       1.54
complied      1.48
oblige        1.42
comply        1.34
complies      1.24
obeyed        1.21
complying     1.20
dutiful       1.16
hesitated     1.15
obedient      1.15
Activations Density 0.096%