Negative Logits
鯊    0.42
BufOffset    0.40
경상    0.40
weathered    0.39
साहस    0.39
splitter    0.39
ሽ    0.39
rocked    0.38
irregular    0.38
rock    0.38
Positive Logits
authoritarian    0.85
surveillance    0.83
totalitarian    0.81
coercive    0.77
Surveillance    0.77
control    0.76
controlling    0.75
control    0.75
控制    0.75
контроль    0.71
Activations Density 0.429%