Negative Logits
枭           0.42
επισ        0.39
madness     0.39
climbing    0.39
dangerous   0.38
sirven      0.38
hazardous   0.38
embezz      0.38
گزار        0.37
للخ         0.37
Positive Logits
behavior    0.43
extensions  0.40
model       0.40
UTCTime     0.38
MODEL       0.38
Behavior    0.37
管制        0.37
ijk         0.37
Behavior    0.37
libert      0.36
Activations Density 0.000%