INDEX
Negative Logits
worry        0.41
policy       0.39
ww           0.38
referencing  0.38
author       0.37
inherited    0.37
outages      0.37
WWII         0.36
upl          0.35
ovanja       0.35
Positive Logits
Dependent    0.42
聪           0.42
عندك         0.41
الجرس        0.40
Dependent    0.38
炳           0.38
弦           0.37
عندي         0.37
Insertion    0.36
طرق          0.35
Activations Density 0.000%