INDEX
Negative Logits
Token       Logit
UCE         -0.07
bondage     -0.07
boundary    -0.06
Cic         -0.06
stry        -0.06
mattered    -0.06
깨          -0.06
Suz         -0.06
_keys       -0.06
スク        -0.06
Positive Logits
Token       Logit
report      0.11
Report      0.11
Report      0.09
reports     0.09
_report     0.08
report      0.08
보고        0.08
_Report     0.07
سات         0.07
ても        0.07
Activations Density: 0.022%