Negative Logits
Token         Logit
circ          -0.08
fort          -0.07
ACS           -0.07
Yale          -0.07
_save         -0.07
Glo           -0.06
Created       -0.06
Throws        -0.06
Fort          -0.06
.Permission   -0.06
Positive Logits
Token         Logit
LOB           0.07
!”            0.07
ハ            0.06
              0.06
",-           0.06
strands       0.06
мент          0.06
bla           0.06
/http         0.06
saldır        0.06
Activations Density: 0.004%