INDEX
NEGATIVE LOGITS
htt                 -0.71
士                  -0.61
sclerosis           -0.57
ishable             -0.55
Contra              -0.55
Telecommunications  -0.54
CVE                 -0.54
deter               -0.54
Plain               -0.53
overse              -0.53
POSITIVE LOGITS
ificial             1.64
ifacts              1.52
icles               1.49
illery              1.47
ifact               1.46
isan                1.43
isans               1.26
istical             1.22
emis                1.21
ific                1.21
Activations Density 0.028%