INDEX
NEGATIVE LOGITS
“When       -0.08
"When       -0.07
“And        -0.07
Jul         -0.07
“How        -0.06
controls    -0.06
Workflow    -0.06
<l          -0.06
ιακ         -0.06
where       -0.06
POSITIVE LOGITS
famous      0.10
Famous      0.08
-known      0.07
風          0.07
 ̄ ̄ ̄      0.07
infamous    0.07
herald      0.06
Associated  0.06
Tel         0.06
/report     0.06
Activations Density: 0.017%