INDEX
Explanations
phrases indicating permission or restrictions
Negative Logits
McGu        -0.15
Kosten      -0.15
zman        -0.15
ifecycle    -0.15
ross        -0.15
anium       -0.14
oproject    -0.14
atham       -0.14
Bew         -0.14
غال         -0.14
Positive Logits
mention       0.42
mentions      0.30
mention       0.29
Mention       0.27
ment          0.27
mentioned     0.25
mentioning    0.24
forget        0.23
mentioned     0.21
worry         0.21
Activations Density 0.008%
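
For orientation, a density figure like the one above is usually read as the fraction of token positions on which the feature fires. The sketch below assumes that reading; the source does not say how the dashboard computes it. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical, chosen only so the arithmetic lands on the same order of magnitude.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the dashboard's exact definition is not
    given in the source.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 12,500 token positions.
sample = [0.0] * 12_499 + [3.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.008%
```

Under this reading, a density of 0.008% means the feature is active on roughly 8 tokens in every 100,000.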