INDEX
Explanations
phrases or sentences indicating permission or capability
phrases indicating capability or permission
Negative Logits
ãĤ©       -0.76
athan     -0.75
è¦ļéĨĴ    -0.73
ãĥĩãĤ£    -0.71
ascus     -0.70
ãĤ¦ãĤ¹    -0.70
å°Ĩ       -0.69
女        -0.68
lean      -0.68
schild    -0.67
Positive Logits
us             1.32
them           1.06
him            0.94
users          0.93
me             0.91
researchers    0.87
consumers      0.82
viewers        0.81
anyone         0.81
investigators  0.80
Activations Density 0.084%