INDEX
Explanations
reported speech, particularly phrases that attribute statements or opinions to individuals
Negative Logits
bum                 -0.15
att                 -0.15
alike               -0.15
aira                -0.14
aging               -0.14
ep                  -0.14
bum                 -0.14
alam                -0.13
ande                -0.13
LOS                 -0.13
Positive Logits
dụ                  0.15
longleftrightarrow  0.15
asca                0.15
.scalablytyped      0.15
Dün                 0.14
aepernick           0.14
644                 0.14
kový                0.14
aben                0.14
hma                 0.13
Activations Density: 0.040%