INDEX
Explanations
phrases indicating problems or challenges
Negative Logits
favour     -0.15
favor      -0.14
inne       -0.14
ãĥ³ãĥķ     -0.14
AYER       -0.14
postcode   -0.14
Ed         -0.14
213        -0.14
/cal       -0.14
kın        -0.14
Positive Logits
opak       0.19
cht        0.17
trap       0.16
mé         0.16
ance       0.15
ifes       0.15
bjerg      0.15
ival       0.15
clus       0.15
ares       0.15
Activations Density: 0.381%