Explanations
mentions of petrol or gasoline
Negative Logits
Token       Logit
g           -0.16
éļ          -0.15
1           -0.14
au          -0.14
Flynn       -0.14
Observable  -0.14
loom        -0.14
iqu         -0.14
Euro        -0.14
arding      -0.13
Positive Logits
Token       Logit
bish        0.17
hardt       0.17
adora       0.16
bud         0.16
kdir        0.16
onnen       0.15
_NPC        0.15
ascript     0.15
Giang       0.14
spd         0.14
Activations Density: 0.006%