INDEX
Explanations
references to roads and driving-related terms
references to roads and road-related terminology
Negative Logits
Token      Logit
arians     -0.78
tle        -0.75
volent     -0.71
uating     -0.65
Eps        -0.63
Wynne      -0.62
Mons       -0.62
ascript    -0.62
uates      -0.62
ears       -0.61
Positive Logits
Token      Logit
blocks     1.37
trip       1.27
block      1.26
show       1.23
ways       1.16
side       1.11
map        1.11
kill       1.06
runner     1.06
maps       1.05
Activations Density: 0.037%