Explanations
phrases related to transportation, specifically involving bikes and trains
Negative Logits
Token          Logit
cious          -0.63
capt           -0.58
acqu           -0.58
Ore            -0.56
BILITY         -0.55
interstitial   -0.55
iliar          -0.55
tyr            -0.54
heartbeat      -0.54
ery            -0.54
Positive Logits
Token          Logit
fitted         1.27
stretched      1.25
ta             1.13
doors          1.09
door           0.99
wards          0.98
smart          0.96
casts          0.91
skirts         0.89
fitting        0.89
Activations Density 0.093%