Explanations
references to trains and railroad-related terms
Negative Logits
Token     Logit
gham      -0.65
sevi      -0.63
Dorado    -0.63
bersi     -0.62
Tanto     -0.61
جوايز     -0.60
tory      -0.59
isks      -0.58
Siro      -0.57
Wolfe     -0.56

Positive Logits
Token     Logit
trains    1.45
Train     1.45
train     1.43
Trains    1.35
TRAIN     1.34
Trains    1.30
Train     1.29
trains    1.23
TRAIN     1.20
train     1.16

Activations Density: 0.085%