INDEX
Explanations
references to aviation and aircraft-related terminology
Negative Logits
akin      -0.18
enci      -0.17
rien      -0.17
embr      -0.16
風        -0.16
星        -0.15
hin       -0.15
eration   -0.15
aptop     -0.15
rail      -0.15
Positive Logits
/bus       0.22
crash      0.19
Crash      0.17
-readable  0.17
carrier    0.16
foil       0.16
avana      0.16
engines    0.16
crashes    0.16
/train     0.15
Activations Density 0.040%