INDEX
Explanations
mentions of various types of aircraft and related terms
Negative Logits
星        -0.16
venues    -0.16
dera      -0.16
ally      -0.15
_iters    -0.15
風        -0.15
enci      -0.15
hin       -0.15
phia      -0.15
rien      -0.15
Positive Logits
/bus       0.19
-readable  0.19
crash      0.17
wright     0.17
/train     0.16
arium      0.16
foil       0.16
avana      0.16
olk        0.16
Crash      0.16
Activations Density 0.027%