Explanations
terminology and concepts related to language, translation, and interpretation
Negative Logits
Token      Logit
ADR        -0.16
enda       -0.15
ime        -0.14
469        -0.14
/routes    -0.14
areth      -0.14
baugh      -0.14
VERR       -0.13
Gan        -0.13
alive      -0.13
Positive Logits
Token      Logit
nard       0.18
estar      0.16
ards       0.15
ÃŃsk       0.14
û          0.14
ãĥ«ãĥķ     0.14
Osama      0.14
íݸ         0.14
otine      0.14
Franken    0.14
Activations Density: 0.076%