Explanations
phrases related to translation and language
Negative Logits
ndra         -0.91
achine       -0.81
--+          -0.81
Gamble       -0.78
oeuv         -0.73
pton         -0.72
ramid        -0.69
appiness     -0.69
pload        -0.68
sbm          -0.68
Positive Logits
translation   1.16
translations  1.11
transl        1.04
translator    1.02
subtitles     1.00
Translation   0.97
translation   0.97
interpre      0.89
translated    0.89
interpreter   0.88
Activations Density: 0.023%