INDEX
Explanations
the word "English" in a document
references to the English language
Negative Logits
prus       -0.88
onies      -0.84
apego      -0.83
aunder     -0.83
olls       -0.81
xtap       -0.79
igslist    -0.79
psy        -0.78
utic       -0.76
udeau      -0.76
Positive Logits
English        1.07
translation    0.93
English        0.92
translations   0.90
language       0.85
english        0.83
spe            0.80
Dictionary     0.80
Language       0.79
Portuguese     0.79
Activations Density: 0.019%
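
The density figure can be read as the share of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption: a position counts as active when its activation is above a threshold of zero. The function name `activation_density` and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of token positions with a
    nonzero activation. The dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 19 active positions out of 100,000 tokens.
toy = [1.5] * 19 + [0.0] * 99_981
print(f"{activation_density(toy):.3f}%")  # prints 0.019%
```

At a density this low, the feature is active on roughly one token in five thousand, which fits a feature tied to a specific word such as "English".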