Explanations
Frequent occurrences of the word "the"
Tokens preceding nouns or titles
"the" followed by specific nouns
Negative Logits
Token        Logit
sanitaires   -0.68
sauvages     -0.67
mukaan       -0.66
löytyy       -0.65
čierna       -0.65
sienta       -0.64
aquilo       -0.63
braccia      -0.63
mellett      -0.62
esetén       -0.60
Positive Logits
Token        Logit
same         0.96
")));        0.94
"]}          0.89
']}          0.89
latter       0.89
)];          0.88
".           0.86
entire       0.86
following    0.85
"]           0.81
Activation Density: 0.153%
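
The density figure is most naturally read as the share of sampled tokens on which the feature fires at all. The source does not define it, so that reading is an assumption. A minimal sketch under that assumption follows; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation; the source page does not define it.
    """
    if not activations:
        raise ValueError("need at least one token activation")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 3 of 2,000 tokens.
sample = [0.0] * 1997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 0.153% would mean the feature is active on roughly 1 token in 650.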