Explanations
proper nouns or names of companies/organizations
introductory phrases and words that indicate emphasis or specifics
Negative Logits
ãĢĤ                -0.63
?).                -0.62
ðŁĻĤ               -0.61
Annotations        -0.58
.*                 -0.57
ãĤĭ                -0.57
ãĢĮ                -0.56
.).                -0.56
↵Âł                -0.56
----------------   -0.56
Positive Logits
withstanding       1.03
resa               1.02
xiety              0.91
icularly           0.89
%"                 0.89
usterity           0.87
[                  0.86
odore              0.86
ctions             0.84
ircraft            0.83
Activations Density 0.256%
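
Several entries under Negative Logits (for example `ãĢĤ`, `ðŁĻĤ`, `ãĤĭ`) look like encoding debris, but they are consistent with the printable byte alphabet that GPT-2-style byte-level BPE tokenizers use to display raw bytes. The sketch below assumes that this is the tokenizer behind the feature, which the page itself does not state. It rebuilds the byte-to-character table and maps a token string back to readable text. `decode_token` is a hypothetical helper name chosen for illustration, not part of any library.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table of GPT-2-style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in keep}
    # Every remaining byte (controls, space, 0x7F-0xA0, 0xAD) is shifted
    # to consecutive code points starting at 256.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Map a byte-level token string back to text (hypothetical helper)."""
    raw = bytearray()
    for ch in token:
        if ch in _CHAR_TO_BYTE:
            raw.append(_CHAR_TO_BYTE[ch])
        else:
            # Assumption: characters outside the table (such as a dashboard's
            # newline glyph) are display markers, so they pass through as-is.
            raw.extend(ch.encode("utf-8"))
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĢĤ", "ðŁĻĤ", "ãĤĭ", "ãĢĮ", "withstanding"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, `ãĢĤ` decodes to the ideographic full stop `。`, `ðŁĻĤ` to the emoji `🙂`, `ãĤĭ` to the hiragana `る`, and `ãĢĮ` to the opening bracket `「`, while plain ASCII pieces such as `withstanding` come back unchanged.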