INDEX
Explanations
financial or business-related terms
nouns and concepts related to significant societal and personal issues
Negative Logits
ebin      -0.77
+.        -0.75
attRot    -0.69
%.        -0.66
.).       -0.65
.''.      -0.61
ãĢĤ       -0.58
versa     -0.58
().       -0.58
'.        -0.58
Positive Logits
of             0.84
aboard         0.59
weet           0.59
surrounding    0.56
plag           0.56
throughout     0.55
in             0.55
belonging      0.55
outhern        0.54
inside         0.53
Activations Density 0.795%