Explanations
people's names and titles, especially in news articles and reports
Negative Logits
Token       Logit
yond        -0.56
$.          -0.56
eternity    -0.52
Lear        -0.51
access      -0.50
soType      -0.48
レ          -0.47
ック        -0.47
dden        -0.47
-+          -0.46
Positive Logits
Token       Logit
meanwhile   1.11
has         1.03
took        1.01
gave        1.01
weighs      0.99
consists    0.99
consisted   0.97
reacted     0.96
went        0.96
recognizes  0.95
Activations Density: 1.204%
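
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that "activations density" means the fraction of sampled token positions on which the feature's activation exceeds zero. That definition, the function name activation_density, the threshold parameter, and the sample values are all assumptions made for illustration; they are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature is active.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 250 token positions, 3 of which activate the feature.
sample = [0.0] * 247 + [0.8, 1.5, 0.3]
density = activation_density(sample)

# Format as a percentage with three decimals, matching the page's style.
print(f"{density:.3%}")
```

A feature with a density near 1% fires on roughly one token in a hundred under this definition, which would be consistent with a fairly selective feature.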