INDEX
Explanations
proper nouns, specifically names of people and places
academic references or citations
Negative Logits
Token        Logit
acebook      -0.74
illion       -0.72
roximately   -0.68
canon        -0.68
oaded        -0.67
unity        -0.67
jriwal       -0.66
terday       -0.66
Bound        -0.65
olar         -0.64
Positive Logits
Token         Logit
et            1.05
supra         0.99
itsch         0.86
pp            0.85
Associates    0.84
JA            0.77
Jr            0.75
unpublished   0.71
MA            0.71
Epidem        0.71
Activations Density: 0.113%