INDEX
Explanations
specific mentions of names, places, and titles, especially focusing on individuals and organizations
specific brand or company names
Negative Logits
Token         Logit
theless       -0.74
anwhile       -0.69
pleas         -0.69
stride        -0.67
altogether    -0.67
yourselves    -0.65
Petraeus      -0.64
wise          -0.64
root          -0.62
Warwick       -0.61
Positive Logits
Token         Logit
ifles         1.11
eworks        1.10
istries       1.08
astery        1.03
Ltd           0.98
uctions       0.97
ifts          0.95
ework         0.94
ipel          0.93
itars         0.92
Activations Density 0.307%