INDEX
Explanations
phrases indicating ownership or responsibility
Negative Logits
oward         -0.77
weet          -0.75
burgh         -0.65
Wow           -0.65
ru            -0.65
onica         -0.63
ews           -0.62
ins           -0.62
ileaks        -0.62
Suddenly      -0.61
Positive Logits
therefore      1.39
hence          1.30
cannot         1.23
excludes       1.18
thus           1.17
includes       1.16
consequently   1.11
should         1.11
may            1.09
requires       1.09
Activations Density 0.266%
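The page gives the density figure without defining it. A minimal sketch follows, assuming that "activation density" means the share of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 1000 token positions.
toy = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(toy):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.266% would mean the feature is active on roughly 1 in every 376 tokens.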