Explanations
specific mentions of locations or organizations in official contexts
references to specific entities or organizations
Negative Logits
/"      -0.78
udder   -0.72
?).     -0.72
—"      -0.70
cum     -0.70
?),     -0.70
="/     -0.69
___     -0.68
Adds    -0.68
pless   -0.67
Positive Logits
resa         1.08
importance   1.00
integrity    1.00
totality     0.99
seriousness  0.98
utmost       0.97
slightest    0.95
timing       0.94
intention    0.93
urgency      0.93
Activations Density 0.294%