Explanations
references to people being taken somewhere or involving locations and actions taken towards them
instances of action or status updates regarding individuals or events
Negative Logits
incent          -0.79
detecting       -0.76
describ         -0.76
frontline       -0.75
speeding        -0.72
reluct          -0.71
prosec          -0.70
honoured        -0.70
quir            -0.69
periodic        -0.68
Positive Logits
Reviewer        1.31
.?              1.20
Loading         1.17
.-              1.15
âĸł             1.14
SOURCE          1.12
Article         1.11
[_              1.10
RAW             1.10
Advertisements  1.10
Activations Density 0.131%