INDEX
Explanations
mentions of specific geographical locations
references to environmental hazards or concerns
Negative Logits
}.         -0.70
]."        -0.70
)."        -0.70
)).        -0.65
!).        -0.62
therein    -0.61
]).        -0.57
ende       -0.56
>.         -0.56
%).        -0.54
Positive Logits
âĵĺ          0.62
iens         0.55
celebrates   0.54
consists     0.53
extends      0.52
greets       0.52
collided     0.51
Association  0.51
reacts       0.50
proposes     0.49
Activations Density: 0.803%