Explanations
references to local news and community engagement in the context of health-related issues
Negative Logits
Token    Logit
â̦    -0.44
[â̦]    -0.35
â̦.    -0.33
..    -0.29
[â̦    -0.26
â̦.    -0.26
â̦    -0.23
âĢIJ    -0.23
"..    -0.22
:[    -0.21
Positive Logits
Token    Logit
-    0.41
-↵    0.36
(ph    0.35
-↵↵    0.33
...↵    0.32
,...↵    0.30
-,    0.27
...↵↵    0.25
,...↵↵    0.24
-(    0.22
Activations Density 0.004%