Explanations
Mentions of, and references to, specific names and topics in written content, likely from social media or news articles.
Negative Logits
cumbers      -0.80
outsiders    -0.70
embargo      -0.69
exha         -0.68
breadth      -0.68
salient      -0.68
passage      -0.67
carbohyd     -0.67
bree         -0.66
increment    -0.66
Positive Logits
_-_          1.30
_            1.28
_(           1.22
_.           1.21
__           1.20
@            1.18
Jr           1.14
()           1.08
/,           0.99
Stud         0.99
Activations Density: 0.907%