INDEX
Explanations
references to social media and online locations for engagement
Head Attr Weights
0:0.09
1:0.02
2:0.07
3:0.08
4:0.08
5:0.12
6:0.09
7:0.06
8:0.19
9:0.06
10:0.06
11:0.02
Negative Logits
Slack     -3.11
Volvo     -2.93
Burgess   -2.87
Py        -2.76
Bolivia   -2.73
ktop      -2.70
Peru      -2.68
¶         -2.65
GPL       -2.54
Python    -2.54
Positive Logits
NJ          4.72
NJ          4.22
jad         3.30
vance       3.28
synagogue   3.17
Jew         2.93
Rabbi       2.86
Advance     2.81
Rutgers     2.81
Torah       2.70
Activations Density: 0.001%