INDEX
Explanations
dominant themes and constructs within discussions or narratives related to social issues
Head Attr Weights
0:0.05
1:0.03
2:0.05
3:0.10
4:0.06
5:0.11
6:0.02
7:0.02
8:0.35
9:0.05
10:0.07
11:0.04
Negative Logits
emn          -1.62
cooperate    -1.61
gate         -1.55
cooperating  -1.54
imet         -1.53
ographs      -1.52
ahi          -1.51
imeters      -1.50
chairs       -1.49
█            -1.47
Positive Logits
nevertheless        2.61
nonetheless         2.53
also                2.29
etheless            2.19
terday              1.94
quickShipAvailable  1.90
withd               1.86
ALSE                1.78
��                  1.68
theless             1.66
Activations Density 0.105%