INDEX
Explanations
references to socioeconomic status and the marginalized in society
Head Attr Weights
0:0.22
1:0.00
2:0.13
3:0.03
4:0.08
5:0.04
6:0.06
7:0.03
8:0.16
9:0.03
10:0.08
11:0.07
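
The head attribution weights above can be ranked to see which heads dominate. The following is a minimal sketch, assuming each `index:weight` entry is one attention head's share of the attribution (the page does not define the quantity); `rank_heads` is a hypothetical helper, not part of any tool. The listed values sum to 0.93 rather than 1.00, which may simply reflect rounding to two decimals.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_attr_weights = {
    0: 0.22, 1: 0.00, 2: 0.13, 3: 0.03, 4: 0.08, 5: 0.04,
    6: 0.06, 7: 0.03, 8: 0.16, 9: 0.03, 10: 0.08, 11: 0.07,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted by descending weight, ties by head index."""
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = rank_heads(head_attr_weights)
top3 = ranked[:3]

# Share of the total listed weight carried by the three strongest heads.
top3_share = sum(w for _, w in top3) / sum(head_attr_weights.values())

print(top3)
print(round(top3_share, 2))
```

Under that assumption, heads 0, 8, and 2 together carry a little over half of the listed weight.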
Negative Logits
groundwork    -1.83
���           -1.73
ゴン          -1.68
memorandum    -1.67
memo          -1.66
outlining     -1.65
morp          -1.64
estimating    -1.59
checklist     -1.57
forwarding    -1.57
Positive Logits
alike     2.62
rier      1.89
rals      1.79
glers     1.77
iers      1.76
urers     1.74
esters    1.72
vertis    1.70
riger     1.67
rers      1.65
Activations Density 0.011%
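
To put the density figure in concrete terms, here is a small worked calculation. It assumes "Activations Density" means the fraction of tokens on which the feature has a nonzero activation, which the page does not state explicitly; the variable names are illustrative.

```python
# Density as reported above, in percent.
density_percent = 0.011

# Convert to a fraction of tokens (assumed meaning: tokens with nonzero activation).
density = density_percent / 100.0

# Average number of tokens between activations, and expected activations per 1M tokens.
tokens_per_activation = 1.0 / density
expected_per_million = density * 1_000_000

print(round(tokens_per_activation))
print(round(expected_per_million))
```

Under that reading, the feature fires on roughly one token in nine thousand, or about 110 tokens per million.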