Explanations
phrases related to societal or cultural norms
references to societal or cultural norms
Negative Logits
Savings        -0.76
Investments    -0.68
afia           -0.65
ulhu           -0.64
eries          -0.64
Sear           -0.63
Giles          -0.63
ierce          -0.61
azar           -0.61
County         -0.60
Positive Logits
norm           3.95
norm           2.28
Norm           1.88
Norm           1.73
norms          1.72
normal         1.49
normal         1.44
normative      1.28
Normal         1.25
horizont       1.17
Activations Density: 0.008%