Explanations
keywords related to environmental and societal impacts
Head Attr Weights
0:0.03
1:0.02
2:0.08
3:0.35
4:0.07
5:0.03
6:0.03
7:0.03
8:0.05
9:0.06
10:0.09
11:0.11
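The weights above can be summarized programmatically. This is a minimal sketch, assuming the values are per-attention-head attribution weights rounded to two decimals; the listing does not say how they are normalized, and the variable names are illustrative only.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.02, 2: 0.08, 3: 0.35, 4: 0.07, 5: 0.03,
    6: 0.03, 7: 0.03, 8: 0.05, 9: 0.06, 10: 0.09, 11: 0.11,
}

# Head with the largest listed weight.
top_head = max(head_weights, key=head_weights.get)

# Sum of the listed weights; it need not equal 1 because the values are rounded.
total = sum(head_weights.values())

# Fraction of the listed total that the top head accounts for.
share = head_weights[top_head] / total

print(top_head, round(total, 2), round(share, 2))  # prints: 3 0.95 0.37
```

Under these assumptions, head 3 carries the largest listed weight, a little over a third of the listed total.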
Negative Logits
ptoms        -1.77
ichever      -1.68
theless      -1.60
etheless     -1.60
��           -1.59
notations    -1.56
dding        -1.52
Ire          -1.51
untled       -1.51
whatever     -1.50
Positive Logits
..."         2.31
.","         1.97
…"           1.87
."           1.83
.")          1.78
");          1.73
Phen         1.66
";           1.64
Replay       1.64
\",          1.63
Activations Density: 0.029%