INDEX
Explanations
concepts related to the future and sustainability
Head Attr Weights
0:0.34
1:0.04
2:0.08
3:0.04
4:0.05
5:0.06
6:0.03
7:0.02
8:0.14
9:0.05
10:0.04
11:0.04
Negative Logits
resa        -1.64
attm        -1.59
Debor       -1.54
andestine   -1.51
Rita        -1.47
answ        -1.42
pas         -1.42
Bagg        -1.41
ktop        -1.41
:]          -1.40
Positive Logits
faster      1.89
Faster      1.87
vs          1.82
slower      1.80
louder      1.78
less        1.76
improve     1.66
fewer       1.64
Better      1.62
�           1.62
Activations Density 0.013%