INDEX
Explanations
phrases related to development and planning
Head Attr Weights
0:0.02
1:0.02
2:0.05
3:0.06
4:0.10
5:0.03
6:0.06
7:0.42
8:0.03
9:0.02
10:0.07
11:0.06
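The weights above can also be read programmatically. The following is a minimal sketch, assuming each line is a `head index:weight` pair; the names `head_attr_weights` and `dominant_head` are illustrative and not part of any tool. It picks out the head with the largest listed weight and sums the listed values.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each "i:w" line means head i has attribution weight w.
head_attr_weights = {
    0: 0.02, 1: 0.02, 2: 0.05, 3: 0.06,
    4: 0.10, 5: 0.03, 6: 0.06, 7: 0.42,
    8: 0.03, 9: 0.02, 10: 0.07, 11: 0.06,
}


def dominant_head(weights):
    """Return the (head, weight) pair with the largest weight."""
    return max(weights.items(), key=lambda kv: kv[1])


head, weight = dominant_head(head_attr_weights)
total = sum(head_attr_weights.values())

print(f"dominant head: {head} (weight {weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

Head 7 carries the largest listed weight (0.42). The listed values sum to 0.94 rather than 1.00, which may simply reflect rounding to two decimals.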
Negative Logits
ances    -1.53
NOTICE   -1.53
emo      -1.42
jee      -1.39
EE       -1.38
owan     -1.38
izens    -1.38
notice   -1.35
PLIED    -1.34
ACTED    -1.34
Positive Logits
anew         1.79
Sov          1.59
castles      1.58
foundations  1.56
effic        1.52
dens         1.52
stronger     1.51
fort         1.51
resilience   1.50
surviv       1.49
Activations Density 0.051%