INDEX
Explanations
references to coal-fired power plants and their operational status
Negative Logits
vise      -0.17
Dudley    -0.14
subs      -0.13
wine      -0.13
_OPT      -0.13
vz        -0.13
/tool     -0.13
gars      -0.13
ÑĤи       -0.13
dsl       -0.13
Positive Logits
coal      0.32
power     0.31
plants    0.29
coal      0.26
Coal      0.26
plant     0.25
Power     0.25
POWER     0.25
power     0.25
plants    0.25
Activations Density: 0.069%