INDEX
Explanations
instances of political pressure and lobbying activities
Negative Logits
warts                 -0.19
longleftrightarrow    -0.15
eldorf                -0.15
Revenge               -0.15
QUE                   -0.14
icipant               -0.14
Prompt                -0.14
reira                 -0.14
ustos                 -0.13
Prompt                -0.13
Positive Logits
pressure              0.42
pressure              0.34
Pressure              0.34
-pressure             0.32
lobbying              0.32
Pressure              0.30
effort                0.29
pressures             0.28
efforts               0.27
appeals               0.26
Activations Density: 0.097%