INDEX
Explanations
information related to politics and policies
Negative Logits
rament    -0.90
izons     -0.87
kees      -0.83
ottage    -0.82
ocene     -0.79
anooga    -0.79
adena     -0.78
avez      -0.78
oes       -0.77
osit      -0.75
Positive Logits
conclud      0.77
repeats      0.77
revolving    0.74
Thomson      0.73
forth        0.73
reiterate    0.72
Transcript   0.72
echoing      0.69
nces         0.68
invoking     0.68
Activations Density: 4.963%