Explanations
phrases related to political and governmental topics
elements related to political terminology and discussions
Negative Logits
tein            -0.56
repl            -0.52
omorphic        -0.52
compos          -0.51
diplom          -0.48
happiest        -0.48
hindsight       -0.47
photographic    -0.47
lingu           -0.46
repeating       -0.46
Positive Logits
ities           0.73
iance           0.67
action          0.64
izes            0.64
ittee           0.60
ariat           0.60
ies             0.60
ics             0.59
ogy             0.59
iaz             0.58
Activations Density 0.686%