Explanations
references to political discussion topics
Negative Logits
Token             Logit
ãĥŃãĥ¼            -0.17
endale            -0.16
BootApplication   -0.16
ůr                -0.15
stro              -0.15
mousemove         -0.15
ControlEvents     -0.14
ichten            -0.14
ory               -0.14
lest              -0.14
Positive Logits
Token             Logit
topic             0.54
topics            0.53
Topic             0.47
topic             0.43
discussion        0.43
Topics            0.42
Topic             0.40
topics            0.40
-topic            0.40
discuss           0.39
Activations Density: 0.219%