INDEX
Explanations
information related to politics, government, and current events
Negative Logits
Token       Logit
Bridge      -0.69
Applic      -0.63
Moves       -0.62
ilts        -0.61
rendition   -0.59
Adin        -0.58
revisions   -0.57
merger      -0.57
legal       -0.57
Pos         -0.56
Positive Logits
Token       Logit
yourself    1.35
yourselves  1.25
your        0.83
wondering   0.81
Yourself    0.79
Tube        0.77
hear        0.73
agine       0.73
YOUR        0.72
guessed     0.72
Activations Density: 2.434%