Explanations
phrases related to politics and government
Negative Logits
ortium       -0.72
Beir         -0.72
thening      -0.70
inki         -0.63
fty          -0.62
optic        -0.59
Apost        -0.56
Shipping     -0.56
identally    -0.55
brow         -0.55
Positive Logits
wright       1.20
ername       1.05
plays        1.04
wr           0.89
played       0.88
GROUND       0.88
Piano        0.86
poker        0.84
testers      0.84
testing      0.83
Activations Density: 3.480%