Explanations
specific text characters, likely related to a particular language or encoding
specific hyphenated words or terms related to political context
Negative Logits
Zot         -0.64
Admir       -0.64
Introduced  -0.63
Brill       -0.63
Cruiser     -0.62
trumpet     -0.59
Breath      -0.59
Booker      -0.58
cart        -0.56
Newsletter  -0.55
Positive Logits
etry        0.79
opian       0.78
agog        0.77
opic        0.77
itary       0.72
ampton      0.71
ét          0.69
ogn         0.69
oki         0.68
cially      0.68
Activations Density 0.119%