INDEX
Explanations
names and references related to politics and specific individuals
proper nouns and names associated with individuals
Negative Logits
itol          -0.72
©¶æ           -0.71
crunch        -0.66
achev         -0.65
amic          -0.64
artificially  -0.64
icipated      -0.62
ij士          -0.60
guiActiveUn   -0.60
estinal       -0.59
Positive Logits
schild        1.02
orthy         0.78
conn          0.73
Redditor      0.73
orld          0.72
aii           0.72
merce         0.71
enegger       0.71
WARD          0.70
maker         0.68
Activations Density 0.234%