Explanations
proper nouns related to politics and news
Negative Logits
olulu      -0.73
FU         -0.68
Krish      -0.67
ories      -0.66
Beir       -0.66
Celt       -0.65
Tamil      -0.64
Pixie      -0.64
Artemis    -0.62
raints     -0.61
Positive Logits
Jr            1.14
's            1.06
Sr            0.97
supporters    0.93
himself       0.93
Jinping       0.92
confid        0.89
omics         0.89
supporter     0.86
III           0.85
Activations Density: 3.594%