INDEX
Explanations
proper nouns related to music, celebrities, and social/political figures
references to popular musicians and cultural figures
Negative Logits
stricken              -0.73
Reviewer              -0.72
osponsors             -0.71
guiActiveUnfocused    -0.71
bnb                   -0.67
orally                -0.66
HF                    -0.62
HM                    -0.62
weeney                -0.62
dipping               -0.61
Positive Logits
ĭ                     0.93
Ĵ                     0.92
©                     0.92
ª                     0.90
ī                     0.87
ł                     0.86
ĩ                     0.86
ĥ                     0.85
Ł                     0.84
cé                    0.84
Activations Density 0.049%