INDEX
Explanations
proper nouns and specific terms related to art, film, and literature
Negative Logits
âĹ¼               -0.87
Boone             -0.71
upt               -0.69
ask               -0.67
realDonaldTrump   -0.66
oke               -0.65
Clever            -0.65
mater             -0.65
hip               -0.63
ç¥ŀ               -0.60
Positive Logits
Fighters          0.93
ruary             0.91
letcher           0.91
ruits             0.89
luent             0.85
ornia             0.83
andom             0.81
ocused            0.80
ergus             0.80
iture             0.79
Activations Density: 11.079%