Explanations
names related to famous individuals
references to prominent individuals and entities, particularly those related to music and sports
Negative Logits
lights    -0.79
liness    -0.69
chell     -0.69
Nature    -0.69
loo       -0.69
cat       -0.67
equity    -0.63
inki      -0.62
eki       -0.62
chens     -0.61
Positive Logits
ENTS      0.98
asted     0.86
illery    0.83
antly     0.81
ents      0.81
oug       0.80
IVES      0.79
ORED      0.78
issance   0.78
asting    0.78
Activations Density: 0.084%