INDEX
Explanations
names of celebrities
references to awards or notable achievements
Negative Logits
Token          Logit
clair          -0.69
interstitial   -0.68
dden           -0.67
alignment      -0.66
lateral        -0.64
LOG            -0.64
removable      -0.63
phyl           -0.62
logged         -0.62
igmatic        -0.61
Positive Logits
Token          Logit
Rus            2.24
Wins           2.17
Gos            1.83
Won            1.49
Rus            1.17
Bis            1.06
Kaw            1.05
Cars           1.05
Wars           1.00
Krish          0.97
Activations Density: 0.014%