INDEX
Explanations
famous names
names of well-known individuals, particularly celebrities and public figures
Negative Logits
Token       Logit
Els         -0.77
Cth         -0.67
orage       -0.66
oard        -0.64
Agric       -0.62
ework       -0.61
ĪĴ          -0.60
req         -0.60
Sek         -0.60
Christie    -0.59
Positive Logits
Token       Logit
famously    1.07
attends     1.02
joked       0.96
himself     0.92
Himself     0.92
Jr          0.91
greets      0.90
tweeted     0.88
testified   0.84
aka         0.81
Activations Density 0.236%
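
As a rough illustration of what an activation density figure such as 0.236% could mean, the sketch below assumes it is the share of sampled tokens on which the feature's activation is above zero. That definition, the function name activation_density, the threshold parameter, and the toy data are all assumptions made for illustration; they are not taken from this page.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature fires at all (activation > 0), expressed as a percentage.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return 100.0 * active / len(values)


# Hypothetical toy data: four tokens, one of which activates the feature.
sample = [0.0, 0.0, 3.2, 0.0]
print(f"{activation_density(sample):.3f}%")  # prints "25.000%"
```

Under this reading, a density of 0.236% would correspond to the feature firing on roughly 2 to 3 of every 1,000 sampled tokens.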