INDEX
Explanations
named entities, particularly well-known public figures
names of notable individuals, particularly celebrities and public figures
Negative Logits
Token      Logit
Els        -0.73
Ire        -0.66
orage      -0.65
ngth       -0.63
rium       -0.63
oard       -0.62
Region     -0.61
itors      -0.60
Georg      -0.59
esville    -0.58
Positive Logits
Token      Logit
steen      0.88
Jr         0.79
agher      0.77
aka        0.75
famously   0.73
III        0.73
headlined  0.72
QC         0.71
Presents   0.71
gren       0.70
Activations Density 0.205%
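
The density figure above can be read as the share of tokens on which the feature fires, so 0.205% is roughly 2 tokens per 1,000. The sketch below shows one way such a figure might be computed. It assumes that "fires" means an activation strictly greater than zero, which this page does not state. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations over 8 tokens; the feature fires on 2 of them.
sample = [0.0, 0.0, 3.1, 0.0, 0.0, 0.0, 1.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

A real dashboard would compute this over a much larger token sample; the eight-token list only keeps the arithmetic easy to check by hand.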