INDEX
Explanations
words related to a specific person's name
mentions of specific names or proper nouns
Negative Logits
token         logit
meet          -0.86
ivities       -0.85
prising       -0.84
orative       -0.82
izons         -0.75
opian         -0.75
ivity         -0.74
utterstock    -0.73
innon         -0.73
unin          -0.71
Positive Logits
token         logit
Ronaldo       0.93
iano          0.85
issance       0.81
zzi           0.79
ivari         0.74
zzo           0.72
ÄŁ            0.68
elli          0.68
ity           0.67
da            0.65
Activations Density 0.022%
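
The page does not say how the density figure is computed. A minimal sketch, assuming it means the fraction of token positions on which the feature's activation is above zero (a common reading for this kind of dashboard, but an assumption here). The function name, the threshold parameter, and the sample data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 2 are active.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.020%"
```

Under that reading, a density of 0.022% would correspond to the feature firing on roughly 22 of every 100,000 token positions.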