INDEX
Explanations
proper nouns consisting of the letters "Sha"
mentions of specific names and prominent individuals
Negative Logits
anwhile     -0.88
ktop        -0.87
etheless    -0.82
gran        -0.81
olesc       -0.75
ications    -0.72
ollower     -0.71
icted       -0.71
ancy        -0.70
ĨĴ          -0.70
Positive Logits
IELD        0.86
roud        0.78
ning        0.73
rike        0.71
pler        0.71
ned         0.70
IFT         0.69
ful         0.69
ivas        0.68
pling       0.68
Activations Density: 0.058%