INDEX
Explanations
specific names of individuals or organizations
mentions of the name "Starr."
Negative Logits
Moroc       -0.80
orate       -0.77
raltar      -0.74
NCT         -0.72
erala       -0.70
icut        -0.68
icipated    -0.67
itarian     -0.66
tymology    -0.66
gdala       -0.66
Positive Logits
Starr       0.81
pling       0.79
ructure     0.75
atic        0.74
wcsstore    0.72
vation      0.72
rano        0.71
pled        0.70
itute       0.69
arts        0.68
Activations Density 0.032%