Explanations
names or words associated with individuals
proper nouns, particularly names and locations
Negative Logits
krit       -0.71
Kislyak    -0.65
tremend    -0.64
yip        -0.64
doi        -0.63
counsel    -0.61
indebted   -0.59
Kenobi     -0.58
ensional   -0.58
Bethesda   -0.58
Positive Logits
onne       0.81
reau       0.80
odor       0.77
utical     0.77
ohydrate   0.76
ohyd       0.75
heid       0.73
inho       0.72
nect       0.72
ades       0.72
Activations Density: 0.098%