INDEX
Explanations
mentions of 'Saint' as part of a proper noun
references to "Saint" followed by a place or title
Negative Logits
BIP          -0.88
PT           -0.75
ERG          -0.73
ORN          -0.73
razil        -0.73
PN           -0.71
OTH          -0.69
JUST         -0.67
iferation    -0.66
HO           -0.66
Positive Logits
Laurent       0.99
onew          0.89
arts          0.84
Louis         0.83
Clair         0.81
Lucia         0.78
ifully        0.78
clair         0.77
Augustine     0.76
Petersburg    0.75
Activations Density: 0.012%