Explanations
mentions of religious figures and locations like nuns, Sisters, and convents
references to nuns and their roles in various contexts
Negative Logits
Token      Logit
à¸         -0.72
eters      -0.66
azing      -0.65
ozy        -0.64
activity   -0.64
ðĿ         -0.63
resp       -0.62
HAM        -0.61
ERG        -0.61
=-=-=-=-   -0.61
Positive Logits
Token      Logit
nuns       1.06
nery       1.01
nun        0.97
convent    0.89
Sisters    0.87
Sister     0.84
choir      0.82
cess       0.82
Lucia      0.80
ipation    0.78
Activations Density: 0.017%