INDEX
Explanations
references to the Pope
Negative Logits
Token          Logit
iable          -0.76
elong          -0.75
productive     -0.72
nerg           -0.70
Miner          -0.68
Cooperative    -0.68
iance          -0.67
Mines          -0.67
eeds           -0.66
srf            -0.66
Positive Logits
Token          Logit
pope           1.33
pont           1.04
iscopal        0.85
ashore         0.76
bishop         0.75
icular         0.74
franc          0.72
ocalypse       0.72
Pope           0.71
romeda         0.70
Activations Density 0.011%