Explanations
references to God's influence, guidance, and presence in people's lives
Negative Logits
TestBed            -0.47
zol                -0.45
ivant              -0.43
<eos>              -0.42
<strong>           -0.41
Кор                -0.41
Io                 -0.40
tee                -0.40
(no token shown)   -0.40
mal                -0.40
Positive Logits
SBATCH     0.92
Diſ        0.84
ſeveral    0.84
itſelf     0.84
>=",       0.83
BASEPATH   0.78
greateſt   0.76
occaf      0.75
Majefty    0.75
ſmall      0.73
Activations Density: 0.183%