INDEX
Explanations
references to religious figures and miraculous events
Negative Logits
Token      Logit
866        -0.15
ุษ         -0.14
solitary   -0.13
K          -0.13
不安       -0.13
.Euler     -0.13
.backup    -0.13
obre       -0.13
emento     -0.13
ardin      -0.13
Positive Logits
Token       Logit
miracle     0.63
miracles    0.61
mir         0.55
Miracle     0.53
Mir         0.50
miraculous  0.49
Mir         0.49
wonders     0.45
mirac       0.42
wonder      0.41
Activations Density: 0.201%