INDEX
Explanations
words related to the process of purification or treatment
words related to the concept of 'innocence' or 'purity.'
Negative Logits
steps       -0.60
changes     -0.57
grand       -0.56
Wo          -0.56
changing    -0.55
clo         -0.54
Ferguson    -0.54
boot        -0.54
erguson     -0.53
Mog         -0.53
Positive Logits
inated      4.32
inate       2.76
inating     2.73
inates      2.68
ination     2.55
inators     1.86
inations    1.85
inately     1.77
inator      1.73
inatory     1.56
Activations Density 0.009%