INDEX
Explanations
the occurrences of the word "Santa"
Negative Logits
myſelf        -0.90
pleaſure      -0.84
Monfieur      -0.80
poffible      -0.79
himſelf       -0.78
ſeveral       -0.78
Jefus         -0.77
themſelves    -0.76
ſever         -0.75
་་            -0.74
Positive Logits
its           0.69
spatial       0.68
LookAnd       0.63
spot          0.62
Spot          0.62
St            0.61
the           0.58
Santa         0.58
Spot          0.58
LikeLike      0.57
Activations Density: 0.111%