Explanations
references to a specific word, "Ace"
mentions of specific individuals, particularly those named "Ace" and "Lance"
Negative Logits
urally    -0.93
merce     -0.83
nih       -0.82
ifact     -0.81
atem      -0.80
undai     -0.79
ained     -0.78
asty      -0.77
icism     -0.76
pmwiki    -0.76
Positive Logits
vich      0.94
llan      0.86
vine      0.79
holder    0.79
lot       0.78
Daniels   0.77
xit       0.76
fast      0.73
lla       0.72
hold      0.71
Activations Density 0.073%