INDEX
Explanations
instances of a specific pattern related to "Pr" or "Praise"
Negative Logits
Token        Logit
ipay         -0.19
baum         -0.16
elf          -0.16
885          -0.15
897          -0.15
liqu         -0.14
ationship    -0.14
blade        -0.14
endale       -0.14
ê³Ħ          -0.14
Positive Logits
Token        Logit
chal         0.18
udence       0.18
zem          0.17
%f           0.16
/pr          0.16
Pr           0.15
zy           0.15
akash        0.15
erna         0.15
ullen        0.15
Activations Density 0.022%