INDEX
Explanations
terms related to physical body parts
Negative Logits
Token       Logit
mask        -0.62
ween        -0.59
Niet        -0.59
flush       -0.59
masks       -0.59
elig        -0.58
Homo        -0.58
fer         -0.57
awake       -0.57
Woodward    -0.56
Positive Logits
Token       Logit
ageddon     1.43
aceutical   1.37
ovie        1.11
ament       1.04
ichael      1.03
essage      1.02
ony         1.02
achine      0.98
strong      0.92
onica       0.92
Activations Density 0.014%
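
The density figure above is usually read as the share of sampled tokens on which the feature has a nonzero activation. The sketch below shows that calculation under that assumption; the function name `activation_density` and the sample counts are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float]) -> float:
    """Return the percentage of tokens with a strictly positive activation.

    Assumption: "density" means (active tokens / all sampled tokens) * 100.
    The dashboard does not state its exact definition or sample size.
    """
    total = len(activations)
    if total == 0:
        raise ValueError("need at least one activation value")
    # Count tokens where the feature fires at all.
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / total


# Hypothetical sample: 14 active tokens out of 100,000.
sample = [1.0] * 14 + [0.0] * (100_000 - 14)
print(f"{activation_density(sample):.3f}%")
```

A density this low means the feature is sparse: it fires on only a small fraction of tokens, so the top-activating examples matter more for interpreting it than average behavior.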