Explanations
No Explanations Found
Negative Logits
rete         -0.89
llo          -0.82
onde         -0.78
Pamela       -0.71
Venus        -0.71
Ramsay       -0.70
Minerva      -0.69
Lily         -0.67
ueless       -0.67
Machina      -0.67
Positive Logits
checkpoints  0.72
intel        0.70
competitive  0.70
choke        0.69
peanuts      0.69
gotten       0.66
slack        0.66
sting        0.65
dissent      0.63
pinch        0.62
Activations Density 0.000%
This feature has no known activations.