Explanations
No Explanations Found
Negative Logits
idden     -0.87
tnc       -0.85
proxy     -0.80
rolled    -0.76
role      -0.76
ertodd    -0.75
STATE     -0.74
voice     -0.73
UI        -0.73
uum       -0.72
Positive Logits
charism   0.82
pour      0.72
rake      0.71
Pole      0.70
Rabbi     0.69
recapt    0.69
Yon       0.68
mort      0.68
Pwr       0.67
Loft      0.65
Activations Density: 0.000%
No Known Activations
This feature has no known activations.