Explanations
No Explanations Found
NEGATIVE LOGITS
ACTIONS      -0.78
venge        -0.73
idepress     -0.66
lishes       -0.64
4090         -0.64
consensual   -0.63
ãĤ           -0.61
derivative   -0.60
spec         -0.59
ç¥ŀ          -0.58
POSITIVE LOGITS
burgh    0.75
Bulgar   0.71
ancers   0.65
halla    0.64
Antiqu   0.64
EH       0.64
Tribe    0.64
Undead   0.63
Balk     0.63
Shape    0.63
Activations Density 0.000%
This feature has no known activations.