Explanations
No Explanations Found
Negative Logits
Inferno     -0.68
cru         -0.66
Purg        -0.64
favoured    -0.62
yarn        -0.62
urated      -0.61
Queens      -0.61
ports       -0.61
bombers     -0.61
Tanks       -0.60
Positive Logits
eanor       0.73
Thumbnail   0.73
igure       0.73
insula      0.73
entary      0.72
ansky       0.71
utive       0.69
"""         0.69
udeb        0.68
Offline     0.68
Activations Density: 0.000%
This feature has no known activations.