Explanations
No Explanations Found
Negative Logits
birds       -0.82
species     -0.76
dogs        -0.75
dog         -0.75
bird        -0.71
esting      -0.69
arning      -0.68
ingly       -0.68
athering    -0.67
chest       -0.66
Positive Logits
Redux       0.70
arresting   0.64
Mumbai      0.62
ihar        0.62
aults       0.61
oho         0.60
ocide       0.60
fortun      0.60
onz         0.59
*/(         0.59
Activations Density 0.000%
No Known Activations
This feature has no known activations.