INDEX
Explanations
No Explanations Found
Negative Logits
Token       Logit
eker        -0.66
fertil      -0.66
atum        -0.64
package     -0.63
load        -0.62
guise       -0.61
ieves       -0.60
squeeze     -0.59
packages    -0.58
drift       -0.57
Positive Logits
Token       Logit
boxing      0.74
Bers        0.71
Saber       0.69
erb         0.68
burg        0.68
ij士         0.67
aml         0.65
icip        0.65
lehem       0.65
Browne      0.64
Activations Density: 0.000%
No Known Activations
This feature has no known activations.