Explanations
No Explanations Found
Negative Logits
umbn        -0.81
fortun      -0.73
Gale        -0.69
Gutenberg   -0.66
obscurity   -0.65
nomine      -0.64
Philipp     -0.64
funer       -0.63
Playboy     -0.63
Griffith    -0.63
Positive Logits
Train       0.86
ÃŁ          0.84
faced       0.76
ski         0.75
per         0.74
rate        0.74
Rate        0.73
orbit       0.73
hyp         0.72
rocket      0.72
Activations Density: 0.000%
No Known Activations
This feature has no known activations.