Explanations
No Explanations Found
Negative Logits
Bentley   -0.75
buckle    -0.71
finite    -0.70
auto      -0.64
clot      -0.63
grasp     -0.62
const     -0.61
grad      -0.61
ctor      -0.60
integer   -0.60
Positive Logits
edom      0.87
Gork      0.77
ylum      0.77
lins      0.77
irth      0.75
iment     0.70
alsa      0.69
inical    0.68
terness   0.67
risome    0.66
Activations Density: 0.000%
No Known Activations
This feature has no known activations.