INDEX
Explanations
No Explanations Found
Negative Logits
Passenger   -0.70
Clicker     -0.67
Pref        -0.65
Unique      -0.64
Resp        -0.64
Surve       -0.63
Keen        -0.63
Secondary   -0.62
Ada         -0.62
Forever     -0.62
Positive Logits
jug         0.90
oshenko     0.88
iaz         0.88
boxing      0.80
zel         0.76
rys         0.76
rosis       0.75
oval        0.73
oday        0.72
iatrics     0.72
Activations Density 0.000%
No Known Activations
This feature has no known activations.