Explanations
No Explanations Found
Negative Logits
abal         -0.72
borrow       -0.70
apse         -0.70
fractures    -0.70
bleed        -0.68
ipel         -0.68
erella       -0.67
afa          -0.66
flares       -0.65
robat        -0.64
Positive Logits
LEY           0.73
intelligence  0.66
tips          0.63
HUM           0.63
STON          0.63
advertising   0.62
Networks      0.62
INESS         0.61
spoken        0.61
stead         0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.