Explanations
No Explanations Found
Negative Logits
Token        Logit
unemploy     -0.80
ilitary      -0.76
hire         -0.75
holding      -0.73
enegger      -0.73
reinstated   -0.71
hired        -0.69
unemployed   -0.68
ally         -0.65
uner         -0.64
Positive Logits
Token        Logit
ICLE         0.86
Brush        0.78
BUG          0.77
ipeg         0.77
minster      0.71
ĺħ           0.67
Thumbnails   0.67
IJ           0.66
ËĪ           0.65
£ı           0.64
Activations Density: 0.000%
No Known Activations
This feature has no known activations.