Explanations
No Explanations Found
Negative Logits
Token        Logit
don          -0.76
çİĭ          -0.76
prototype    -0.73
Construct    -0.72
bending      -0.71
CONCLUS      -0.69
tests        -0.68
princip      -0.67
Amazing      -0.67
scrib        -0.65
Positive Logits
Token        Logit
Rouge        0.73
FN           0.70
IPM          0.68
Exit         0.67
Bruins       0.65
LW           0.64
CNS          0.62
LH           0.62
Lans         0.61
Polo         0.61
Activations Density: 0.000%
No Known Activations
This feature has no known activations.