INDEX
Explanations
terms related to restriction or reduction
Negative Logits
rup          -0.70
SourceFile   -0.69
Done         -0.66
grave        -0.66
bill         -0.65
ANK          -0.64
pty          -0.64
��           -0.64
kson         -0.63
ifferent     -0.62
Positive Logits
functionality   0.74
visibility      0.73
boundaries      0.70
influence       0.67
prominence      0.67
capabilities    0.66
opportunities   0.65
importance      0.65
participation   0.64
recovery        0.63
Activations Density: 0.117%