Explanations
No Explanations Found
Negative Logits
purposely     0.37
TRY           0.35
properly      0.34
blatantly     0.33
actually      0.32
beginning     0.32
başlam        0.32
뿅            0.31
ärke          0.31
居然          0.31

Positive Logits
regard        0.36
down          0.35
foreground    0.31
Technik       0.31
consider      0.30
consider      0.30
deem          0.30
fores         0.30
Foreground    0.30
subordinate   0.29
Activations Density 0.000%
No Known Activations
This feature has no known activations.