INDEX
Explanations
sentences that express opinions or evaluations of ideas or creative works
after nouns
concepts and positions
Negative Logits
(token not rendered)    -0.69
تضيفلها    -0.67
ATTN    -0.59
endpush    -0.59
ThroughAttribute    -0.57
selaku    -0.56
érité    -0.55
onOptions    -0.55
NoSuch    -0.54
gnon    -0.52
Positive Logits
indeed    0.88
indeed    0.72
overall    0.67
nonetheless    0.64
Indeed    0.62
Overall    0.59
considering    0.59
Indeed    0.59
deserving    0.58
deserves    0.58
Activations Density 0.240%