INDEX
Explanations
describing states or qualities
NEGATIVE LOGITS
↵          0.18
)          0.17
នូវ        0.17
belonging  0.16
of         0.16
along      0.15
'          0.15
skulle     0.15
Esses      0.15
arbeid     0.15
POSITIVE LOGITS
notoriously    0.25
где            0.22
famously       0.22
HUGE           0.21
unfortunately  0.20
حيث            0.20
notorious      0.20
actually       0.20
admittedly     0.20
এখন            0.19
Activations Density 0.178%