INDEX
Explanations
governmental and regulatory actions
NEGATIVE LOGITS
Avoiding  0.43
Enjoy  0.39
ത്  0.39
প্রস্তাবে  0.38
வண  0.37
कहला  0.36
𝑜  0.35
符合  0.35
来自  0.35
escaping  0.35
POSITIVE LOGITS
deem  0.68
deems  0.67
intervened  0.63
recognizes  0.61
approve  0.58
intervene  0.57
frown  0.57
sanction  0.56
issued  0.55
assign  0.55
Activations Density 0.025%