INDEX
Explanations
take ownership or advantage
Negative Logits
Stop          0.74
aryng         0.74
STOP          0.73
oglob         0.73
нё            0.69
記載          0.69
způso         0.69
dail          0.69
tweet         0.69
Calend        0.68
Positive Logits
advantage     2.46
advantage     1.81
care          1.68
ventaja       1.60
avantage      1.53
pride         1.48
Advantage     1.45
advantages    1.43
precautions   1.42
liberties     1.41
Activations Density: 0.063%