Explanations
first, important, note, before
Negative Logits
Token            Value
opposit          0.48
niezb            0.44
negated          0.41
respectivement   0.39
少なくとも       0.39
exces            0.38
あとは           0.38
ৃতা              0.37
ተመሳሳይ            0.37
alternatively    0.36
Positive Logits
Token            Value
前提             0.81
首先             0.78
NOTE             0.76
ก่อน             0.76
Abbreviations    0.75
Before           0.74
caveat           0.74
caveats          0.74
NOTE             0.73
Disclaimer       0.73
Activations Density: 0.024%