Explanations
phrases indicating a requirement or guarantee
Negative Logits
Notae     -0.63
fVar      -0.62
Brat      -0.61
wę        -0.60
talk      -0.60
Roberta   -0.58
tas       -0.58
Lived     -0.58
tdb       -0.57
fritas    -0.57
Positive Logits
ensure    2.77
ensures   2.60
ensured   2.58
ensuring  2.58
Ensure    2.53
Ensure    2.45
Ensuring  2.25
ensure    2.23
确保      1.75
insure    1.75
Activations Density 0.059%