INDEX
Explanations
past participle verbs
phrases indicating past actions or states of being
Negative Logits
izable        -0.69
now           -0.63
IVE           -0.63
Must          -0.63
buquerque     -0.61
now           -0.61
Not           -0.61
ierrez        -0.60
Not           -0.60
today         -0.60
Positive Logits
misled        1.00
tricked       0.97
WARN          0.96
briefed       0.95
spared        0.92
warned        0.92
robbed        0.91
deceived      0.91
hijacked      0.89
expelled      0.89
Activations Density 0.102%