INDEX
Explanations
phrases indicating a comparison or contrast in a situation
the phrase "no matter how" followed by varying conditions or qualities
Negative Logits
uterte    -0.79
ridor     -0.75
anca      -0.75
OL        -0.72
TABLE     -0.71
Transfer  -0.70
emale     -0.69
aters     -0.69
OW        -0.69
ATES      -0.68
Positive Logits
messed    0.75
fanc      0.75
faults    0.71
exalted   0.70
fancy     0.69
crappy    0.66
slight    0.66
coerc     0.66
preached  0.65
valiant   0.65
Activations Density 0.057%