INDEX
Explanations
places or instances where the word "note" is used
phrases indicating conditionality or the presence of specific criteria
Negative Logits
endez    -0.53
qt       -0.48
anger    -0.47
Els      -0.46
unal     -0.45
anamo    -0.45
pler     -0.43
venge    -0.42
emies    -0.41
opian    -0.41
Positive Logits
comprom        0.47
overriding     0.44
percentage     0.42
totality       0.42
construed      0.40
fortun         0.40
temptation     0.39
occurrences    0.39
"%             0.39
prolonged      0.39
Activations Density: 0.894%