INDEX
Explanations
phrases indicating exclusivity or limitation
phrases indicating exclusivity or restriction
Negative Logits
insula      -0.64
idon        -0.64
Malf        -0.62
arted       -0.61
cent        -0.61
Nev         -0.59
Massive     -0.59
put         -0.57
--------------------------------------------------------  -0.57
charism     -0.57
Positive Logits
marginally    0.98
ICES          0.81
ices          0.80
kidding       0.78
incidentally  0.77
spor          0.72
isons         0.69
briefly       0.67
mildly        0.67
subset        0.66
Activations Density: 0.056%