INDEX
Explanations
phrases indicating reasons or explanations
references to sections or parts of a larger text or context
Negative Logits
mins            -0.68
marine          -0.65
slightest       -0.61
cannabin        -0.60
clipboard       -0.59
Express         -0.58
lean            -0.58
Commissioners   -0.57
NEXT            -0.56
Express         -0.56
Positive Logits
thanks          0.85
because         0.80
due             0.76
owing           0.70
because         0.70
atos            0.70
laughter        0.67
aughter         0.65
attributable    0.63
icularly        0.63
Activations Density: 0.041%