Explanations
phrases related to actions taken or experiences had
phrases indicating expectations or past experiences
Negative Logits
States      -0.70
Britain     -0.69
nick        -0.66
behind      -0.64
Julius      -0.60
Present     -0.60
Eastern     -0.59
Roman       -0.58
Picture     -0.57
Russ        -0.57
Positive Logits
ãĥĨ         0.72
wcsstore    0.69
rontal      0.69
°           0.67
Been        0.65
ãĤ¦ãĤ¹      0.65
%:          0.61
CLASSIFIED  0.61
ansson      0.59
plenty      0.59
Activations Density 0.097%
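
Two of the positive-logit tokens (ãĥĨ and ãĤ¦ãĤ¹) look like the printable stand-ins that byte-level BPE vocabularies use for raw UTF-8 bytes. The sketch below assumes the tokens come from a GPT-2-style vocabulary, which this page does not state, and reverses that byte-to-character mapping to recover readable text. The function names are hypothetical and chosen for illustration only.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is assigned a code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to the text it encodes (assumed UTF-8)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens may be fragments of a multi-byte character, so do not fail hard.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĥĨ"))      # テ
    print(decode_token("ãĤ¦ãĤ¹"))  # ウス
```

Under that assumption, the two tokens decode to the katakana テ and ウス. Tokens made only of ordinary ASCII letters, such as plenty, decode to themselves.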