Explanations
phrases related to events or actions coming to an end
phrases indicating closure or endings
Negative Logits
nda              -0.65
candid           -0.64
express          -0.60
prizes           -0.60
anges            -0.59
labels           -0.58
leading          -0.58
knowledgeable    -0.57
Regions          -0.57
reason           -0.57
Positive Logits
lately           0.76
bos              0.73
amid             0.73
halfway          0.70
ç                0.68
Ñĭ               0.68
ãĥ¼ãĥĨ           0.66
IER              0.66
ÃĽ               0.65
midway           0.63
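
Several of the positive-logit tokens (`ç`, `Ñĭ`, `ãĥ¼ãĥĨ`, `ÃĽ`) look like raw vocabulary strings from a GPT-2 style byte-level BPE tokenizer, in which every byte is displayed as a stand-in printable character. That is an assumption; the page does not name the tokenizer. Under that assumption, the sketch below rebuilds the byte-to-character table and maps such strings back to text. The helper names `bytes_to_unicode` and `decode_token` are illustrative and are not taken from this page.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table of GPT-2 style byte-level BPE.

    Assumption: the tokens on this page were produced with this table.
    """
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned the next free code point from 256 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token back to text; incomplete UTF-8 becomes U+FFFD."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["Ñĭ", "ãĥ¼ãĥĨ", "ÃĽ", "ç"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, `Ñĭ` decodes to the Cyrillic letter "ы", `ãĥ¼ãĥĨ` to the katakana pair "ーテ", and `ÃĽ` to "Û". The lone `ç` is a single lead byte of a multi-byte sequence, so it decodes to the replacement character rather than to a complete letter.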
Activations Density: 0.198%