Explanations
mentions of a significant distance or extent
phrases indicating a sense of ongoing process or status
Negative Logits
Token         Logit
cca           -0.67
vec           -0.63
murd          -0.60
driving       -0.59
NetMessage    -0.58
uras          -0.58
rium          -0.55
ums           -0.55
bec           -0.55
pron          -0.55
Positive Logits
Token         Logit
nobody        0.89
none          0.88
hasn          0.82
unsuccessful  0.82
haven         0.74
there         0.74
nothing       0.74
indications   0.73
neither       0.70
only          0.69
Activations Density: 0.043%
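
The density figure can be read as the share of tokens on which the feature fires at all. The sketch below illustrates that reading only: the function name `activation_density`, the zero threshold, and the toy data are assumptions made for illustration, not the dashboard's actual implementation or data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means (active tokens) / (all tokens sampled).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy sample: 10,000 tokens, 4 of which activate the feature.
toy = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.7]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.043% would mean the feature is active on roughly 4 of every 10,000 tokens.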