INDEX
Explanations
phrases that indicate sequential or subsequent actions or events
Negative Logits
Yag          -0.81
Eisenberg    -0.75
Veld         -0.74
McCarty      -0.72
Ramesh       -0.71
Haram        -0.70
Kos          -0.70
Pemberton    -0.69
oldValue     -0.68
חיצוניים     -0.68
Positive Logits
following    1.87
following    1.83
follow       1.80
follow       1.79
Following    1.79
Following    1.76
Follows      1.73
Follow       1.65
FOLLOW       1.64
follows      1.63
Activations Density 0.111%