INDEX
Explanations
past tense verbs
the word "was" and its variants indicating past actions or states
Negative Logits
Seym        -0.71
arta        -0.66
stood       -0.62
buckets     -0.57
holders     -0.56
quotas      -0.53
spheres     -0.53
BJ          -0.52
marker      -0.52
Grail       -0.52
Positive Logits
wolves      0.99
wolf        0.95
hes         0.82
actionDate  0.75
ps          0.72
osponsors   0.72
20439       0.68
Flavoring   0.64
released    0.64
othing      0.63
Activations Density 0.072%