INDEX
Explanations
future tense verbs
phrases indicating future occurrences or predictions
Negative Logits
buster        -0.67
cius          -0.60
strate        -0.59
pose          -0.59
elector       -0.56
Psychiatry    -0.55
roit          -0.55
maze          -0.55
alloc         -0.55
zee           -0.54
Positive Logits
plenty        0.99
lots          0.80
exceptions    0.77
some          0.76
女            0.75
ample         0.74
SOME          0.69
parallels     0.69
estern        0.68
alot          0.68
Activations Density: 0.051%