INDEX
Explanations
phrases related to completeness or entirety
instances of the word "whole."
Negative Logits
nings      -0.72
liest      -0.68
intent     -0.68
rf         -0.64
inputs     -0.63
filings    -0.62
bids       -0.61
akers      -0.61
anwhile    -0.60
claimed    -0.60
Positive Logits
heartedly  1.14
bunch      0.90
slew       0.88
lot        0.88
hearted    0.86
swath      0.83
dozen      0.82
raft       0.78
thing      0.76
dozen      0.72
Activations Density 0.014%
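
To make the density figure concrete, here is a minimal sketch that assumes "activations density" means the fraction of token positions on which the feature fires (activation above zero). The source does not define the term, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density counts positions with activation > threshold;
    the page itself does not define the metric.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 token positions, 14 of which activate.
acts = [0.0] * 100_000
for i in range(14):
    acts[i * 7_000] = 1.5  # arbitrary nonzero activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.014%
```

Under this reading, a density of 0.014% corresponds to roughly 14 activating tokens per 100,000.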