Explanations
verbs related to causing influence or change
phrases involving the concept of bringing or resulting in something
Negative Logits
Token      Logit
uary       -0.73
livious    -0.71
dating     -0.69
schild     -0.68
debian     -0.67
taker      -0.66
imon       -0.66
raid       -0.64
unker      -0.63
rant       -0.61
Positive Logits
Token      Logit
forth      1.25
together   0.92
attention  0.85
forward    0.83
smiles     0.77
up         0.75
hurst      0.72
down       0.70
awareness  0.70
unity      0.70
Activations Density 0.040%
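
The page does not define the density figure. A common reading is the share of token positions on which the feature fires at all, and the sketch below works under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only to illustrate the arithmetic; they are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions on which
    the feature is active; the source page does not define the term.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in values if a > threshold)
    return fired / len(values)


# Hypothetical sample: 4 active positions out of 10,000 tokens.
sample = [0.0] * 9996 + [1.3, 0.2, 2.7, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.040% corresponds to roughly 4 active positions per 10,000 tokens, which marks the feature as a sparse one.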