INDEX
Explanations
phrases related to people's behaviors and activities
Negative Logits
sticks    -0.79
ynes      -0.75
stick     -0.69
fighters  -0.66
ilver     -0.65
Peaks     -0.64
roma      -0.63
hemat     -0.62
aic       -0.61
sworth    -0.61
Positive Logits
inyl          0.91
obtaining     0.83
educating     0.82
collecting    0.81
unlocking     0.81
overcoming    0.80
constructing  0.79
reforming     0.79
achieving     0.78
organising    0.78
Activations Density 0.023%