Explanations
expressions of genuine admiration or authenticity
Negative Logits
Token     Logit
rous      -0.15
sel       -0.15
Briggs    -0.15
ses       -0.15
.au       -0.15
mere      -0.14
lot       -0.14
strand    -0.14
angan     -0.14
ains      -0.14
Positive Logits
Token     Logit
/false    0.28
-blue     0.21
-life     0.21
ignment   0.19
fully     0.17
truly     0.15
born      0.15
474       0.15
-таки     0.15
ajan      0.14
Activations Density: 0.022%