Explanations
instances of the word "admit" and its variations, highlighting themes of acknowledgment and confession
Negative Logits
olding       -0.15
ìĶ           -0.15
blo          -0.15
Niet         -0.15
olds         -0.14
pj           -0.14
.shutdown    -0.14
fu           -0.14
ież          -0.14
lei          -0.13
Positive Logits
defeat            0.30
defeats           0.18
freely            0.18
defeated          0.18
ration            0.17
ting              0.16
thumb             0.16
to                0.16
feeling           0.15
responsibility    0.15
Activations Density 0.023%
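
The density figure can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 23 of 100,000 token positions.
sample = [1.0] * 23 + [0.0] * 99_977
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density near 0.023% would mean the feature is active on roughly 2 to 3 tokens out of every 10,000, which fits a narrowly selective feature.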