INDEX
Explanations
instances of the pronoun "we"
Negative Logits
quia      -0.16
mia       -0.15
italia    -0.15
_IA       -0.15
íĥĿ       -0.15
оба       -0.15
657       -0.15
ieten     -0.15
lia       -0.14
enties    -0.14
Positive Logits
learn      0.27
meet       0.25
flash      0.24
fast       0.23
learns     0.23
learned    0.20
Meet       0.20
learn      0.19
find       0.19
Learn      0.19
Activations Density 0.054%