INDEX
Explanations
instances of the word "alike" and its variations
Negative Logits
phere    -0.19
flare    -0.18
UIBar    -0.17
uito     -0.16
UIT      -0.15
ůl       -0.15
bles     -0.15
jem      -0.15
urge     -0.14
mite     -0.14
Positive Logits
alike    0.16
ewan     0.15
kowski   0.15
iid      0.14
isans    0.14
ingo     0.14
Chi      0.13
elden    0.13
aje      0.13
zeitig   0.13
Activations Density: 0.010%