INDEX
Explanations
references to locations or states of being in various contexts
Negative Logits
Token        Logit
hung         -0.49
[*]          -0.45
adalajara    -0.45
Kunt         -0.44
/*           -0.44
memoized     -0.43
uebe         -0.43
tip          -0.42
ggere        -0.42
overe        -0.42
Positive Logits
Token        Logit
sich         1.44
się          1.24
zich         1.16
themselves   1.05
himself      1.04
yourself     0.99
yourselves   0.98
itself       0.98
herself      0.97
themselves   0.96
Activations Density: 0.031%