INDEX
Explanations
phrases indicating a departure or end of something
directional or positional terms related to movement and location
Negative Logits
Token       Logit
Vaugh       -0.67
Palestin    -0.66
athan       -0.59
Kahn        -0.57
Naz         -0.56
Nev         -0.55
Nare        -0.54
Gilbert     -0.53
Azerb       -0.51
Holl        -0.51
Positive Logits
Token       Logit
][          0.83
lishes      0.74
+=          0.71
tics        0.69
:=          0.68
redients    0.67
Joined      0.66
lined       0.64
malink      0.64
cro         0.64
Activations Density 0.080%