INDEX
Explanations
prepositions indicating a source or origin
phrases indicating demands or expectations
Negative Logits
idi          -0.77
itialized    -0.72
acet         -0.71
irm          -0.71
flix         -0.71
motion       -0.71
je           -0.71
iken         -0.69
unes         -0.69
puter        -0.68
Positive Logits
afar         1.39
abroad       0.91
whence       0.86
inside       0.85
us           0.85
anywhere     0.85
anyone       0.83
him          0.82
anybody      0.80
within       0.77
Activations Density 0.105%