Explanations
prepositions indicating direction or action
prepositions and phrases indicating relationships or directions
Negative Logits
olulu        -0.88
daq          -0.78
hirt         -0.73
fab          -0.70
ashtra       -0.68
illac        -0.67
cyclopedia   -0.67
alys         -0.67
nick         -0.66
robe         -0.66
Positive Logits
tnc          0.72
PW           0.72
LW           0.71
onset        0.71
effic        0.71
âī¤          0.64
GW           0.64
nonex        0.63
efficiency   0.63
lieu         0.62
Activations Density: 0.476%