Explanations
transitions or changes in status or state
Negative Logits
nable       -0.15
itis        -0.15
rench       -0.15
izable      -0.14
umd         -0.14
åī©         -0.14
toujours    -0.14
sebou       -0.14
ẹp          -0.14
oppins      -0.14
Positive Logits
part              0.24
increasingly      0.22
aware             0.22
acquainted        0.19
known             0.18
friends           0.18
extinct           0.17
lodged            0.17
involved          0.17
FirstResponder    0.17
Activations Density 0.063%