INDEX
Explanations
verbs indicating a change of understanding or realization
expressions of realization or understanding
Negative Logits
Token         Logit
nick          -0.69
behind        -0.67
States        -0.67
front         -0.64
Julius        -0.63
adr           -0.63
Palest        -0.63
Newsletter    -0.63
Britain       -0.62
root          -0.62
Positive Logits
Token         Logit
Been          0.85
ウス          0.68
°             0.66
plenty        0.65
テ            0.64
gone          0.64
Gone          0.63
Tro           0.62
Plenty        0.60
cedented      0.60
Activations Density 0.102%