Explanations
discussions about knowledge, experience, and collaboration among individuals
Negative Logits
Token          Logit
ラック         -0.17
aven           -0.15
iker           -0.14
chamber        -0.14
off            -0.14
proportional   -0.14
emoc           -0.14
Hit            -0.14
atcher         -0.14
agree          -0.14
Positive Logits
Token          Logit
superior       0.23
Superior       0.19
ahead          0.19
whereas        0.19
envy           0.18
superiority    0.17
Ahead          0.17
ahead          0.16
Whereas        0.16
-ahead         0.15
Activations Density 0.208%
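
The density figure above can be read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading (activation strictly greater than zero counts as "active"); the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 500.
sample = [0.0] * 499 + [1.7]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.208% would mean the feature is active on roughly 2 of every 1,000 sampled tokens.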