INDEX
Explanations
comparisons or contrasts between different entities or situations
NEGATIVE LOGITS
yst       -0.15
γα        -0.14
hai       -0.13
849       -0.13
AYS       -0.13
ron       -0.13
ãĥ³ãĥĩ    -0.13
810       -0.13
ury       -0.13
due       -0.13
POSITIVE LOGITS
what      0.20
those     0.19
other     0.18
others    0.17
other     0.16
enthal    0.16
idebar    0.16
'autres   0.15
ones      0.15
liga      0.15
Activations Density: 0.062%