INDEX
Explanations
phrases indicating cooperation or collaboration
Negative Logits
Token    Logit
ista     -0.71
umbo     -0.69
ble      -0.67
geons    -0.64
geon     -0.64
hyde     -0.63
unc      -0.63
ortal    -0.62
agnar    -0.62
ysis     -0.61
Positive Logits
Token      Logit
ness       0.85
NESS       0.74
ãĤ¤ãĥĪ     0.72
nesses     0.71
together   0.66
isphere    0.66
arity      0.65
Kurd       0.63
Together   0.63
footed     0.63
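The positive-logit entry ãĤ¤ãĥĪ looks like a token shown in its raw byte-level BPE form rather than as readable text. The source does not name the model or tokenizer, so the following is only a sketch under the assumption that the vocabulary uses the GPT-2 style byte-to-unicode mapping; the helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table (assumed mapping)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Invert the table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary token back into the text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ¤ãĥĪ"))  # イト
```

Under that assumption the entry decodes to the katakana pair イト. Tokens made only of printable ASCII, such as ness or together, pass through unchanged.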
Activations Density 0.027%
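The source does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below uses made-up activations purely to show the arithmetic; `activation_density` and the sample sizes are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 100,000 tokens, with the feature firing on 27 of them.
acts = [0.0] * 100_000
for i in range(27):
    acts[i * 1000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # 0.027%
```

On this reading, a density of 0.027% means the feature is active on roughly 27 out of every 100,000 tokens.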