Explanations
instances of the word "collaborate" and its variations
Negative Logits
ÃŃch      -0.18
eba       -0.17
umas      -0.16
resas     -0.15
isbury    -0.15
nila      -0.14
rus       -0.14
ldre      -0.14
Marino    -0.14
unken     -0.14
Positive Logits
ative     0.36
atively   0.28
ators     0.25
atory     0.25
ator      0.23
ativ      0.23
ATIVE     0.22
er        0.21
ate       0.21
abor      0.21
Activations Density: 0.006%