INDEX
Explanations
sentences that indicate conclusions or results
Negative Logits
listi      -0.66
NICK       -0.65
ack        -0.61
Drill      -0.61
nick       -0.60
uck        -0.60
masking    -0.60
Kuli       -0.60
Robin      -0.60
Duck       -0.60
Positive Logits
Therefore   1.55
Therefore   1.52
therefore   1.26
Поэтому     1.21
Portanto    1.20
therefore   1.18
Поэтому     1.12
derfor      1.06
Derfor      1.02
Daarom      1.02
Activations Density: 0.022%