Explanations
sequential actions after 'and'
Negative Logits
好き    0.42
የሚ    0.41
کریں۔    0.41
accessing    0.39
смотрите    0.39
жете    0.38
іб    0.38
रखने    0.38
entail    0.38
anschauen    0.37
POSITIVE LOGITS
whose    0.64
swore    0.59
prepares    0.56
manages    0.55
engages    0.55
employs    0.54
spends    0.54
expects    0.53
had    0.53
although    0.52
Activations Density 0.051%
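
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and not taken from the page:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means how often the feature fires at all,
    not how strongly it fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Made-up example: 10,000 tokens, of which 5 activate the feature.
toy_activations = [0.0] * 9995 + [1.3, 0.7, 2.1, 0.2, 0.9]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.051% would mean the feature fires on roughly 5 of every 10,000 tokens.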