Explanations
re + [verb ending in -ing or -ed]
Negative Logits
for              0.54
isang            0.49
és               0.48
uri              0.46
esh              0.45
스               0.44
𝗿                0.44
so               0.44
ine              0.43
l                0.43

Positive Logits
(no token shown) 0.48
นำ               0.43
gunakan          0.42
całość           0.39
nCurr            0.39
awarkan          0.39
overtake         0.38
ন                0.38
送到             0.37
kasutada         0.37
Activations Density 1.117%
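
The page does not define the density figure. Assuming it denotes the fraction of sampled token positions on which the feature's activation is above zero, a minimal sketch of that calculation could look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of values in `activations` strictly above `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 token positions, 11 of which are active.
sample = [0.0] * 989 + [0.7] * 11
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 1.117% would mean the feature is active on roughly 11 of every 1,000 sampled tokens.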