Explanations
the concept of attempting or making an effort
Negative Logits
rip     -0.17
acre    -0.17
pire    -0.16
stru    -0.16
meal    -0.15
trip    -0.15
/off    -0.15
arity   -0.15
ikit    -0.15
wig     -0.14
Positive Logits
rowspan  0.18
nghiệm   0.18
態       0.16
outs     0.16
andex    0.15
ền       0.15
ogue     0.15
oris     0.14
ssel     0.14
ew       0.14
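Byte-level BPE vocabularies often display multi-byte characters as strings such as nghiá»ĩm, because each raw byte is shown as a stand-in printable character. The sketch below recovers the readable text, assuming the tokenizer uses the GPT-2-style byte-to-character table (an assumption; the page does not name the tokenizer). The names bytes_to_unicode, UNICODE_TO_BYTE, and decode_token are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild a GPT-2-style table mapping each byte value to a printable character."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # '¡' .. '¬'
        + list(range(0xAE, 0x100))  # '®' .. 'ÿ'
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is assigned the next code point starting at 256.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse lookup: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Convert a displayed byte-level token back to readable text."""
    raw = b"".join(
        # Characters outside the table (assumed to be literal text) pass through as UTF-8.
        bytes([UNICODE_TO_BYTE[ch]]) if ch in UNICODE_TO_BYTE else ch.encode("utf-8")
        for ch in token
    )
    # A token may hold only part of a multi-byte character, so replace bad bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["nghiá»ĩm", "æħĭ", "á»ģn", "rowspan"]:
        print(repr(token), "->", repr(decode_token(token)))
```

Plain ASCII tokens pass through unchanged, so the same function can be applied to every entry in a logit list. If the tokenizer uses a different byte mapping, the table would need to be rebuilt accordingly.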
Activations Density: 0.058%