INDEX
Explanations
phrases indicating processes or methods of obtaining results
obtained by [action]
Negative Logits
Token              Logit
transfieras        -0.46
ennemi             -0.42
niega              -0.40
(no token shown)   -0.39
fromnode           -0.38
GEBURTS            -0.37
zieken             -0.37
новниш             -0.37
Civilian           -0.37
curities           -0.36
Positive Logits
Token              Logit
Cyfarwyddwr        0.52
isMethod           0.51
process            0.48
use                0.47
҉                  0.47
transform          0.47
direct             0.46
repeated           0.45
systematically     0.45
repeat             0.45
Activations Density: 0.063%
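A density figure like the one above is usually read as the fraction of token positions on which the feature fires. The sketch below shows one way such a number could be computed. It assumes that "fires" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a feature counts as active on a token when its activation
    is strictly greater than `threshold` (0.0 by default).
    """
    if len(activations) == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 63 of them active.
sample = [0.0] * 99_937 + [1.5] * 63
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is silent on almost every token, which is consistent with a feature that responds only to a narrow family of phrases.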