INDEX
Explanations
expressions related to intention or intent
Negative Logits
cours    -0.47
inti     -0.45
дри      -0.45
Pinch    -0.44
outs     -0.44
CCS      -0.43
cocks    -0.43
hydro    -0.43
APT      -0.43
ACA      -0.42
Positive Logits
intention          0.71
intención          0.64
intenção           0.60
Intent             0.58
intenciones        0.57
Intended           0.56
transQ             0.56
Roskov             0.56
RectangleBorder    0.56
intent             0.55
Activations Density: 0.322%