INDEX
Explanations
city, visa, meal, trails, friend, day
Negative Logits
fizemos  0.46
float    0.45
tomto    0.44
killed   0.44
makes    0.43
dedo     0.43
AMPL     0.43
voted    0.43
ezzel    0.42
deplete  0.42
Positive Logits
lø     0.53
F      0.48
trab   0.48
im     0.48
임     0.47
इज     0.46
நே     0.45
イ     0.45
iczne  0.45
就业   0.44
Activations Density 0.004%