INDEX
Explanations
phrases asking about or referring to a specific method or approach
phrases indicating a particular manner or approach to situations
Negative Logits
Token      Logit
usters     -0.77
livest     -0.75
uster      -0.73
sugg       -0.68
urated     -0.67
ividual    -0.64
nect       -0.63
urate      -0.63
ynski      -0.62
cryst      -0.62
Positive Logits
Token      Logit
fare       1.19
ward       1.09
finding    1.07
WARD       0.92
bill       0.84
point      0.84
forward    0.83
finder     0.81
points     0.77
Jet        0.73
Activations Density: 0.048%