INDEX
Explanations
the word "most" and its variations
Negative Logits
�        -0.84
stice    -0.76
layer    -0.70
iseum    -0.69
wagen    -0.69
urity    -0.64
uler     -0.64
Morse    -0.64
iku      -0.63
agher    -0.63
Positive Logits
recent       0.82
likely       0.74
Recent       0.72
Wanted       0.69
probable     0.67
recently     0.66
Popular      0.66
preferably   0.66
often        0.65
prevalent    0.65
Activations Density: 0.034%