INDEX
Explanations
recommendations or instructions in a document
phrases indicating ideal scenarios or recommendations
Negative Logits
addictive    -0.71
itous        -0.70
AIDS         -0.70
dare         -0.66
injustice    -0.66
menace       -0.65
Traff        -0.65
abuse        -0.63
threatens    -0.62
wonders      -0.61
Positive Logits
preferred     0.84
preferring    0.80
Preferred     0.78
FontSize      0.73
uci           0.72
prefer        0.72
ideally       0.72
medium        0.72
preferably    0.70
ittal         0.70
Activations Density 0.676%