INDEX
Explanations
the abbreviation "Vo" followed by a number
Negative Logits
dress          -0.70
foundations    -0.64
baugh          -0.63
showers        -0.63
iPads          -0.61
pillar         -0.61
taker          -0.60
Scientist      -0.60
Reloaded       -0.58
Journals       -0.58
Positive Logits
VO             1.03
iced           0.97
zzo            0.96
irus           0.96
vernment       0.95
IP             0.94
Vo             0.94
icer           0.92
ile            0.92
ascular        0.90
Activations Density: 0.014%