INDEX
Explanations
positive and negative attributes or emotions
Negative Logits
Jere         -0.64
ares         -0.64
Phones       -0.62
OWS          -0.62
possessions  -0.61
uts          -0.59
would        -0.59
ints         -0.59
ghazi        -0.58
icles        -0.57
Positive Logits
overlap      0.84
waiting      0.84
difference   0.84
lurking      0.84
underway     0.83
unanim       0.82
shortage     0.81
inherent     0.79
discrepancy  0.79
mismatch     0.77
Activations Density: 0.303%