Explanations
adjectives related to emotions and opinions
Negative Logits
arta                                 -0.81
ologies                              -0.76
VIDEOS                               -0.74
=-=-=-=-=-=-=-=-                     -0.74
mare                                 -0.70
Dialogue                             -0.69
newsletters                          -0.68
Nation                               -0.68
sucks                                -0.66
=================================    -0.65
Positive Logits
unsuccessful    1.06
instrumental    1.01
initially       0.96
originally      0.95
conceived       0.94
successful      0.93
able            0.84
born            0.83
intended        0.82
unexpected      0.81
Activations Density: 0.371%