INDEX
Explanations
mentions of personal experiences and professional accomplishments
Negative Logits
eers     -0.59
netflix  -0.57
river    -0.55
Uriel    -0.53
hyde     -0.53
ibaba    -0.52
phans    -0.50
ÑĮ       -0.50
Tree     -0.50
ahead    -0.49
Positive Logits
own              1.10
intentions       0.85
predicament      0.82
involvement      0.80
inability        0.80
accomplishments  0.79
efforts          0.79
plight           0.79
newfound         0.78
actions          0.76
Activations Density: 14.996%