INDEX
Explanations
phrases related to encouragement and support
Negative Logits
id      -0.70
as      -0.66
t       -0.63
io      -0.62
i       -0.61
Kar     -0.61
Dios    -0.59
n       -0.59
off     -0.59
lands   -0.58
Positive Logits
encouraged      1.89
encourage       1.88
Encourage       1.86
encourages      1.85
encouragement   1.82
couraged        1.79
Encourage       1.78
couraging       1.66
encouraging     1.65
encouragement   1.62
Activations Density: 0.135%