INDEX
Explanations
arguments or propositions related to plans and suggestions
imagining hypothetical situations
Negative Logits
السكان        -0.43
zwykle        -0.43
Trotzdem      -0.42
Dziękuję      -0.40
相変わらず      -0.39
çoğu          -0.39
glück         -0.38
sobrevivir    -0.38
verhält       -0.38
thankfully    -0.38
Positive Logits
imagine       0.89
Imagine       0.87
imagine       0.82
Imagine       0.79
Wouldn        0.75
Wouldn        0.75
imagin        0.72
idea          0.70
would         0.68
imagining     0.68
Activations Density 0.048%