INDEX
Explanations
conversations about community engagement and discussions within group contexts
Negative Logits
réguli        -0.83
élevées       -0.65
démocr        -0.65
regulares     -0.63
complètes     -0.61
tranquille    -0.58
nacer         -0.57
flesta        -0.57
cortas        -0.57
brancas       -0.56
Positive Logits
how           1.39
why           1.29
whether       1.22
如何          1.07
what          0.99
possible      0.97
matters       0.95
issues        0.94
why           0.93
whether       0.93
Activations Density: 0.447%