INDEX
Explanations
phrases that indicate conditions or premises for reasoning or conclusions
Negative Logits
displacement  -0.73
Displacement  -0.71
Ginn          -0.71
Waller        -0.70
displacement  -0.70
Walters       -0.70
ి             -0.68
gug           -0.67
textView      -0.67
Flanagan      -0.66
Positive Logits
Based   1.38
Based   1.36
based   1.35
BASED   1.35
based   1.34
BASED   1.27
basé    1.17
basée   1.08
basado  1.03
bases   1.02
Activations Density: 0.087%