INDEX
Explanations
words that convey strong opinions or significant impacts
degree or magnitude
Negative Logits
الحياه            -0.60
⟬                 -0.58
ScopeManager      -0.55
SequentialGroup   -0.55
transfieras       -0.54
MemoryWarning     -0.54
kaarangay         -0.54
:✨                -0.54
Portail           -0.53
rungsseite        -0.53
Positive Logits
,                 0.47
enough            0.45
and               0.44
but               0.40
wide              0.38
;                 0.38
crí               0.38
:                 0.38
far               0.38
only              0.37
Activations Density: 0.045%