Explanations
discussions about social issues, specifically focusing on health, inequality, and structural challenges facing communities
Negative Logits
سع  -0.50
sáng  -0.49
memu  -0.48
aclar  -0.48
záz  -0.47
интер  -0.47
jans  -0.46
LLA  -0.46
ErrInvalid  -0.46
balle  -0.45
Positive Logits
Worse  0.91
Worse  0.86
worse  0.85
unknownFields  0.78
worst  0.77
unchecked  0.77
worsened  0.76
rampant  0.76
worsen  0.74
worse  0.73
Activations Density 0.655%