INDEX
Explanations
phrases related to political or social issues in specific contexts
Negative Logits
JPEG       -0.73
Maced      -0.66
simulated  -0.65
counted    -0.63
misunder   -0.63
decimal    -0.62
mathemat   -0.61
fortun     -0.60
Axel       -0.59
decomp     -0.59
Positive Logits
s          0.99
_.         0.92
ski        0.91
tion       0.90
stra       0.88
ï¸ı        0.84
span       0.83
sure       0.82
edi        0.81
him        0.80
Activations Density 0.249%