INDEX
Explanations
important social and political concepts
Negative Logits
å¡ŀ       -0.15
oko       -0.15
TRGL      -0.14
มà¸Ń      -0.14
isphere   -0.13
isÃŃ      -0.13
Bates     -0.13
vsp       -0.13
Shr       -0.13
å¼ı       -0.13
Positive Logits
stand     0.95
stands    0.84
Stand     0.80
stand     0.78
stood     0.77
Stand     0.71
stands    0.71
standing  0.70
_stand    0.66
standing  0.61
Activations Density: 0.184%
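
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature fires at all. It is an illustration only. The function name `activation_density`, the zero threshold, and the toy data are assumptions, not this dashboard's actual pipeline.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Hypothetical helper for illustration: the dashboard's exact
    definition (threshold, token sample) is not stated in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data invented for this example: 2 active positions out of 1000.
toy = [0.0] * 998 + [1.3, 0.4]
print(f"{activation_density(toy):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.184% would mean the feature is active on roughly 1.8 of every 1,000 sampled tokens.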