Explanations
phrases related to medical conditions and treatments, including surgical procedures
references to illness, injury, and the subsequent reactions or events surrounding them
Negative Logits
yip        -0.83
rall       -0.77
Jordan     -0.77
oni        -0.77
Jordanian  -0.76
Jill       -0.75
Panda      -0.74
Khalid     -0.73
rh         -0.71
NW         -0.71
Positive Logits
Cast       2.48
Cast       2.21
cast       1.67
cast       1.67
casts      1.61
casting    1.60
Casting    1.56
Castle     1.48
CAST       1.48
casting    1.41
Activations Density: 0.222%