Explanations
references to the word "Med" or terms associated with medicine and medical entities
Negative Logits
Token     Logit
omat      -0.17
_mD       -0.16
_mE       -0.15
itize     -0.15
atak      -0.15
291       -0.15
Ñĩим      -0.14
itious    -0.14
_mB       -0.14
Äįit      -0.14
Positive Logits
Token     Logit
ved       0.28
aille     0.21
usa       0.20
Med       0.20
ford      0.18
ically    0.17
onald     0.17
alla      0.17
upe       0.17
ela       0.17
Activations Density 0.005%