INDEX
Explanations
words related to scientific research and medical development
Negative Logits
wierd  -0.86
somebody  -0.83
Надо  -0.83
Somebody  -0.79
stuff  -0.77
colère  -0.77
stupid  -0.77
Jefus  -0.76
honte  -0.76
Somebody  -0.76
Positive Logits
welcher  0.77
אשר  0.74
prior  0.70
uxta  0.70
welchem  0.69
showcased  0.69
utilizing  0.67
advancements  0.66
showcasing  0.64
唯有  0.64
Activations Density: 2.146%