INDEX
Explanations
technical or specialized terms in various fields such as medicine, science, architecture, and activism
topics related to various forms of media, achievements, and social issues
Negative Logits
steps    -0.91
Ĥİ       -0.87
fixes    -0.83
changes  -0.81
ACTIONS  -0.81
icides   -0.79
tics     -0.78
rams     -0.78
azes     -0.78
amples   -0.77
Positive Logits
resemblance  1.13
pedigree     1.11
knack        1.09
tendency     1.04
reputation   1.02
rating       1.01
mentality    1.01
penchant     0.99
fetish       0.95
nickname     0.95
Activations Density: 0.347%