Explanations
the word "loyal" with a high activation
the concept of loyalty
Negative Logits
Token       Logit
EVA         -0.74
OUT         -0.71
Virus       -0.67
Autism      -0.66
Drugs       -0.65
Schwarz     -0.65
Genetics    -0.65
FER         -0.64
Clinic      -0.64
LOD         -0.64

Positive Logits
Token       Logit
loyal       1.06
itiz        1.05
allegiance  0.95
loyalty     0.93
alty        0.91
ties        0.91
atile       0.88
adherent    0.88
iciary      0.87
servant     0.85
Activations Density: 0.007%